Words to Bytes: May 2011

one percent more

"Many areas in NLP are like this. You can get 92% accuracy in a few hours of work, and then you can get 93% after a week of work, and then you can write a whole PhD thesis about how you got 94% accuracy."

Farsi POS tagger

Finally, I have a Farsi POS tagger. I trained unigram, bigram and TnT taggers on 2 million tagged words of the BijanKhan corpus. Here is the funny part: I trained them on all of my data, without splitting it into training and test sets, so I have no held-out data for an honest evaluation and must re-train them. I only glanced at the results, and they looked promising. After re-training and evaluating, I will post the results. If my classmate Vahid helps, I would deploy it on a web server.
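To make the train/test point concrete, here is a minimal sketch of the split-then-evaluate step in plain Python. It is not the code I used for the taggers above: the function names (`split_corpus`, `train_unigram`, `tag`, `accuracy`), the 10% test fraction and the toy corpus with its tags are all made up for illustration, and only a simple unigram tagger is shown.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: each sentence is a list of (word, tag) pairs.
# The real corpus and its tag set are NOT reproduced here.
TOY_CORPUS = [
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("a", "DET"), ("cat", "N"), ("runs", "V")],
    [("the", "DET"), ("bird", "N"), ("sings", "V")],
]


def split_corpus(sentences, test_fraction=0.1, seed=0):
    """Shuffle tagged sentences and hold out a fraction for testing."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_unigram(train_sentences):
    """Map each word to its most frequent tag.

    Also returns the most common tag overall, used for unseen words.
    """
    counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in train_sentences:
        for word, tag_ in sentence:
            counts[word][tag_] += 1
            all_tags[tag_] += 1
    model = {word: tags.most_common(1)[0][0] for word, tags in counts.items()}
    default_tag = all_tags.most_common(1)[0][0]
    return model, default_tag


def tag(words, model, default_tag):
    """Tag a list of words, falling back to the default tag for unseen words."""
    return [(w, model.get(w, default_tag)) for w in words]


def accuracy(test_sentences, model, default_tag):
    """Fraction of held-out tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sentence in test_sentences:
        words = [w for w, _ in sentence]
        predicted = tag(words, model, default_tag)
        for (_, gold), (_, guess) in zip(sentence, predicted):
            correct += gold == guess
            total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    train, test = split_corpus(TOY_CORPUS, test_fraction=0.25)
    model, default_tag = train_unigram(train)
    print("held-out accuracy:", accuracy(test, model, default_tag))
```

The important part is that `accuracy` only ever sees sentences the tagger was not trained on. Scoring on the training sentences themselves would mostly measure how well the tagger memorized them.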
