
NLTK

The bad news is that:

Although Python 3.0 is now available, many packages that NLTK requires do not have distributions for Python 3.0.

And Python 3 is the only version with far fewer Unicode problems. So is NLTK_lite the solution, or must we wait for a new NLTK release?

Text processing 000:

1. Read a text, split it into sentences and words, and count the number of sentences and words.
2. Find the frequency of each word in the text.
3. Find the longest word in the text.
4. Find the reverse of a word.
i cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it dseno’t mtaetr in waht oerdr the ltteres in a wrod are, the olny iproamtnt tihng is taht the frsit and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it whotuit a pboerlm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Azanmig

from http://www.emergingcl.com/
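The four exercises can be sketched with the Python 3 standard library alone, so nothing here depends on NLTK being available. This is only a rough sketch under simple assumptions: a sentence is taken to end at ".", "!" or "?" followed by whitespace, a word is a run of letters (optionally with an inner apostrophe), and frequencies ignore case. The function names are made up for illustration, and the sample is two sentences taken from the scrambled text above.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(text):
    # A word is a run of letters, optionally joined by an inner apostrophe.
    # In Python 3 the pattern is Unicode-aware by default.
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text)


def word_frequencies(words):
    # Case-insensitive counts: "The" and "the" are the same word.
    return Counter(w.lower() for w in words)


def longest_word(words):
    # On a tie, max() returns the first word of maximal length.
    return max(words, key=len)


def reverse_word(word):
    # Slicing with a step of -1 walks the string backwards.
    return word[::-1]


sample = (
    "The rset can be a taotl mses and you can sitll raed it whotuit a pboerlm. "
    "Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, "
    "but the wrod as a wlohe."
)

sentences = split_sentences(sample)
words = split_words(sample)

print("sentences:", len(sentences))
print("words:", len(words))
print("most common:", word_frequencies(words).most_common(3))
print("longest:", longest_word(words))
print("reversed:", reverse_word(words[-1]))
```

The sentence rule is deliberately crude: abbreviations such as "e.g." would be split in the wrong place, which is the kind of case a trained tokenizer is meant to handle.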