Python 3 Text Processing with NLTK 3 Cookbook
This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you’ve learned the limits of regular expressions the hard way, or you’ve realized that human language cannot be deterministically parsed like a computer language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language models to process text in ways that are practically impossible with standard programming tools. A basic knowledge of Python and of basic text processing concepts is expected. Some experience with regular expressions will also be helpful.
Common terms and phrases
accuracy AffixTagger algorithm antonyms BeautifulSoup bigrams binary classifiers Brill tagger chunk corpus chunker ChunkString corpora corpus reader corpus view Creating custom corpus dateutil DecisionTreeClassifier def __init__(self default DefaultTagger dictionary do it evaluating execnet Extracting feature detector feature extraction feature sets fmeasure FreqDist function gateway Getting ready high information words How it hypernym in the IOB tags it works keyword argument label lemmas lxml MaxentClassifier method module movie_reviews Naive Bayes classifier NaiveBayes NaiveBayesClassifier class named entity ngram NLTK nltk.corpus import NLTKTrainer noun NumPy parse parse trees part-of-speech tags patterns phrase pickled precision and recall previous recipe Python Redis regular expressions replacement scikit-learn score SequentialBackoffTagger stopwords subclass subtrees Synsets TagChunker tagged sentence tagger with backoff tagset test_feats Text Classification There's timezone to do it tokenized sentence Tokenizing Text train_chunker.py train_classifier.py train_tagger.py tree Tree('NP Tree('S treebank treebank corpus treebank_chunk tuples unigram UnigramTagger verb WordNet YAML