Natural Language Processing for Online Applications: Text retrieval, extraction and categorization. Second revised edition
This text covers the technologies of document retrieval, information extraction, and text categorization in a way that highlights their commonalities in both general principles and practical concerns. It assumes some mathematical background on the part of the reader, but the chapters typically begin with a non-mathematical account of the key issues. Current research topics are covered only to the extent that they inform current applications; readers seeking detailed coverage of longer-term research and more theoretical treatments should look elsewhere. The many pointers at the ends of the chapters let the reader explore the literature further. The book does, however, maintain a strong emphasis on evaluation in every chapter, in terms of both methodology and the results of controlled experimentation.