Information Retrieval: Implementing and Evaluating Search Engines
MIT Press, Jul 23, 2010 - Computers - 606 pages
Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. Wumpus--a multiuser open-source information-retrieval system developed by one of the authors and available online--provides model implementations and a basis for student work. The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval. After an introduction to the basics of information retrieval, the text covers three major topic areas--indexing, retrieval, and evaluation--in self-contained parts. The final part of the book draws on and extends the general material in the earlier parts, treating such specific applications as parallel search engines, Web search, and XML retrieval. End-of-chapter references point to further reading; exercises range from pencil-and-paper problems to substantial programming projects. In addition to its classroom use, Information Retrieval will be a valuable reference for professionals in computer science, computer engineering, and software engineering.