A Theory of Indexing
This book presents a theory of indexing capable of ranking index terms, or subject identifiers, in decreasing order of importance. This ranking leads to the choice of good document representations, and also accounts for the role of phrases and of thesaurus classes in the indexing process. The study is typical of theoretical work in automatic information organization and retrieval in that it draws on concepts from mathematics, computer science, and linguistics. A complete theory of information retrieval may emerge from an appropriate combination of these three disciplines.
Other editions
A Theory of Indexing, Issues 18-22. Gerard Salton, Cornell University, Department of Computer Science. 1975.