Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding of Place Names

Universal-Publishers, 2008 - 292 pages

Automatic toponym resolution is the task of mapping each occurrence of a place name in a text to an unambiguous spatial footprint of the location referred to, such as a latitude/longitude centroid. The task is difficult because geographic databases are incomplete and error-prone, and because place names are highly ambiguous: common words need to be distinguished from proper names (geo/non-geo ambiguity), and the mapping between names and locations is itself ambiguous (London can refer to the capital of the UK, to London, Ontario, Canada, or to about forty other Londons on earth). This thesis investigates how referentially ambiguous spatial named entities can be grounded, or resolved, robustly with respect to an extensional coordinate model on open-domain news text, by collecting a repertoire of linguistic heuristics and extra-linguistic knowledge sources such as population. I then investigate how to combine these sources of evidence to obtain a superior method. Noise effects introduced by the named entity tagging that toponym resolution relies on are also studied.

While a few attempts have been made to solve toponym resolution, these were either not evaluated, or evaluated by manual inspection of system output rather than against a re-usable reference corpus. A systematic comparison of previous work leads to an inventory of heuristics and other sources of evidence. A comparative evaluation requires an evaluation resource, so a reference gazetteer and an associated novel reference corpus with human-labelled referent annotation were created for this thesis. They are used to benchmark a selection of the reconstructed algorithms and a novel re-combination of the heuristics catalogued in the inventory. Performance of the same resolution algorithms is compared under two conditions: applying them to the output of human named entity annotation, and to automatic annotation produced by an existing Maximum Entropy sequence tagging model.
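The description names population as one extra-linguistic knowledge source for choosing among candidate referents. A minimal sketch of what a population-based heuristic could look like follows. It is not the thesis's implementation: the `Referent` class, the `TOY_GAZETTEER` table, and the `resolve_max_population` function are hypothetical names introduced here, and the coordinates and population figures are rough illustrative values, not data from the thesis gazetteer.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Referent:
    """One candidate location for a place name (hypothetical structure)."""
    name: str
    region: str
    lat: float         # latitude of the centroid, in degrees
    lon: float         # longitude of the centroid, in degrees
    population: int


# Hypothetical toy gazetteer. All numbers are rough, illustrative values.
TOY_GAZETTEER: Dict[str, List[Referent]] = {
    "London": [
        Referent("London", "England, UK", 51.51, -0.13, 8_800_000),
        Referent("London", "Ontario, Canada", 42.98, -81.25, 420_000),
    ],
}


def resolve_max_population(
    toponym: str,
    gazetteer: Dict[str, List[Referent]] = TOY_GAZETTEER,
) -> Optional[Referent]:
    """Return the most populous candidate referent, or None if the name is unknown."""
    candidates = gazetteer.get(toponym, [])
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.population)


best = resolve_max_population("London")
if best is not None:
    print(f"{best.name} -> {best.region} ({best.lat}, {best.lon})")
```

A heuristic like this ignores the surrounding text entirely, which is one reason the description speaks of combining several sources of evidence rather than relying on a single one.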


Contents

Introduction

Background

Previous and Related Work

Centroid-based

Evaluation

Dataset

Document Annotation

TR-CoNLL

Methods

Applications

Summary and Conclusion

A Notational Conventions

TR-CoNLL Sample Used in Prose-Only Evaluation

G Stories Used in the Visualization Study

Bibliography


Common terms and phrases

Algorithm annotation applications assign baseline Berlin Cambridge CITY candidate referents centroid Chapter Clough Computational Linguistics computed CoNLL contains context coordinates dataset default referents described digital library disambiguation distance document example F-SCORE feature type Figure gazetteer lookup geo-coding Geographic Information Retrieval Geographic References global gold standard heuristics I-NP implemented information retrieval latitude/longitude Leidner lines linguistic London machine learning markup MaxEnt MAXPOP mentioned metrics named entity recognition named entity tagger named entity tagging natural language processing NERC number of toponym outperforms performance PERSEUS place names polygon population Pouliquen precision processing PROVINCE query query expansion Rauch recall referent per discourse reported resolved toponym RMSD score Smith and Crane spatial t₁ TextGIS textual thesis tokens toponym instances toponym recognition toponym resolution toponym resolution task TR-CoNLL TR-MUC4 TRML unambiguous toponyms United States COUNTRY weight Word Sense Disambiguation YAROWSKY

Bibliographic information

Title	Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding of Place Names
Author	Jochen L. Leidner
Publisher	Universal-Publishers, 2008
ISBN	1581123841, 9781581123845
Length	292 pages
