Discourse Processing here is framed as marking up a text with structural descriptions on several levels, which can serve to support many language-processing or text-mining tasks. We first explore some ways of assigning structure on the document level: the logical document structure as determined by the layout of the text, its genre-specific content structure, and its breakdown into topical segments. Then the focus moves to phenomena of local coherence. We introduce the problem of coreference and look at methods for building chains of coreferring entities in the text. Next, the notion of coherence relation is introduced as the second important factor of local coherence. We study the role of connectives and other means of signaling such relations in text, and then return to the level of larger textual units, where tree or graph structures can be ascribed by recursively assigning coherence relations. Taken together, these descriptions can inform text summarization, information extraction, discourse-aware sentiment analysis, question answering, and the like.

Table of Contents: Introduction / Large Discourse Units and Topics / Coreference Resolution / Small Discourse Units and Coherence Relations / Summary: Text Structure on Multiple Interacting Levels