Generalized LR Parsing
Springer Science & Business Media, Aug 31, 1991 - Computers - 166 pages
The Generalized LR parsing algorithm (some call it "Tomita's algorithm") was originally developed in 1985 as a part of my Ph.D thesis at Carnegie Mellon University. When I was a graduate student at CMU, I tried to build a couple of natural language systems based on existing parsing methods. Their parsing speed, however, always bothered me. I sometimes wondered whether it was ever possible to build a natural language parser that could parse reasonably long sentences in a reasonable time without help from large mainframe machines. At the same time, I was always amazed by the speed of programming language compilers, because they can parse very long sentences (i.e., programs) very quickly even on workstations. There are two reasons. First, programming languages are considerably simpler than natural languages. And secondly, they have very efficient parsing methods, most notably LR. The LR parsing algorithm first precompiles a grammar into an LR parsing table, and at the actual parsing time, it performs shift-reduce parsing guided deterministically by the parsing table. So, the key to the LR efficiency is the grammar precompilation; something that had never been tried for natural languages in 1985. Of course, there was a good reason why LR had never been applied for natural languages; it was simply impossible. If your context-free grammar is sufficiently more complex than programming languages, its LR parsing table will have multiple actions, and deterministic parsing will be no longer possible.
Other editions - View all
action process action table active edge analysis bottom-up CALL chart parsers complete edges context-free grammar created cyclic grammars Earley Earley's algorithm entries error count error objects example executed EXIT goto table grammar type graph-structured stack Hidden Markov Model implementation input sentence input string input symbols input tree interpretation LALR lexical lookahead LR parser LR parsing algorithm LR parsing table machine translation merged modified algorithm natural language nonterminal packed shared parallel parse graph parse stack parse tree parsing actions parsing algorithm parsing strategy partial parses path performance PGLR phoneme prior probabilities probabilistic rank diff reduce action reduce operation reduce process result rule-body procedure rules running threshold score function sdec searching space Section semantic sentence length sequence sequential truncation shift action speech recognition stack node subtree syntactic score terminal symbols Tomita's algorithm top-down filtering truncation algorithm truncation mechanism word