PHRASE parsers from Multi-Axiom Grammars
by Teodor Rus, James S. Jones
Abstract
Multi-axiom grammars (MAG) are alternatives to the
single-axiom context free grammars (CFG) and all-axiom
algebraic grammars (AG) for programming language specification.
Neither phrase recognition nor algebraic mechanisms for language
processing are supported by CFGs. AGs support algebraic mechanisms
for language processing but specify a smaller class of languages.
MAGs avoid these limitations. This paper describes a new parsing
algorithm
developed on this basis which recognizes any phrase in the language.
Moreover, it does so by distributing the parsing task among a
collection of smaller parsers which handle well-defined layers of the
language in a piping manner. These language-layers are determined by
the algebraic properties of the MAGs and are described in the paper.
Basic definitions are given for multi-axiom grammar and language
as well as for algebraic notions of subgrammar,
primitive subgrammar, quotient grammar, and grammar/language layer.
Algorithms are described to stratify a programming
language into a hierarchy of layers, to construct
parsers for each layer analogous to LR construction, and
to accomplish the overall task of multi-layered parsing
in pipeline fashion based on a tokenization which
occurs between the language layers. This pipeline parallel
process is a model for high speed, left-to-right
language translation.
Keywords : Grammar; Subgrammar; Pipeline parsing; Tokenize; Phrase
Return to Jim Jones's faculty homepage