Multi-Layered Pipeline Parsing of Phrases from Multi-Axiom Grammars

by James S. Jones

Thesis for Ph.D. in Computer Science
The University of Iowa
May 1997


Abstract

Conventional approaches for specifying and processing programming languages have taken a two-layered approach. The division between these layers has been based on differences in broad classes of languages and machines which have encourged the use of different methods for language definition and parsing. As a result, many-layered approaches which also support the processing of phrase components of a language are not well represented, certainly not those for left-to-right parsing. A multi-layer, unified approach for language specification and parsing is presented which is based on the multi-axiom grammar (MAG) and properties associated with it. These are defined and it is shown why the MAG is superior to the single-axiom context free grammar (CFG) and the all-axiom algebraic grammar (AG) for specifying a language and all of its layers (superior with regard to the definable classes of languages and phrase definition). An algorithm is described to stratify a programming language generated by a MAG into a hierarchy of layers, each being a sublanguage of phrases generated by a subgrammar. The layering process is based on a notion of primitivity such that a phrase at a given layer is composed of primitive elements and (sub)phrases from other layers. The PHRASE parsing method is defined which operates for a given language layer and tokenizes any phrase at that layer which is embedded in the input stream. It is a multi-axiom extension of LR parsing, named for its actions: Pass, Halt, Reduce, Accept, Shift, and Error. Simplified-MLR (SMLR) and fully general Multi-axiom LR (MLR) implementations of PHRASE are described. It is shown how this collection of layered parsers are used in a multi-pass fashion to parse an entire input stream based on a multi-layered tokenization process, and furthermore how this can be pipelined parallelized for high speed, left-to-right language translation.


UMI Dissertation Services
UMI Number: 9731814
300 North Zeeb Road
Ann Arbor, MI 48103


Return to Jim Jones's faculty homepage