|LRStar: LR(*) parser generator|
|Home Downloads Feedback Comparison Theory Papers Documentation Contact|
An LR(*) parser generator creates a minimal LR(1) parser, which has extra information in ambiguous states for doing lookahead parsing. An LR(*) parser can lookahead for 2, 3, 4 or more input tokens. This is defined by the user by setting k (e.g. /k=8). LR(*) parsers will be larger than minimal LR(1) parsers, if the grammar is not LR(1). How much larger? It depends on the number of conflict states.
The "typedef" declaration in the C and C++ languages is a context sensitive issue. This cannot be solved by upgrading from LALR(1) to LR(1) or LR(k). It requires making use of a symbol table while parsing and this allows even an LALR(1) parser to handle this context sensitive issue. LRStar has this feature built into the grammar notation, making life easy for a user.
DFA Lexical Analyzers
A DFA lexer is a finite-state automata without a pushdown stack (i.e. not a PDA). DFA is the fastest recognition algorithm (5 times the speed of PDA's). A DFA algorithm does not use any lookahead characters. Because it is not a PDA, it cannot handle "nested" comments or any type of "nested" input (e.g. [[a]]). However, DFA's work fine for most programming languages. Afterall, the job of handling "nested" types of input usually belongs to the parser.
When using a C99 grammar, I discovered that the C99 parser created by LRStar+DFA was 17% faster than the C99 parser created by Bison+Flex. For this test I used LRStar's /o option, which means optimize for speed. When not using the /o option, the LRStar parser was 2% faster. This test measured lexing and parsing speed, without creating a symbol table and AST. Note, the /o option does chain-reduction elimination, which is not available with Bison. Parser size is nearly the same for both products.
|(c) Copyright Paul B Mann 2018. All rights reserved.|