|LRSTAR: LR(*) parser generator|
|Home Downloads Feedback Comparison History Theory Papers Doc. Contact|
An LR(*) parser generator does an LR(1) grammar analysis and creates an LR(1) parser, which has extra information, if necessary, for doing lookahead parsing in those states that are ambiguous. An LR(*) parser can lookahead for 2, 3, 4 or more input tokens. This is defined by the user by setting k (e.g. /k=8). This lookahead operation is called nondeterministic parsing. LR(*) parsers may be slightly larger than minimal LR(1) parsers, but they are more powerful.
The "typedef" declaration in the C and C++ languages is a context sensitive issue. This cannot be solved by upgrading from LALR(1) to LR(1) or LR(k). It requires making use of a symbol table while parsing and this allows even an LALR(1) parser to handle this context sensitive issue. The LRSTAR parser generator has this feature built into the grammar notation, making life easy for a user.
DFA Lexical Analyzers
A DFA lexer is a finite-state automata without a pushdown stack (i.e. not a PDA). DFA is the fastest recognition algorithm (5 times the speed of PDA's). A DFA algorithm does not use any lookahead characters. Because it is not a PDA, it cannot handle "nested" comments or any type of "nested" input (e.g. [[a]]). However, DFA's work fine for most programming languages. Afterall, the job of handling "nested" types of input usually belongs to the parser. Before DFA's became available, lexical analysis had been known to consume a large percentage of the processing time in compiler front-ends. With DFA's that time is reduced to about 10% of the compiler front-end time.
A test found that parsers created by LRSTAR+DFA are 10% faster than the speed of parsers created by Bison+flex, when using LRSTAR's optimize-for-speed option (/o) . This test measured lexing and parsing speed, without creating a symbol table and AST. When not using the /o option, parsing speed is nearly identical for both software products. Note, the /o option does chain-reduction elimination in the parser tables. FYI, the parsing speed increased by 46% (up to 3,660,000 lines/sec) after rewriting the grammar in some places and adding disambiguating rules. When you do symbol-table and AST creation the speed comes down 33% to 2,400,000 lines/sec.
|(c) Copyright Paul B Mann 2018. All rights reserved.|