|Re: Top-down deterministic parser? email@example.com (Alexander Morou) (2009-07-11)|
|RE: Top-down deterministic parser? firstname.lastname@example.org (Quinn Tyler Jackson) (2009-07-12)|
|Re: Top-down deterministic parser? email@example.com (Alexander Morou) (2009-07-13)|
|From:||"Quinn Tyler Jackson" <firstname.lastname@example.org>|
|Date:||Sun, 12 Jul 2009 14:12:19 -0700|
|Posted-Date:||13 Jul 2009 06:54:19 EDT|
Alexander Morou said:
> I'm looking for an example of a top-down left most derivative parser,
> which has an unlimited look-ahead capacity, doesn't generate tons
> of spaghetti look-ahead logic, and determines the logic inside the
> resultant parser at parse-time.
> Presently the program I'm writing encodes the DFA of a rule as
> mentioned before, I'm also encoding a generalized first DFA
> which utilizes the rule DFA as a source, which will finally
> be merged to declare the Follow DFA. Each subsequent phase
> introduces non-determinism, so the solution is to determinize it.
> I'll know more about whether this is sufficient once I'm there, but
> in the mean time it probably wouldn't hurt to look at other
> implementations, which deduce logic at the parser's run-time.
Because Meta-S parsing is adaptive, grammars are entirely compiled
into the parsing logic at parse-time. (Or rather, at both grammar
link-time and parse-time, where link-time occurs before a parse, but
is at run-time).
Those productions which are not adaptive can be entirely linked before
a parse, and their look-aheads are determined at this time, and this
percolates through the entire grammar. Explicit use of the #relink
directive in a grammar can also cause look-ahead calculation to be
reprocessed in the middle of a parse (for instance, if the grammar
adapts itself based on the input it has encountered, it may be
desirable for all of the look-aheads to be recalculated).
This is entirely implemented without "spaghetti". Meta-S is not a
table-driven parser, nor is it a code-generator. The A-BNF compiler
compiles A-BNF at run-time into an intermediate parsing language
called LPM*, and after a linking and optimization pass, LPM is
interpreted at run-time by the parsing engine. Each LPM rule consists
of one or more "clauses" (what a virtual machine might call a "byte
code") and each clause can have a "hint" (look-ahead thunk). Because
of this generalization, there is no pasta involved -- each clause can
call its look-ahead thunk and change its behavior accordingly when
invoked by the parsing engine.
By "percolation" I mean that look-aheads can be coalesced during the
linking process. If rule A can be a B or C, rule A is not entered if
neither B or C could possibly match.
Because the system is Turing Powerful, certain constructs cannot
generate look-ahead thunks, and in these cases the engine behaves in
the usual unlimited-look-ahead fashion. But because this behavior is
constrained to known-cases (and because of other optimizations too
numerous to mention here), the entire system often behaves in a
uni-directional linear fashion. In fact, due to some particular
optimizations, certain non-trivial cases can perform in a sublinear
fashion. (Doubling the input less than doubles the processing time.)
Although much of this is described in one way or another in _Adapting
to Babel_, some of the optimizations are new, undocumented, and/or
trade secrets, but that's a general summary of one approach to the
problem you have mentioned.
Quinn Tyler Jackson
Thothic Technology Partners, LLC
*LPM = Language for Pattern Matching
Return to the
Search the comp.compilers archives again.