|"Wildcard" part of an LALR(1) grammar? email@example.com (2004-05-16)|
|From:||firstname.lastname@example.org (Magnus Lie Hetland)|
|Date:||16 May 2004 23:36:30 -0400|
|Organization:||Norwegian university of science and technology|
|Keywords:||parse, question, comment|
|Posted-Date:||16 May 2004 23:36:30 EDT|
I'm trying to implement a system which can parse input containing
segments of the form
<anything that doesn't match any rule> <any rule>
I've been thinking about doing an LL(1)-like thing, and analyze the
grammar to find a single token that will determine when any legal rule
begins (or *may* begin), thus terminating the wildcard'ish segment.
However: I'm using Bison, and I'd like to capitalize on its LALR(1)
parsing capabilities, if that is possible.
So, I guess the question is: Is it possible?
It seems to me that it might not be -- before you can start working on
the rule, you have to reduce the wildcard segment, and you only get a
single lookahead to decide whether or not to reduce it, so...
Am I missing something? Is there any practical parsing algorithm that
will actually deal with this sort of thing? (An Earley parser, for
Actually, the LL(1)'ish solution of finding all "stop tokens" for the
wildcard run would be acceptable, I think (although not ideal) -- the
main problem then would lie in analyzing the grammar to find these
stop tokens. (I can't really assume anything about the grammar, as I
get that from the user.) Dealing with empty rules and the like makes
it non-trivial -- or so it seems to me.
Magnus Lie Hetland "Wake up!" - Rage Against The Machine
http://hetland.org "Shut up!" - Linkin Park
[Ratfor did more or less what you're asking, and its parser is written
in yacc. As I recall, it has a couple of generic productions that
are lists of words that it uses to skip over the parts that it doesn't
understand. It helped a lot that each ratfor statement starts with a
keyword and the statement boundaries were line boundaries so they were
easy to find, too. -John]
Return to the
Search the comp.compilers archives again.