Re: Relaxed pattern recognition

haberg@matematik.su.se (Hans Aberg)
13 Dec 2003 20:59:14 -0500

          From comp.compilers

From: haberg@matematik.su.se (Hans Aberg)
Newsgroups: comp.compilers
Date: 13 Dec 2003 20:59:14 -0500
Organization: Mathematics
References: 03-12-072
Keywords: parse
Posted-Date: 13 Dec 2003 20:59:14 EST

mars@realsoftware.com (Mars Saxman) wrote:


>I would like to write a parser using bison that deals with fragments
>of source code that may or may not be complete. I would like to
>construct an AST from the source code using whatever parts exist,
>resolving them into more specific structures whenever they match some
>pattern.
...
>All of the bison (or yacc) examples I have seen expect to parse
>correct code, and the behaviour when they find a syntax error is
>simply to print a message and bail out. I'd like to pull as much
>correct information as I can out of the token stream and return the
>rest as more primitive productions.


>Is this sort of thing possible with bison? Are there examples
>available? Am I just overlooking something obvious?


I think that you are out of luck with any standard version of Bison or
any similar parser generator.


I got a paper from "Susan L Graham" <graham@cs.berkeley.edu> on more
advanced error recovery techniques. Perhaps they have some
experimental version of Bison that implements them. But if you hope
to find a program that does "as much as possible" of the parsing, I
think that is a research-level topic. You might try an Internet
search, or a search at <http://www.berkeley.edu>, for the words
"Bison" or "Yacc".


As for error recovery with Bison, Yacc and other parser generators
that use LALR(1), one should note that when an erroneous token is
read, the parser may not detect the error immediately, but may first
execute some reduction actions (even though no more tokens will be
shifted). Therefore, if very exact error recovery is needed, one
should settle for LR(1) instead, I think. Bison does not currently
support LR(1). It does, though, have a GLR parser, which might be used
to split the parser if the input source is ambiguous (see the Bison
manual for examples).
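

For reference, the recovery mechanism that standard Bison and Yacc do
provide is the reserved `error` token together with the `yyerrok`
macro. The grammar below is only a minimal sketch of that mechanism,
not a solution to the original problem: the toy statement language,
the hand-written `yylex`, and the printed messages are all assumptions
made for illustration. A malformed statement is skipped up to the next
';' and parsing then resumes. Because of the LALR(1) behaviour
described above, actions for rules reduced before the error is
detected may already have run by the time the `error` rule fires.

```yacc
%{
#include <ctype.h>
#include <stdio.h>
int yylex(void);
void yyerror(const char *msg);
%}

%token NUM

%%
input:
    /* empty */
  | input stmt
  ;

stmt:
    expr ';'    { printf("statement ok\n"); }
    /* On a syntax error, the parser pops states until `error` can be
       shifted, then discards tokens until it sees ';'. yyerrok tells
       it to resume normal error reporting immediately. */
  | error ';'   { yyerrok; printf("skipped malformed statement\n"); }
  ;

expr:
    NUM
  | expr '+' NUM
  ;
%%

/* Hypothetical minimal lexer: unsigned integers and single characters. */
int yylex(void)
{
    int c;
    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;                       /* skip whitespace */
    if (c == EOF)
        return 0;               /* 0 signals end of input to yyparse */
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;                   /* consume the rest of the number */
        ungetc(c, stdin);
        return NUM;
    }
    return c;                   /* ';', '+', or an unexpected character */
}

void yyerror(const char *msg) { fprintf(stderr, "%s\n", msg); }

int main(void) { return yyparse(); }
```

Assuming the file is saved as recover.y, it can be built with
`bison -o recover.c recover.y && cc -o recover recover.c` and fed
statements on standard input. Note that this only resynchronizes at a
chosen token; it does not build partial structures from the skipped
input, which is the harder problem raised in the original question.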


    Hans Aberg

