Re: Should this be done in Bison or by post-processing of the AST?

Hans-Peter Diettrich <DrDiettrich1@aol.com>
Fri, 09 Nov 2007 09:47:15 +0100

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: Fri, 09 Nov 2007 09:47:15 +0100
Organization: Compilers Central
References: 07-11-028
Keywords: parse, AST
Posted-Date: 10 Nov 2007 16:35:08 EST

eyeris wrote:


> I am writing a parser for a rather unconventional language. It is used
> to process text, with commands enclosed in braces. The if construct is
> a command, rather than a syntactical construct. Here is an example:
>
>
>>Answer: I'm going to the [if month eq <12>]ski lodge[else]park[endif].


Why not a more XML/XSLT style syntax?


> My parser deals with this example just fine. However this language has
> a [endif all] command that closes all open [if] blocks. Here is
> another example:


I vaguely remember a "super bracket", perhaps used in (some) Lisp?


>>Answer: [if show_answer eq <1>]I'm going to the [if month eq <12>]ski lodge[else]park[endif all].
>
>
> In my current implementation, the resulting AST for the second example
> would look like this:
>
> text (Answer: )
> if (show_answer eq <1>)
> text (I'm going to the )
> if (month eq <12>)
> text (ski lodge)
> else
> text (park)
> text (.)


So far I don't see anything essentially different from the "traditional"
handling of conditional statements or expressions.




> However the text block containing the period *should* be a child of
> the root text node (containing the text Answer:). Since this is a
> logical issue and not really a parsing issue, it seems like the
> easiest way to remedy this is by creating a new node type for the
> [endif all] command and then manually expand those nodes into normal
> [endif] nodes, moving the siblings following the [endif all] node out,
> making them siblings of the [endif all] node's parent after the parsing
> is complete.


The alternative: no nodes for [endif] at all. Wouldn't this be much simpler?
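
A minimal sketch of that idea in Python, under some assumptions: the
tuple/dict node layout, the names COMMAND and parse, and the regular
expression for the commands are made up for illustration and are not taken
from the original parser. The builder keeps a stack of open ifs; [endif]
pops one level, [endif all] drops every level, and neither leaves a node
behind in the tree.

```python
import re

# Assumed command syntax: [if <condition>], [else], [endif], [endif all].
COMMAND = re.compile(r"\[(if [^\]]*|else|endif all|endif)\]")


def parse(source):
    """Build a nested node list; [endif] and [endif all] create no nodes."""
    root = []        # top-level node list
    current = root   # list that receives the next node
    open_ifs = []    # stack of (if node, list that contains that if node)
    pos = 0
    for match in COMMAND.finditer(source):
        # Plain text between two commands becomes a text node.
        if match.start() > pos:
            current.append(("text", source[pos:match.start()]))
        pos = match.end()
        command = match.group(1)
        if command.startswith("if "):
            node = {"cond": command[3:], "then": [], "else": []}
            current.append(node)
            open_ifs.append((node, current))
            current = node["then"]
        elif command == "else":
            if not open_ifs:
                raise ValueError("[else] without an open [if]")
            current = open_ifs[-1][0]["else"]
        elif command == "endif":
            if not open_ifs:
                raise ValueError("[endif] without an open [if]")
            # Close the nearest if: continue in the list that contains it.
            _, current = open_ifs.pop()
        else:  # "endif all"
            # Close every open if: continue at the top level.
            open_ifs.clear()
            current = root
    if pos < len(source):
        current.append(("text", source[pos:]))
    return root
```

Fed the second example from the original post, this puts the trailing "."
at the top level, next to the "Answer: " text, without a separate
post-processing pass over the tree.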


> However this also seems like a problem that many people have solved
> before, probably finding clever tricks in the process. Is there an
> easy way to deal with this right in the Bison grammar file?


Your language IMO is not context-free, so most of the CFG patterns, parser
strategies etc. won't be applicable, including Bison.


Perhaps it would be possible to distinguish between an outer-if and the
inner-if parts it contains (any number of them), where an [endif] closes
the current (nearest) inner or outer if, and an [endif all] closes the
outermost if only.


outer-if ::= "[if" ... [ inner-if ] ( "[endif]" | "[endif all]" ).
inner-if ::= "[if" ... [ "[endif]" ].


This might be a CFG, as long as no other constructs get in the way.


DoDi

