Related articles |
---|
Grammar -> Parser question chris63@my-dejanews.com (1998-06-04) |
Re: Grammar -> Parser question td@sysinno.se (Torbjorn Drevin) (1998-06-09) |
Re: Grammar -> Parser question torbenm@diku.dk (Torben Mogensen) (1998-06-09) |
Re: Grammar -> Parser question dwight@pentasoft.com (1998-06-09) |
Re: Grammar -> Parser question qjackson@wave.home.com (Quinn Tyler Jackson) (1998-06-09) |
Re: Grammar -> Parser question qjackson@wave.home.com (Quinn Tyler Jackson) (1998-06-18) |
From: | "Torbjorn Drevin" <td@sysinno.se> |
Newsgroups: | comp.compilers |
Date: | 9 Jun 1998 11:58:55 -0400 |
Organization: | Algonet/Tninet |
References: | 98-06-018 |
Keywords: | parse, LL(1) |
Hi,
Maybe this example grammer can help you:
expr ::= <term> [<addop> <term>]*
term ::= <factor> [ <mulop> <factor> ]*
factor ::= <literal> | (<expr>) | <variable>
It may not suite your purpose, but have a look at the grammer for factor.
The factor can be a <expr> in LPARENT and RPARENT. This is where the
recursive thing begins. Now have a look at a grammer for condition:
cond ::= expr relop expr
relop ::= for example: != | >= | <= | == ....
A common way to handcode the parser is to make functions that correspond to
the BNF grammer. A function is done from each grammer, something like:
// Parse > cond ::= expr relop expr
void do_cond()
{
do_expr();
Match(relop);
do_expr();
}
Note that it make calls to the expression function.
// Parse <expr> ::= <term> [<addop> <term>]*
void do_expr()
{
do_term();
while ( isaddop(get_token()) )
do_term();
}
Do you get the point ?
The do_term function then make a call to do_factor, and do_factor make a
call back to do_term(), if LPARENT is found. This can now handle expressions
like:
(((((((a*124)*12)*(1243 .....
It does not matter how complex the expression may look like.
One can need some "look ahead" sometimes. You should make a function
something like: char* get_token()
The function get_token() always have the next token stored.
If the source to parse look like:
A B
A call to get_token() return "A". But, the smart get_token() have also
stored the next token at char* ahead. if one look at whats in ahead, it
should be "B". This is a solution for this.
What you need, is to have a look at Jack Crenshaw's "Lets build a compiler".
He deals exactly with problems like this, and he explain it very good.
/Torbjorn Drevin doc.drevin@swipnet.se / td@sysinno.se
chris63@my-dejanews.com wrote...
>Suppose I have the following Grammar:
>
>condition -> expr PLUS expr |
> LPAREN condition RPAREN
>
>expr -> ID |
> LPAREN expr RPAREN
>
>I want to hand-code a top down recursive descent parser for this grammar
>using ideas from the Dragon Book (nonterminals become functions, terminals
>become calls to match()). I can't use lex or yacc.
>
>However, the LPAREN symbol is causing a problem: When I see it, I don't
>know if I'm dealing with a nested condition or a nested expression.
>
>How does one typically deal with this situation if one has to hand-code
>the parser? It looks like I need some sort of lookahead and/or backtracking
>mechanism so I can figure out which case I am dealing with.
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.