introduction to BNF, concrete syntax trees

"markww" <markww@gmail.com>
25 Sep 2006 01:16:28 -0400

          From comp.compilers

Related articles
introduction to BNF, concrete syntax trees markww@gmail.com (markww) (2006-09-25)
Re: introduction to BNF, concrete syntax trees mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2006-09-25)
| List of all articles for this month |

From: "markww" <markww@gmail.com>
Newsgroups: comp.compilers
Date: 25 Sep 2006 01:16:28 -0400
Organization: Compilers Central
Keywords: parse, question
Posted-Date: 25 Sep 2006 01:16:28 EDT

Hi,

I'm just starting to look at BNF. I have to take a simple expression
like:

          (A + B * C) * D

and create a concrete syntax tree of it. Before I start writing my
application to parse the above, I need to understand how to do this on
a piece of paper. My first thought is to tokenize (by whitespace) every
component of the statement, then label its type. I have 3 types to
choose from:

f = factor
t = term
e = expression

with the following rules:

expression ::= exp "+" exp |
                                            exp "-" exp |
                                            term

term ::= term "*" factor |
                          term "\" factor |
                          factor

factor ::= number |
                            identifier |
                            "(" expression ")"

so it must be possible to keep looping through the tokens, merging
adjacent pairs that follow the above rules until I end up with one node
right? (more than one node would indicate an error in the expression?).

So my first pass would look like this:

                          f
                          |
----------------------------------
| |
| f e f t f | t f
| | | | | | | | |
( A + B * C ) * D

now I have to pass through again to try merging nodes I suppose. So the
first thing to look at is if I can merge any adjacent terms with
factors which would give me:

                          f
                          |
----------------------------------
| t | t
| | \ | | \
| f e f t f | t f
| | | | | | | | |
( A + B * C ) * D

pass through again I guess and try merging adjacent factors and terms
again. I am a bit confused here though. The A is just sitting there
unable to merge with anyone. But it can simply itself down to a term
right? So do I just now label it a term or leave it alone?

                              ------------ t ---
                            / \
                          f \
                          | \
---------------------------------- |
| t | |
| / | | |
| / t | t
| / | \ | | \
| f e f t f | t f
| | | | | | | | |
( A + B * C ) * D

Well I guess I just keep looping until there are no factors and terms
to merge. Then I can try merging adjacent terms and expressions:

                              ------------ t ---
                            / \
                          f \
                          | \
---------------------------------- |
| | |
| - e - | |
| / \ | |
| / t | |
| | / | | |
| | / t | t
| | / | \ | | \
| f e f t f | t f
| | | | | | | | |
( A + B * C ) * D

So again I'm not sure what to do with factors like the A that just have
been sitting there this whole time. Do I just keep simplying themselves
if possible from f -> t -> e?

Any advice would be great. A simple walk through of an expression like
A + B * C would be great.

Thanks,
Mark


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.