Re: Parser question

Mikael =?ISO-8859-1?Q?Str=F6m?= <>
Thu, 17 Jul 2008 01:48:09 +0800

          From comp.compilers

Related articles
Parser question (Mike \(News\)) (2008-07-11)
Re: Parser question (Aeran) (2008-07-13)
Re: Parser question (Max Hailperin) (2008-07-13)
Re: Parser question (Hans-Peter Diettrich) (2008-07-15)
Re: Parser question (Mikael =?ISO-8859-1?Q?Str=F6m?=) (2008-07-17)
Re: Parser question (Max Hailperin) (2008-07-19)
| List of all articles for this month |

From: Mikael =?ISO-8859-1?Q?Str=F6m?= <>
Newsgroups: comp.compilers
Date: Thu, 17 Jul 2008 01:48:09 +0800
Organization: Compilers Central
References: 08-07-024 08-07-029
Keywords: parse
Posted-Date: 19 Jul 2008 14:44:39 EDT

Thanks Max for your highly enlightening answer. I really learned a lot
from your explanation. The dragon book is quite abstract, and it's not
that easy to see how things could fit together.

> I told gcc to produce assembly language output rather
> than binary object code, and I compared the two versions. The result:
> essentially no difference.

Interesting, never thought about that. I'm using GCC at this time, but
the idea is to (one day) let the compiler compile itself so i can
bootstrap to other platforms. I try to use a very basic subset of C, and
want to see how far I can get. The compiler has distinctive front and
back ends. The front end generates three address code. The back end (a
separate executable) generates x86 assembly for the time being. Even
though the grammar is _very_ limited at this time, all phases are
implemented and actually works. The only optimization is evaluation of
constant expressions and peephole optimization, so the produced code is
not really in the GCC class :) The thing is that i only after a few
weeks, on spare (=night) time, actually managed to write a C-like
compiler that works. What i learned from earlier attempts is to make ALL
phases of the compiler from start, even though the language is extremely
simple and the produced code is terrible.

> I think you goofed on the productions for moreterms. The N5 is surely
> supposed to be an epsilon...

Yes, "N5" was epsilon when i mailed it... And yes, i goofed.

> expr():
> left = term()
> // initially fall into moreterms
> repeat until exited: // this loop is moreterms
> if lookahead = '+':
> advance lookahead
> right = term()
> newLeft = makenode_op('+', left, right)
> left = newLeft
> // loop back for moreterms
> else if lookahead = '-':
> advance lookahead
> right = term()
> newLeft = makenode_op('-', left, right)
> left = newLeft
> // loop back for moreterms
> else if lookahead can legally follow:
> return left
> else:
> report a syntax error

The above pseudo code makes me a bit wondering though: "return left"
would return a leaf, and not the 'root' of the expression, correct?

I try to figure out how to return the root rather than the leaf, but no
matter how i try, the result is a very cluttered code.

So, if i understand the above code correctly (and made the right
assumption on left vs root returned) then the recursive version has
several advantages; readability, speed and code size. The only drawback
is that the stack needs to be pretty big and/or some mechanism for
tracking the stack to avoid stack overflow (and a certain crash) must be

What is the most common solution in C compilers and similar?

Again, thanks a lot!


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.