Re: concrete vs. abstract syntax trees

Chris F Clark <>
Sat, 14 Feb 2009 16:12:13 -0500

          From comp.compilers

Related articles
concrete vs. abstract syntax trees (eliben) (2009-02-13)
Re: concrete vs. abstract syntax trees (Hans Aberg) (2009-02-14)
Re: concrete vs. abstract syntax trees (Ira Baxter) (2009-02-14)
Re: concrete vs. abstract syntax trees (Chris F Clark) (2009-02-14)
Re: concrete vs. abstract syntax trees (Chris F Clark) (2009-02-14)
| List of all articles for this month |

From: Chris F Clark <>
Newsgroups: comp.compilers
Date: Sat, 14 Feb 2009 16:12:13 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 09-02-050
Keywords: AST
Posted-Date: 14 Feb 2009 16:56:59 EST

eliben <> writes:

> Is this correct to say that if I semi-mechanically translate the BNF
> of a language to a yacc grammar, what I'll get from parsing is a
> concrete syntax tree?

Yes, a concrete syntax tree is intended to have a direct (1-1)
relationship with the grammar. The concrete part of the name is
describing that relationship.

> If so, this tree is almost useless for any kind
> of interesting interaction (especially for languages like C where some
> things like types don't fit hierarchically in a way that resembles the
> concrete syntax). Is this right?

Concrete systax trees tend to over-promote certain artifacts of the
grammar in the shape of the tree. If the grammar is left-recursive,
the trees tower one way. A right recrusive grammar (for the same
language) would have trees that tower in the opposite direction. An
ELR grammar, using regular expressions, might not have a tree at all
but rather a flat list.

> Also, do compilers more commonly go through the concrete-syntax stage
> or do they usually build the AST directly in the parser?

You need the concrete syntax in the parser itself. However, does it
ever exist as a memory (data structure) image other than on the stack
of the parser? I think generally not except when the user is lazy in
a particular way. It's generally actually easier to build an AST that
captures only the information one needs (e.g. parameters go in a list,
not a tree but expressions make trees) than it was to make a CST
shaped one and work on that. The CST format only occurs when one
can't conveniently (or is too lazy to think about the downstream uses)
build a correctly shaped representation. For example, it is easier
sometimes to build a list like the parameter list backwards--that's a
CST type of choice and may be acceptable in some places. However,
with more effort, one can build the parameter list in "correct" order,
if that is important to do so.

Hope this helps,

Chris Clark Internet:
Compiler Resources, Inc. or:
23 Bailey Rd Web Site:
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.