Re: Bottom-up versus Top-down

Henry Spencer <>
30 Nov 1997 22:53:02 -0500

          From comp.compilers

From: Henry Spencer <>
Newsgroups: comp.compilers
Date: 30 Nov 1997 22:53:02 -0500
Organization: SP Systems, Toronto
References: 97-11-123 97-11-155
Keywords: parse

George C. Lindauer <> wrote:
>...Top down parsing is much easier to implement, but,
>bottom-up parsing is more efficient. YACC and other code generators
>tend to generate the type of state tables required to do bottom-up
>parsing efficiently...

Exactly the same thing can be done for top-down parsing, although it's
less commonly seen. There is no fundamental reason for there to be
any efficiency difference.
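
To illustrate the point, here is a minimal sketch of a table-driven
top-down (LL(1)) recognizer. The toy grammar, the table contents, and
the names TABLE and ll1_accepts are all assumptions made up for this
example; a real generator would compute the table from the grammar.
The driver loop is the same kind of table lookup plus stack
manipulation that a bottom-up driver performs.

```python
# Hypothetical LL(1) table for a toy expression grammar (not from the post):
#   E  -> T E2        E2 -> + T E2 | empty
#   T  -> F T2        T2 -> * F T2 | empty
#   F  -> ( E ) | id
# Key: (nonterminal, lookahead token); value: right-hand side to expand to.
TABLE = {
    ("E", "id"): ["T", "E2"], ("E", "("): ["T", "E2"],
    ("E2", "+"): ["+", "T", "E2"], ("E2", ")"): [], ("E2", "$"): [],
    ("T", "id"): ["F", "T2"], ("T", "("): ["F", "T2"],
    ("T2", "*"): ["*", "F", "T2"],
    ("T2", "+"): [], ("T2", ")"): [], ("T2", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E2", "T", "T2", "F"}


def ll1_accepts(tokens):
    """Return True if the token list is a sentence of the toy grammar."""
    tokens = list(tokens) + ["$"]      # "$" marks end of input
    stack = ["$", "E"]                 # predicted symbols, start symbol on top
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:            # no table entry: syntax error
                return False
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:              # predicted terminal matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)
```

For instance, ll1_accepts(["id", "+", "id", "*", "id"]) returns True,
while ll1_accepts(["id", "+"]) returns False.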

The real dichotomy between the two is that top-down is less powerful
in terms of the grammars it can handle, but more powerful in terms of
what it can do in the way of supporting semantics etc. The reason is
simply that top-down always knows the context of the current tokens --
what the higher-level constructs around them are -- while bottom-up
discovers this afterward.

Top-down is restricted to grammars where it is possible to determine
the context beforehand, where the nature of a construct can be
determined by inspecting its beginning. For example, top-down with
the usual one token of lookahead has difficulty with some aspects of C
expression syntax, where "(" may introduce a parenthesized
subexpression [a + (b * c)] or a unary conversion [a + (unsigned) b],
and a top-down parser has to decide which way to go before seeing what
follows. (The usual fix for this is to cheat slightly, adding a bit
more lookahead to see whether the next token is part of a type name or
not.)
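
The extra-lookahead fix can be sketched in a small hand-written
recursive-descent parser. Everything here is hypothetical: the Parser
class, the tuple-shaped tree nodes, and the TYPE_WORDS set, which
stands in for a real "is this a type name?" test and ignores typedef
names entirely. The only point is the two-token peek in unary().

```python
# Assumed stand-in for a real type-name test; typedef names are ignored.
TYPE_WORDS = {"unsigned", "signed", "int", "char", "long", "short"}


class Parser:
    def __init__(self, tokens):
        self.toks = list(tokens) + ["<eof>"]
        self.pos = 0

    def peek(self, k=0):
        # Look k tokens ahead without consuming; clamp at the end marker.
        return self.toks[min(self.pos + k, len(self.toks) - 1)]

    def take(self, expected=None):
        tok = self.peek()
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):                    # expr := term ('+' term)*
        node = self.term()
        while self.peek() == "+":
            self.take("+")
            node = ("+", node, self.term())
        return node

    def term(self):                    # term := unary ('*' unary)*
        node = self.unary()
        while self.peek() == "*":
            self.take("*")
            node = ("*", node, self.unary())
        return node

    def unary(self):
        if self.peek() == "(" and self.peek(1) in TYPE_WORDS:
            # The second token of lookahead says this "(" opens a cast.
            self.take("(")
            type_name = []
            while self.peek() in TYPE_WORDS:
                type_name.append(self.take())
            self.take(")")
            return ("cast", " ".join(type_name), self.unary())
        if self.peek() == "(":
            # Otherwise it is an ordinary parenthesized subexpression.
            self.take("(")
            node = self.expr()
            self.take(")")
            return node
        tok = self.take()
        if not tok.isidentifier() or tok in TYPE_WORDS:
            raise SyntaxError(f"unexpected {tok!r}")
        return ("name", tok)


def parse(tokens):
    p = Parser(tokens)
    node = p.expr()
    p.take("<eof>")                    # reject trailing junk
    return node
```

With this sketch, parse("a + ( b * c )".split()) yields a "*" node
under the "+", and parse("a + ( unsigned ) b".split()) yields a
("cast", "unsigned", ...) node instead.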

On the other hand, because top-down knows the context at all times, it
can exploit that information. For example, the operand of the C
"sizeof" operator can be an expression, but it is not evaluated --
only the type of its result matters -- and a top-down parser can
switch off code generation while parsing the operand. Bottom-up
parsers often have to postpone work, building intermediate data
structures to remember information that upper levels may want to use,
because they don't know the context well enough to do anything with it
immediately. (Of course, this may not look like a big penalty if the
architecture of a multi-pass compiler dictates that those structures
be built anyway.)
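
The "sizeof" trick might look roughly like this in a one-pass
top-down compiler that emits stack-machine code as it parses. This is
only a sketch under made-up assumptions: the Compiler class, the
instruction names, the SIZES table, and the crude typing rule (the
wider operand wins, but never narrower than int) are invented for
illustration and are not C's real conversion rules.

```python
SIZES = {"char": 1, "int": 4, "long": 8}   # assumed toy type sizes


class Compiler:
    def __init__(self, tokens, var_types):
        self.toks = list(tokens) + ["<eof>"]
        self.pos = 0
        self.var_types = var_types     # variable name -> type name
        self.code = []
        self.emitting = True           # the code-generation switch

    def emit(self, instr):
        if self.emitting:
            self.code.append(instr)

    def peek(self):
        return self.toks[self.pos]

    def take(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def expr(self):                    # expr := unary ('+' unary)*
        ty = self.unary()
        while self.peek() == "+":
            self.take("+")
            rhs = self.unary()
            self.emit("add")
            # Toy rule: result is the widest of lhs, rhs, and int.
            ty = max(ty, rhs, "int", key=SIZES.get)
        return ty

    def unary(self):                   # returns the type of the operand
        tok = self.peek()
        if tok == "sizeof":
            self.take("sizeof")
            saved = self.emitting
            self.emitting = False      # operand is parsed and typed, not evaluated
            try:
                ty = self.unary()
            finally:
                self.emitting = saved
            self.emit(f"push {SIZES[ty]}")
            return "long"              # toy stand-in for size_t
        if tok == "(":
            self.take("(")
            ty = self.expr()
            self.take(")")
            return ty
        if tok not in self.var_types:
            raise SyntaxError(f"unexpected {tok!r}")
        self.pos += 1
        self.emit(f"load {tok}")
        return self.var_types[tok]


def compile_expr(source, var_types):
    c = Compiler(source.split(), var_types)
    c.expr()
    c.take("<eof>")
    return c.code
```

Given types {"a": "int", "b": "char", "c": "long"}, compiling
"a + sizeof ( b + c )" produces ["load a", "push 8", "add"]: the
operand is parsed and typed, but no loads or adds are emitted for it.
The parser can do this because it already knows it is inside a sizeof
operand when it reaches b and c.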

My own feeling, for what it's worth, is that top-down parsers are
generally underrated, but that the two types are good at different
things. Bottom-up's greater parsing power makes it the method of
choice for doing experimental work -- e.g., tinkering with a new
notation -- or for dealing with pre-existing languages with difficult
syntax. Top-down's limited power increases the investment needed to
make it work, but gives more leverage for semantics once it does, so
it can be a better tool for "production" work with a stable and
well-behaved input language. And in real life, the choice can be
dictated one way or the other by available tools: bottom-up tools are
more common, but if circumstances dictate doing without any
parser-generation tools at all, top-down is much easier to hand-code.
| Henry Spencer
