Re: Is C++ LL(k)?

"Mike Dimmick" <mike@dimmick.demon.co.uk>
27 Jul 2001 02:55:10 -0400

          From comp.compilers

Related articles
Re: Is C++ LL(k)? lingu@cs.pku.edu.cn (Lin Gu) (2001-07-23)
Re: Is C++ LL(k)? kaz@ashi.footprints.net (2001-07-27)
Re: Is C++ LL(k)? mike@dimmick.demon.co.uk (Mike Dimmick) (2001-07-27)
Re: Is C++ LL(k)? isaac@latveria.castledoom.org (2001-08-06)
| List of all articles for this month |
From: "Mike Dimmick" <mike@dimmick.demon.co.uk>
Newsgroups: comp.lang.c++,comp.compilers
Date: 27 Jul 2001 02:55:10 -0400
Organization: Compilers Central
References: <9jh5uu$olr$1@sunlight.pku.edu.cn> <mSZ67.20253$zb.360918@news1.rdc1.bc.home.com> 01-07-138
Keywords: C++, parse
Posted-Date: 27 Jul 2001 02:55:09 EDT

"Lin Gu" <lingu@cs.pku.edu.cn> wrote in message
news:01-07-138@comp.compilers...
> Thanks for the clear elumination.
>
> Your example also reminds me the 'if...if...else' ambiguation. Surely it is
> not LR(k).
>
> However, I may have to use an LL(k) compiler generator (Antlr) to
> write a compiler for it. Is it difficult? It is expected that I need
> to add some rules to disambiguate, but I want to know whether this is
> feasible.


If it's not LR(k), it can't possibly be LL(k).


My own experiences, with PCCTS (ANTLR v1.x - the whole tool set is now
incorporated into one command-line tool with ANTLR 2.x), suggest that C++ is
not LL(k). I don't think it's LR(k) either. I may be wrong - I don't have
an LR(k) tool to test with. I've been considering building one. I was
personally attempting to get it down to LL(1) but this was before I
understood exactly how PCCTS k>1 parsing works. Don't try, you'll just
exhaust yourself for little visible benefit. My feeling is that you need at
least LL(3), although the largest lookahead I tried was 2 tokens.


For example, the steering logic around the nested-name-specifier suggests
that at least three, possibly more, tokens of lookahead are necessary.
However, the main problem with C++ is that user-defined types are used as
keywords - this is perfectly intellegible to a human, but means you need to
disambiguate identifiers in some contexts. You can do this in PCCTS with
semantic predicates, but you still need to navigate the symbol table at
parse time - yuck.


There is a moderately successful C++ parser based around a rewritten version
of PCCTS by John Lilley; you can find links to it on
http://www.polhode.com/pccts.html.


Now, if you only need a cut-down C++ (no nested classes, namespaces), then
you can probably use ANTLR. The NeXT-derived grammar at polhode.com is
probably sufficient for this.


HTH,


--
Mike Dimmick


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.