|Re: Programming language and IDE design email@example.com (Martin Ward) (2013-11-16)|
|Re: Programming language and IDE design firstname.lastname@example.org (George Neuner) (2013-11-18)|
|Re: Programming language and IDE design email@example.com (2014-03-02)|
|Re: LL vs LR, was Programming language and IDE design firstname.lastname@example.org (Ivan Godard) (2014-03-01)|
|Re: LL vs LR, was Programming language and IDE design email@example.com (Kaz Kylheku) (2014-03-02)|
|From:||Kaz Kylheku <firstname.lastname@example.org>|
|Date:||Sun, 2 Mar 2014 18:41:00 +0000 (UTC)|
|Organization:||Aioe.org NNTP Server|
|References:||13-11-016 13-11-018 14-03-003 14-03-004|
|Keywords:||C, parse, tools|
|Posted-Date:||03 Mar 2014 06:26:49 EST|
On 2014-03-02, Ivan Godard <email@example.com> wrote:
> On 3/1/2014 4:03 PM, Hannah wrote:
>> So the C grammar is not LL(k) for any k, either. You can probably parse
>> that particular part of the C/C++ grammar easily using LR.
> The standard way to handle this, and many such other cases where a
> formal grammar is not LL, is to introduce a function (in recursive
> descent) or rule (in parser generators) that recognizes "parenthesized
> thingy" and then figure it out later. Yes, that amounts to LR, but you
> only need it in a few specialized places. There's an equivalent trick
> in "DOI=1," in Fortran, where you let the lexer resolve it so the
> parser sees proper lexemes, not characters.
You can parse "anything" with LL(k) with backtracking. What you do is
you try various different parses, in a carefully designed order, until
something matches, with backtracking.
> [C++ is unusually bad, with some parts just plain ambiguous. In the example
> a few messages back, I gather that the rule is that if you can parse it as a
> declaration, it's a declaration, otherwise it's something else. -John]
Years ago (1999) I wrote a parser for C expression syntax (only the expression
syntax) as part of a module which could diagnose, at run-time, unsafe uses of
a C macro that contains multiple evaluations:
The source code is here:
E.g. dumb example:
#define INCR(A) (SFX_CHECK(A) = (A) + 1)
This SFX_CHECK (actually KAZLIB_SFX_CHECK, for hygiene) expands to code which
produces the value of (A), but also hands the expression "A" to a run-time
checker which answers the question, does the C expression "A" have side
The answer to the question is "yes, no, or possibly".
(Also, the expression "A" is put into a hash table which caches the
result for next time, so the checking is faster, when turned on.)
If the answer is other than no, then a diagnostic is printed, with the
expression "A", and file name and line number where that macro was
expanded. The diagnostic tells you whether there is a confirmed problem
in that line or just suspicion.
The parser has way to know how a symbol is declared, and so it uses
In fact, an exception library based on setjmp/longjmp is used for some
backtracking returns. There is logic like, "if parsing it this way throws a
syntax error, try it this way".
For instance, what is "(a)(b)"? This could be that function a is being
called with argument b. Or it could be that a is a type, and b is being
cast to that type.
I remember parts of this being somewhat tough to write, but it looks fairly
transparent to me today. That was about a year before I discovered Lisp.
Return to the
Search the comp.compilers archives again.