From: | "Arthur J. O'Dwyer" <ajo@andrew.cmu.edu> |
Newsgroups: | comp.compilers |
Date: | 29 Jul 2006 14:06:22 -0400 |
Organization: | Carnegie Mellon, Pittsburgh, PA |
References: | 06-07-059 06-07-065 06-07-071 06-07-083 06-07-095 |
Keywords: | parse |
Posted-Date: | 29 Jul 2006 14:06:22 EDT |
On Fri, 28 Jul 2006, Hans-Peter Diettrich wrote:
> Arthur J. O'Dwyer schrieb:
>
>>>> # 2. Most languages are ambiguous at the syntax level, so that implicit
>>>> # disambiguation (longest match...) or explicit semantical constraints
>>>> # must be introduced. (see: dangling else...).
>>>>
>>>> Only poorly designed programming languages are ambiguous. (Natural
>>>> languages are ambiguous and don't use deterministic language theory.)
>>> Counterexample:
>>> The argument list of a subroutine is nothing but a list, which can be
>>> expressed using either left or right recursion in a grammar.
>>
>> That's a true statement, but I don't see what it's a "counterexample"
>> to.
>
> With regards to "poorly designed programming languages".
(This has been better addressed elsewhere in this thread; but when
(SM Ryan?) said "Only poorly designed programming languages are
ambiguous," he was probably talking about /inherently ambiguous/
languages. Comma-separated lists can be parsed by at least two different
/unambiguous/ grammars, so they're not an example of an inherently
ambiguous language.)
> After all I agree, that a language with a stricter definition leaves
> less room for different opinions about the implementation. Okay?
Yes, I think we agree there. :)
[...]
>> In the same way, it's possible to create many different "parse trees"
>> for a C-language "sentence", indicated here with indentation levels:
>>
>> if (x) if (x) if (x)
>> if (y) if (y) if (y)
>> foo(); foo(); foo();
>> else else else
>> bar(); bar(); bar();
>>
>> Only the leftmost parse tree is useful in understanding the semantics
>> of the C program; therefore, we call it "correct", and design our
>> grammars to create this parse tree in preference to the other two.
>
> How will you determine, in this special case or in general, which
> grammars or parse trees are "correct"?
The "correct" parse tree is the one that helps you solve your problem.
If you're compiling C, then I think it's (subjectively) obvious that the
leftmost parse tree is best suited for that task, and so you'd use a
grammar that was fleshed out enough to produce that kind of tree.
If you were just counting the number of tokens in the file, you might
use a "dumber" grammar that only knew about how to parse integer and
string literals, splice backslashes, and so on.
If you were just counting the number of commas in the file, you wouldn't
hardly need a grammar at all.
My point is, and was, that you were complaining about languages
admitting of "different parse trees" or "different grammars" as if that
was unexpected and/or a bad thing. It's not.
[then, on debugging yacc-alike parser generators...]
>> Again, you seem to confuse human-centered subjective terms like
>> "preserve the relationship" with measurable terms. Obviously there is
>> /some/ kind of special relationship between a yacc-generated parser
>> and its target language --- something that makes it parse C and not
>> Java or pig Latin. You're complaining that the relationship isn't
>> close enough to the surface for humans to see --- but why is that
>> important?
>
> Consider that typically you want to add actions to a grammar. Then you
> must know about the context (stack contents...), in which your code will
> execute.
>
> Finally you may want to debug your grammar and your code. In case of an
> recursive descent parser, the call stack contains much information about
> the parser state, whereas debugging the state transitions of an
> automaton requires additional debugging features.
...Which the tool should provide. (If it doesn't, then you have a bad
tool. Which I guess brings us back to "why are yacc-alikes so hard to
use?", which is code for "why are yacc-alikes targeted at system hackers
instead of normal people?", which practically answers itself: hackers
wrote yacc in the first place, and yacc is mainly used by hardcore
programmers who are going to spend a lot of time debugging anyway. :)
>> After all, the point of table-driven parsers is that humans never
>> /need/ to see the generated code. It "just works."
>
> I found some yacc clones, which all just do *not* work. Now tell me how
> somebody should figure out, what makes these tools misbehave :-(
Ah, well, there you have a problem with the /tool/. That's the
tool-maker's fault, not yours, and you shouldn't concern yourself with
it. ;)
I think I see where you're coming from, but the complaint in this
half of this subthread seems to be "yacc is hard," which I already agreed
with. When you say "yacc is hard because my code has bugs and yacc won't
help me find them," I find myself unsympathetic... maybe because I haven't
spent enough time debugging my own code yet! And when you say "yacc is
hard because /its/ code is full of bugs," I'm sympathetic, but don't see
how that's indicative of anything except laziness on /their/ part. (And
you're still better off than if the tools didn't exist at all, right?)
</software-engineering metadiscussion>
-Arthur
Return to the
comp.compilers page.
Search the
comp.compilers archives again.