|Re: [QUERY] A "ignorant newbie" question about compiler-writing. email@example.com (J. Kanze) (1997-01-30)|
|Re: [QUERY] A "ignorant newbie" question about compiler-writing. firstname.lastname@example.org (Mary Fernandez) (1997-02-11)|
|Re: [QUERY] A "ignorant newbie" question about compiler-writing. email@example.com (1997-02-16)|
|Re: [QUERY] A "ignorant newbie" question about compiler-writing. firstname.lastname@example.org (Norman Ramsey) (1997-02-20)|
|Re: recovery from syntax errors, was "ignorant newbie" question email@example.com (Jerry Leichter) (1997-02-22)|
|Re: recovery from syntax errors, was "ignorant newbie" question firstname.lastname@example.org (Ray Dillinger) (1997-02-22)|
|From:||Jerry Leichter <email@example.com>|
|Date:||22 Feb 1997 23:03:08 -0500|
|Organization:||System Management ARTS|
|References:||97-01-258 97-02-081 97-02-090 97-02-107|
| >It seems obvious that you cannot produce a compiler that will always
| >give a correct second error message, because the compiler cannot know
| >what I actually intended in place of the first error.
| Um, it may seem obvious, but it's not. The algorithm I described
| ... can produce reliable error messages about other parts of your
| program, which don't depend on the first error.
That depends on a highly specific notion of "error" and "depends on".
Consider a typed language with required declarations. X is declared to
be an integer. It is used n > 1 times in the program. Let's look at
the following n+1 cases:
0. All n uses are in locations where an integer is appropriate.
OK, the program is correct.
1. n-1 uses are in locations where an integer is appropriate;
in the remaining one, it isn't.
n. All n uses are in locations where an integer is *not*
appropriate.
What error messages should we expect? Well, in Case 1, we'd like to
see an error message at the one "bad" location. In Case n, it's
"clearly" the *declaration* that's incorrect, and ideally we'd like to
see one error message, not n of them. Where should the transition
between the error message styles occur? It's not at all clear.
After we've seen the first incorrect use, we have two choices: Mark the
variable "tainted", or not. If we mark it "tainted", we won't see a
message for the second incorrect use. If we don't mark it,
subsequent uses will also produce messages. Both approaches are
defensible from a theoretical point of view. The first produces the
"right" result for Case n; the second, the "right" result for cases like
Case 2 or 3.
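
To make the two policies concrete, here is a minimal sketch in Python. It is not the algorithm from the earlier article; the function name, the "taint_on_first_error" flag, and the (location, expected type) representation of a use are all assumptions made up for illustration.

```python
def report_uses(declared_type, uses, taint_on_first_error):
    """Check each use of X against its declared type.

    uses is a list of (location, expected_type) pairs (a hypothetical
    representation). Returns the list of error messages produced.
    """
    messages = []
    tainted = False
    for location, expected in uses:
        if expected == declared_type:
            continue  # this use is fine
        if tainted:
            continue  # X is already tainted: stay silent
        messages.append(
            f"{location}: X is declared {declared_type}, "
            f"but {expected} is expected here"
        )
        if taint_on_first_error:
            tainted = True
    return messages


# Case n with n = 3: every use is wrong.
case_n = [("line 10", "string"), ("line 20", "string"), ("line 30", "string")]
# Case 2 with n = 4: two uses are wrong.
case_2 = [("line 10", "integer"), ("line 20", "string"),
          ("line 30", "integer"), ("line 40", "real")]

print(len(report_uses("integer", case_n, True)))   # one message
print(len(report_uses("integer", case_n, False)))  # one message per bad use
```

With tainting, Case n yields a single message but the second bad use in Case 2 goes unreported; without it, Case 2 is fully reported but Case n yields n messages. Neither setting is right for both.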
More generally, the algorithm assumes that there is some obvious notion
of "an" error, and further that the *first* (earliest in parse order)
occurrence of something that also occurs elsewhere is "correct". What's
really going on in my examples is that we have a number of correlated
occurrences, no one of which is necessarily in error - the error is in
their mutual incompatibility. When this occurs, you don't a priori have
any reason to choose one of the definitions as "correct". Any choice is
ultimately arbitrary. If what you want is "theoretical cleanliness",
any choice is probably as good as any other. If what you are trying to
achieve is the best practical error reporting - there's much more work
to be done.
As another example: Suppose you see what is clearly a declaration for
X, but the declaration is erroneous, so that you can't determine what
type X is supposed to have. The current algorithm would report that
error, then skip over any expression in which X appeared. It would be
just as correct - and probably more useful - to declare X internally as
"declared but type not yet known", then try to infer the type from
subsequent uses. What matters is that the rest of the uses be
consistent with *some* type. Even if you can't determine what this type
is - even if you just ignore the type, in effect declaring X to have a
fictional "universal" type, compatible with everything - you can at
least check the expressions involved for other errors. The same thing
goes if there is no declaration for X at all.
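
A rough sketch of the "universal type" fallback, again in Python and again under invented names: the ANY marker, the symbol table as a plain dict, and the (location, name, expected type) form of a use are assumptions for illustration, not part of any existing compiler. It shows only the simplest variant, where the type is ignored rather than inferred from later uses.

```python
ANY = "<any>"  # hypothetical "universal" type, compatible with everything


def compatible(actual, expected):
    """Two types are compatible if they match or either one is ANY."""
    return actual == ANY or expected == ANY or actual == expected


def check_uses(symbols, uses):
    """symbols maps a name to its type; a name whose declaration was
    erroneous is entered with type ANY. uses is a list of
    (location, name, expected_type) triples. Returns error messages.
    """
    messages = []
    for location, name, expected in uses:
        # A name with no declaration at all is treated the same way.
        actual = symbols.setdefault(name, ANY)
        if not compatible(actual, expected):
            messages.append(
                f"{location}: {name} has type {actual}, "
                f"but {expected} is expected here"
            )
    return messages


# X's declaration was erroneous, Y is a string, Z was never declared.
symbols = {"X": ANY, "Y": "string"}
uses = [
    ("line 3", "X", "integer"),  # accepted: X is compatible with anything
    ("line 4", "Y", "integer"),  # a genuine, independent error
    ("line 5", "X", "string"),   # accepted, even though it conflicts with line 3
    ("line 6", "Z", "integer"),  # accepted: undeclared names become ANY
]
for message in check_uses(symbols, uses):
    print(message)
```

The expressions involving X are still checked, so the unrelated error on Y is reported instead of being skipped. What this variant gives up is the consistency check among X's own uses (lines 3 and 5 above), which is what inferring the type from subsequent uses would recover.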