From: | gah4 <gah4@u.washington.edu> |
Newsgroups: | comp.compilers |
Date: | Fri, 18 Mar 2022 14:08:26 -0700 (PDT) |
Organization: | Compilers Central |
References: | 22-03-019 22-03-025 22-03-032 22-03-037 |
Keywords: | parse, Fortran |
Posted-Date: | 18 Mar 2022 18:43:30 EDT |
In-Reply-To: | 22-03-037 |
On Friday, March 18, 2022 at 11:00:40 AM UTC-7, Kaz Kylheku wrote:
> So to answer the questions, if you're assuming that you're going to be
> using the traditional framework, with a regular-expression-driven
> lexer and a LR parser with 1 token of lookahead, the way you divide
> the work, roughly, is by identifying the token-like elements in the
> language that have regular syntax. Anything that exhibits nesting,
> requiring rule recursion, will be farmed off to the parser. Your
> decision could sometimes also be informed by the lookahead concern;
> you could choose to clump together several items that might plausibly
> be individual tokens into a "supertoken" if there is some parsing
> advantage in it, or other simplification.
Besides simplifying parsing, it also makes for better error messages.
For one, if the lexer has the whole token in hand, the message can say which token it was.
> Other kinds of decisions interact with the language definition.
> For instance, what is -1234? In Common Lisp, and most other Lisp
> dialects, I suspect, that is a token denoting a negative integer.
> It may not be written - 1234. In C, and languages imitating its syntax,
> -1234 is a unary expression: the unary - operator applied to the
> integer 1234, and so there are two tokens. They may be separated
> by whitespace, or a comment.
> This strikes at the language definition, because what it implies is
> that C does not have negative constants.
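To make the distinction above concrete, here is a minimal sketch of the two tokenizing styles. The function names `lisp_tokens` and `c_tokens` are made up for illustration, and both handle only integers and single-character operators, which is far less than either language actually needs.

```python
import re

def lisp_tokens(text):
    """Lisp-style sketch: whitespace delimits tokens, and a leading
    sign is part of the integer token itself."""
    out = []
    for word in text.split():
        if re.fullmatch(r"[+-]?\d+", word):
            out.append(("INT", int(word)))
        else:
            out.append(("SYMBOL", word))
    return out

def c_tokens(text):
    """C-style sketch: integer tokens are unsigned, and '-' is always
    a separate operator token, with or without whitespace after it."""
    out = []
    for m in re.finditer(r"(\d+)|(\S)", text):
        if m.group(1):
            out.append(("INT", int(m.group(1))))
        else:
            out.append(("OP", m.group(2)))
    return out

print(lisp_tokens("-1234"))   # one token
print(lisp_tokens("- 1234"))  # two tokens: a symbol and an integer
print(c_tokens("-1234"))      # two tokens either way
print(c_tokens("- 1234"))
```

In the Lisp-style version the whitespace changes the token stream; in the C-style version it does not, and the negative value only exists after the parser applies the unary operator.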
It gets more interesting in Fortran.
Fortran mostly doesn't have signed constants, so -1234 would be a unary
expression. But there are some places where constants are allowed and
expressions are not, and in those places it allows signed constants.
The most important one, and maybe the only one left, is the DATA statement.
Well, it used to be that constants and variables, but not expressions, were
allowed in DO statements, but then again those didn't even allow 0,
let alone negative constants.
I suspect, though, that even though DATA statements are defined to take
signed constants, the parser can accept unary expressions there and not
tell anyone. Well, it might affect error messages, too.
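A rough sketch of that idea, not taken from any real Fortran compiler: reuse a general unary-expression rule for the DATA value, then check the shape of the result and fold it into a signed constant. The names `tokenize`, `parse_unary`, and `data_value` are hypothetical, and only integer constants are handled.

```python
import re

def tokenize(text):
    """Unsigned integer tokens plus single-character operator tokens."""
    return [("INT", int(m.group(1))) if m.group(1) else ("OP", m.group(2))
            for m in re.finditer(r"(\d+)|(\S)", text)]

def parse_unary(tokens, pos=0):
    """General rule: unary := ('+' | '-') unary | INT.
    Returns (tree, next_pos); a tree is an int or an (op, tree) pair."""
    if pos < len(tokens) and tokens[pos] in (("OP", "+"), ("OP", "-")):
        operand, next_pos = parse_unary(tokens, pos + 1)
        return (tokens[pos][1], operand), next_pos
    if pos < len(tokens) and tokens[pos][0] == "INT":
        return tokens[pos][1], pos + 1
    raise SyntaxError("expected a constant")

def data_value(text):
    """Parse a DATA value with the expression rule, then insist that the
    result looks like [sign] constant and fold it to a plain integer."""
    tokens = tokenize(text)
    tree, pos = parse_unary(tokens)
    if pos != len(tokens):
        raise SyntaxError("DATA value must be a constant, not an expression")
    if isinstance(tree, int):
        return tree
    op, operand = tree
    if not isinstance(operand, int):
        raise SyntaxError("DATA value allows at most one sign")
    return -operand if op == "-" else operand

print(data_value("-1234"))
print(data_value("+7"))
```

The user never sees that an expression rule did the work, but the wording of the errors is where it can leak: the messages have to be phrased in terms of constants even though the parser was thinking in terms of expressions.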