Related articles |
---|
Re: What stage should entities be resolved? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-12) |
Re: What stage should entities be resolved? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2022-03-14) |
Re: What stage should entities be resolved? costello@mitre.org (Roger L Costello) (2022-03-15) |
Re: What stage should entities be resolved? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2022-03-18) |
Re: What stage should entities be resolved? gah4@u.washington.edu (gah4) (2022-03-17) |
Re: What stage should entities be resolved? 480-992-1380@kylheku.com (Kaz Kylheku) (2022-03-18) |
Re: What stage should entities be resolved? gah4@u.washington.edu (gah4) (2022-03-18) |
Re: What stage should entities be resolved? martin@gkc.org.uk (Martin Ward) (2022-03-19) |
Re: What stage should entities be resolved? matt.timmermans@gmail.com (matt.ti...@gmail.com) (2022-03-20) |
From: | Roger L Costello <costello@mitre.org> |
Newsgroups: | comp.compilers |
Date: | Tue, 15 Mar 2022 11:49:15 +0000 |
Organization: | Compilers Central |
References: | 22-03-019 22-03-025 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="73851"; mail-complaints-to="abuse@iecc.com" |
Keywords: | parse, design |
Posted-Date: | 17 Mar 2022 14:41:44 EDT |
Content-Language: | en-US |
Thank you DoDi, Chris, and Matt. You have provided truly exceptional information.
One thing that I am still unclear about is this:
How much knowledge of the language should each stage have?
For instance, as I understand it, a C preprocessor goes through a C program and replaces macros. With this:
#define PI 3.14
the preprocessor will convert this:
area = PI * radius * radius;
to this:
area = 3.14 * radius * radius;
But if PI is inside a quoted string:
"Today is PI day"
then the preprocessor does not replace PI.
So the preprocessor has some knowledge about the language: If a macro is within a quoted string, then don’t replace it.
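One way to picture how little knowledge that takes is a minimal sketch in Python, not how any real C preprocessor is implemented. It assumes object-like macros only (no function-like macros, comments, character literals, or rescanning), and the names `TOKEN` and `expand` are made up for illustration. The substituter splits the line into tokens and looks up identifier tokens only, so a string literal passes through as one opaque token:

```python
import re

# Hypothetical mini-preprocessor: object-like macros only; comments,
# character literals and rescanning of replaced text are not handled.
TOKEN = re.compile(r'''
    "(?:\\.|[^"\\])*"        # string literal, copied through untouched
  | [A-Za-z_][A-Za-z0-9_]*   # identifier, candidate for macro replacement
  | .                        # any other single character
''', re.VERBOSE | re.DOTALL)

def expand(line, macros):
    """Replace identifier tokens that name a macro; leave all else alone."""
    out = []
    for match in TOKEN.finditer(line):
        tok = match.group(0)
        # A string literal is a single token, so a PI inside it is never
        # seen as an identifier and never looked up on its own.
        out.append(macros.get(tok, tok))
    return "".join(out)

macros = {"PI": "3.14"}
print(expand("area = PI * radius * radius;", macros))
# prints: area = 3.14 * radius * radius;
print(expand('puts("Today is PI day");', macros))
# prints: puts("Today is PI day");
```

In this sketch the only language knowledge the substituter has is lexical: what a string literal looks like and what an identifier looks like. It knows nothing about expressions or statements.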
Similarly, in XML if &amp; is embedded inside a CDATA section:
<![CDATA[&amp;]]>
then a preprocessor must not replace &amp; with &. That is, the preprocessor must have knowledge about the language: If an XML entity is within a CDATA section, then don’t replace it.
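The CDATA rule can be sketched the same way. This is a rough illustration that assumes the entity in question is `&amp;`, handles only the five predefined XML entities (no character references, no DTD-declared entities), and works on the text content of a single element with no child tags. `PREDEFINED`, `PIECE` and `resolve_text` are hypothetical names. The pattern matches a whole CDATA section before it gets a chance to match an entity reference inside it:

```python
import re

# Only the five predefined XML entities; character references such as
# &#38; and DTD-declared entities are left out of this sketch.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "apos": "'", "quot": '"'}

# Either a whole CDATA section (group 1) or one entity reference (group 2).
PIECE = re.compile(r"<!\[CDATA\[(.*?)\]\]>|&([A-Za-z]+);", re.DOTALL)

def resolve_text(text):
    """Return the character data of `text`: entities resolved outside
    CDATA sections, CDATA contents taken literally."""
    def repl(m):
        if m.group(1) is not None:
            # CDATA section: its contents are literal, nothing is resolved.
            return m.group(1)
        # Entity reference: unknown names are left exactly as written.
        return PREDEFINED.get(m.group(2), m.group(0))
    return PIECE.sub(repl, text)

print(resolve_text("a &amp; b <![CDATA[&amp;]]>"))
# prints: a & b &amp;
```

Because the scan runs left to right and a CDATA section is consumed as one unit, the resolver needs to recognise the CDATA delimiters and nothing else about XML.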
So that brings me to my questions:
1. How much knowledge of the language should the preprocessor stage have?
2. How much knowledge of the language should the lexical analysis stage have?
3. How much knowledge of the language should the syntax analysis stage have?
4. How much knowledge of the language should the semantic analysis stage have?
To make the questions concrete, consider this XML:
<foo>…</foo>
Should the lexical analysis stage know that the foo in <foo> is a
start tag (STAG) and the foo in </foo> is an end tag (ETAG)? That
would mean the lexical analysis stage has considerable knowledge of
the XML language. Or should the lexical analysis stage simply identify
the foo in <foo> as a name (NAME) and the foo in </foo> as a name
(NAME)? That would mean the lexical analysis stage has lesser
knowledge of the XML language. How much knowledge of the XML language
should the lexical analysis stage have?
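To make the "lesser knowledge" option concrete, here is a rough Python sketch of such a lexer. It is one possible design, not a recommendation from the XML specification, and the token names (`LT`, `LT_SLASH`, `GT`, `SLASH_GT`, `NAME`, `EQ`, `STRING`, `TEXT`) are invented for illustration. Comments, CDATA, processing instructions and entity references are ignored. The lexer reports punctuation and names only, and a parser would decide from the preceding `LT` or `LT_SLASH` whether a `NAME` belongs to a start tag or an end tag:

```python
import re

# Hypothetical token set for text inside a tag; comments, CDATA,
# processing instructions and entity references are out of scope.
TAG_TOKEN = re.compile(r"""
    (?P<WS>\s+)
  | (?P<SLASH_GT>/>)
  | (?P<GT>>)
  | (?P<EQ>=)
  | (?P<STRING>"[^"<]*"|'[^'<]*')
  | (?P<NAME>[A-Za-z_:][A-Za-z0-9_:.\-]*)
""", re.VERBOSE)

def tokenize(xml):
    """Split `xml` into (kind, text) pairs without classifying tags."""
    tokens, pos, in_tag = [], 0, False
    while pos < len(xml):
        if not in_tag:
            if xml.startswith("</", pos):
                tokens.append(("LT_SLASH", "</"))
                pos += 2
                in_tag = True
            elif xml[pos] == "<":
                tokens.append(("LT", "<"))
                pos += 1
                in_tag = True
            else:
                # Character data runs up to the next '<' or end of input.
                end = xml.find("<", pos)
                end = len(xml) if end == -1 else end
                tokens.append(("TEXT", xml[pos:end]))
                pos = end
            continue
        m = TAG_TOKEN.match(xml, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup == "WS":
            continue  # whitespace inside a tag separates tokens only
        tokens.append((m.lastgroup, m.group(0)))
        if m.lastgroup in ("GT", "SLASH_GT"):
            in_tag = False  # back to character data
    return tokens

print(tokenize('<foo opt="42" />'))
```

Even this version is not free of language knowledge: it has to track whether it is inside or outside a tag, since `foo` is a `NAME` in one place and plain `TEXT` in the other.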
/Roger
[I'd say start and end tags are sufficiently basic that they are different,
but your life is made more complicated with tags like <foo opt="42" />
which are both. Since you'll need to parse attributes, lexemes like
< </ > /> make sense.-John]