Re: What stage should entities be resolved?

Roger L Costello <costello@mitre.org>
Tue, 15 Mar 2022 11:49:15 +0000

          From comp.compilers

From: Roger L Costello <costello@mitre.org>
Newsgroups: comp.compilers
Date: Tue, 15 Mar 2022 11:49:15 +0000
Organization: Compilers Central
References: 22-03-019 22-03-025
Keywords: parse, design
Posted-Date: 17 Mar 2022 14:41:44 EDT
Content-Language: en-US



Thank you, DoDi, Chris, and Matt. You have provided truly exceptional information.


One thing that I am still unclear about is this:


How much knowledge of the language should each stage have?


For instance, as I understand it, a C preprocessor scans a C program and expands macros. With this:


#define PI 3.14


the preprocessor will convert this:


area = PI * radius * radius;


to this:


area = 3.14 * radius * radius;


But if PI is inside a quoted string:


"Today is PI day"


then the preprocessor does not replace PI.


So the preprocessor has some knowledge of the language: if a macro name appears inside a string literal, don’t replace it.
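
To make the string-literal rule concrete, here is a minimal Python sketch of object-like macro substitution. It is not how a real C preprocessor is implemented; it only shows that a small amount of lexical knowledge (recognizing string literals as single units) is enough to leave PI alone inside quotes. The names TOKEN and expand_macros are made up for this illustration, and the sketch ignores comments, character literals, and function-like macros.

```python
import re

# One pattern with two alternatives: a double-quoted string literal
# (allowing backslash escapes) or an identifier. Because the scan moves
# left to right, a string literal is consumed whole, so identifiers
# inside it are never offered for substitution.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*')

def expand_macros(source, macros):
    """Replace identifiers found in `macros`, leaving string literals untouched."""
    def substitute(match):
        text = match.group(0)
        if text.startswith('"'):
            return text                   # string literal: copy through unchanged
        return macros.get(text, text)     # identifier: replace only if it is a macro
    return TOKEN.sub(substitute, source)

macros = {"PI": "3.14"}
print(expand_macros("area = PI * radius * radius;", macros))
print(expand_macros('puts("Today is PI day");', macros))
```

Note that the substitution works on whole identifiers, so a name such as PIE is not affected by the PI macro either.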


Similarly, in XML, if &amp; appears inside a CDATA section:


<![CDATA[&amp;]]>


then a preprocessor must not replace &amp; with &. That is, the preprocessor must have some knowledge of the language: if an entity reference appears inside a CDATA section, don’t replace it.
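
The same idea can be sketched for XML. The Python below resolves the five predefined entities but copies CDATA sections through verbatim. The names PREDEFINED, PIECE, and resolve_entities are hypothetical. One caveat: this is only safe on character data, because replacing &lt; with < before the markup has been recognized would turn data into something that looks like a tag, which is part of why the choice of stage matters.

```python
import re

# The five entities predefined by XML.
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# Either a whole CDATA section or a named entity reference. A CDATA section
# is consumed as one unit, so references inside it are never seen separately.
PIECE = re.compile(r"<!\[CDATA\[.*?\]\]>|&([A-Za-z]+);", re.DOTALL)

def resolve_entities(text):
    """Resolve predefined entity references outside CDATA sections."""
    def substitute(match):
        name = match.group(1)
        if name is None:
            return match.group(0)                  # CDATA section: leave as-is
        return PREDEFINED.get(name, match.group(0))  # unknown entity: leave as-is
    return PIECE.sub(substitute, text)

print(resolve_entities("fish &amp; chips"))
print(resolve_entities("<![CDATA[&amp;]]>"))
```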


So that brings me to my questions:


1. How much knowledge of the language should the preprocessor stage have?


2. How much knowledge of the language should the lexical analysis stage have?


3. How much knowledge of the language should the syntax analysis stage have?


4. How much knowledge of the language should the semantic analysis stage have?


To make the questions concrete, consider this XML:


<foo>…</foo>


Should the lexical analysis stage know that the foo in <foo> is a
start tag (STAG) and the foo in </foo> is an end tag (ETAG)? That
would mean the lexical analysis stage has considerable knowledge of
the XML language. Or should the lexical analysis stage simply identify
the foo in <foo> as a name (NAME) and the foo in </foo> as a name
(NAME)? That would mean the lexical analysis stage has less
knowledge of the XML language. How much knowledge of the XML language
should the lexical analysis stage have?


/Roger
[I'd say start and end tags are sufficiently basic that they are different,
but your life is made more complicated with tags like <foo opt="42" />
which are both. Since you'll need to parse attributes, lexemes like
< </ > /> make sense. -John]
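
One way to read the moderator's suggestion is sketched below in Python: the lexer knows only the punctuation lexemes (<, </, >, />) plus names and attribute values, and the decision "start tag, end tag, or both" is made one level up, from the token sequence. The token kind names, lex_tag, and classify_tag are all invented for this illustration, the NAME pattern is much simpler than real XML names, and the sketch handles only the inside of a single tag, not the character data between tags.

```python
import re

# Token kinds (hypothetical names). Longer lexemes come first so that
# "</" is not split into "<" followed by a stray "/".
TOKEN_SPEC = [
    ("LT_SLASH", r"</"),
    ("SLASH_GT", r"/>"),
    ("LT",       r"<"),
    ("GT",       r">"),
    ("EQ",       r"="),
    ("STRING",   r'"[^"]*"'),
    ("NAME",     r"[A-Za-z_][\w.-]*"),
    ("WS",       r"\s+"),
]
TOKEN = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))

def lex_tag(text):
    """Split one tag into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group(0)))
        pos = match.end()
    return tokens

def classify_tag(tokens):
    """Parser-level decision made from the first and last tokens."""
    if tokens[0][0] == "LT_SLASH":
        return "ETAG"
    if tokens[-1][0] == "SLASH_GT":
        return "EMPTY"    # a tag that is both start and end
    return "STAG"

for tag in ("<foo>", "</foo>", '<foo opt="42" />'):
    tokens = lex_tag(tag)
    print(classify_tag(tokens), tokens)
```

With this split, foo is lexed as NAME in all three cases, and only the surrounding punctuation tells the later stage which kind of tag it belongs to.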


