From: "Zack Weinberg" <zackw@panix.com>
Newsgroups: comp.compilers
Date: 20 Oct 2002 22:45:31 -0400
Organization: PANIX -- Public Access Networks Corp.
References: 02-10-030 02-10-052
Keywords: C, lex
Posted-Date: 20 Oct 2002 22:45:31 EDT
VBDis <vbdis@aol.com> writes:
>"Sönke Kannapinn" <soenke.kannapinn@wincor-nixdorf.com> writes:
>
>>* If it doesn't work: Where are the problems with it? Do you know
>>counter-examples of programming languages where one can't do
>>lexical analysis like that?
>>(I know of Pascal's '..' problem; are there other problem cases?)
>
>Currently I'm trying to construct a C scanner and parser, for cross
>compilation. The C specification mentions more than 3 steps of lexical
>processing, before tokens can be created. IMO the only practical
>solution here is a multi-level scanner, which does all substitutions
>before passing the characters to the next stage.
Not so; a single-pass lexer for C is quite possible, although somewhat
of a pain to implement (it could be made much easier with only trivial
adjustments to the language, but that is a rant for another forum).
Newer (>3.x) GCC has such a lexer ("cpplib").
>I also had some problems with the C preprocessor, which must know
>about escaped and non-escaped line ends in #define. Also in #define
>the leading '(' of an argument list must immediately follow the
>identifier, with no whitespace allowed in between. In #include I had
>problems with the <file> syntax, because '<' is an operator in other
>contexts (expressions), and the allowed characters in a path
>specification differ from other (literal, identifier) character sets.
>To me this looks like a context sensitive lexical grammar?
Yes, indeed.
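
To illustrate the two context-dependent cases mentioned in the quoted
text, here is a minimal sketch, assuming the lexer already knows it is
inside a #define or #include directive. The names classify_define and
lex_header_name are hypothetical and not from any real preprocessor.
The first function distinguishes function-like from object-like macros
by whether '(' directly follows the macro name; the second treats '<'
as the start of a header name rather than as an operator. The quoted
"file" form and macro-expanded #include arguments are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum macro_kind { NOT_A_DEFINE, OBJECT_LIKE, FUNCTION_LIKE };

/* Classify the text that follows "#define" on a directive line.
   Whitespace between the name and '(' makes the macro object-like. */
static enum macro_kind classify_define(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return NOT_A_DEFINE;
    while (isalnum((unsigned char)*s) || *s == '_')
        s++;
    return *s == '(' ? FUNCTION_LIKE : OBJECT_LIKE;
}

/* Lex the text that follows "#include" as a <...> header name.
   Copies the characters between '<' and '>' into out and returns 1,
   or returns 0 if the text is not of that form or does not fit. */
static int lex_header_name(const char *s, char *out, size_t cap)
{
    const char *end;
    size_t len;

    while (*s == ' ' || *s == '\t')
        s++;
    if (*s != '<')
        return 0;
    s++;
    end = strchr(s, '>');
    if (end == NULL)
        return 0;
    len = (size_t)(end - s);
    /* A header name must be non-empty and stay on one line. */
    if (len == 0 || memchr(s, '\n', len) != NULL || len + 1 > cap)
        return 0;
    memcpy(out, s, len);
    out[len] = '\0';
    return 1;
}
```

In this sketch the caller decides which function to use based on the
directive it has just recognized, which is exactly the context
sensitivity at issue: the same characters are tokenized differently
depending on what came before them on the line.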
zw