Re: lexing backwards

"Stan Zaborowski" <stan@zaborowski.org>
13 Apr 2003 12:18:16 -0400

          From comp.compilers


From: "Stan Zaborowski" <stan@zaborowski.org>
Newsgroups: comp.compilers
Date: 13 Apr 2003 12:18:16 -0400
Organization: Posted via Supernews, http://www.supernews.com
References: 03-04-015 03-04-026
Keywords: lex
Posted-Date: 13 Apr 2003 12:18:16 EDT

"Chris F Clark" <cfc@world.std.com> wrote in message
> There is no trick to lexing backwards. Lexing locally, however, is in
> general impossible.


Chris, I was in total agreement with you until I started thinking
about comments, in particular comments in languages that have no
closing comment delimiter and use end-of-line to close the comment
instead. Examples are "//" in C++ and "#" in Perl.


So attempting to lex backwards would create tokens that must be
thrown away once you realize that you are in a comment. This doesn't
seem too bad for C++, but consider what happens with Perl. In C++ a
quoted string may not contain an end-of-line. (Note that it can
contain a new-line escape "\n", which is different from an
end-of-line.) There is no such restriction in Perl, and the comment
delimiter can be put inside a quoted string.
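
To illustrate why the C++ case seems manageable, here is a rough
sketch in Python (not C++ tooling, just an illustration). It assumes a
simplified C++-like line: no "/* */" block comments, no backslash line
continuations, and no raw string literals. The function name
line_comment_start is made up for this example. Under those
assumptions a backward lexer only has to back up to the previous
end-of-line and scan that one line forward to find out where the
comment begins.

```python
from typing import Optional


def line_comment_start(line: str) -> Optional[int]:
    """Return the index where a // comment starts on this line, or None.

    Assumption: the line stands alone, i.e. no string or block comment
    is still open from an earlier line.
    """
    quote = None  # the quote character of the literal we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the escaped character, e.g. \" or \\
                continue
            if ch == quote:
                quote = None  # closing quote of the literal
        elif ch in "\"'":
            quote = ch  # opening quote of a string or character literal
        elif line.startswith("//", i):
            return i  # a // outside any literal starts the comment
        i += 1
    return None


if __name__ == "__main__":
    for text in ('puts("a // b"); // real comment',
                 'x = "no comment // here";'):
        print(repr(text), "->", line_comment_start(text))
```

The point of the sketch is only that the decision is local to one
line; nothing before the previous end-of-line has to be examined.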


In Perl, then, you may have to scan all the way back to the beginning
of the program before you can decide whether a single quote mark is
part of a comment or is a string delimiter.
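
A small sketch of the problem, again in Python and for a toy Perl-like
language rather than real Perl: strings are delimited by ' or ", may
span lines, and use backslash escapes; "#" outside a string starts a
comment that runs to end-of-line. Real Perl has many more cases (q//,
here-documents, regexes, "$#array") that are ignored here. The names
classify, SUFFIX, PROGRAM_A and PROGRAM_B are invented for the
example. The same final line is a comment in one program and partly
string, partly code in the other, and the only way the function can
tell is by scanning forward from the start.

```python
def classify(source: str, offset: int) -> str:
    """Return 'code', 'string' or 'comment' for the character at offset.

    The scan always starts at the beginning of source, because the
    answer depends on every quote and comment seen before offset.
    """
    state = "code"
    quote = ""
    i = 0
    while i < len(source):
        ch = source[i]
        step = 1
        if state == "code":
            if ch in "'\"":
                state, quote, kind = "string", ch, "string"  # opening quote
            elif ch == "#":
                state, kind = "comment", "comment"
            else:
                kind = "code"
        elif state == "string":
            kind = "string"
            if ch == "\\" and i + 1 < len(source):
                step = 2  # backslash plus the escaped character
            elif ch == quote:
                state = "code"  # closing quote still counts as string
        else:  # inside a comment
            kind = "comment"
            if ch == "\n":
                state = "code"  # end-of-line closes the comment
        if i <= offset < i + step:
            return kind
        i += step
    raise IndexError("offset is outside the source text")


# The same last line, preceded by two different first lines.
SUFFIX = "# isn't this a comment?\n"
PROGRAM_A = "my $x = 1;\n" + SUFFIX
PROGRAM_B = "print OUT '\n" + SUFFIX  # opens a string that spans the line break

if __name__ == "__main__":
    for name, program in (("A", PROGRAM_A), ("B", PROGRAM_B)):
        base = len(program) - len(SUFFIX)
        hash_kind = classify(program, base)
        quote_kind = classify(program, base + SUFFIX.index("'"))
        print(name, "'#' is", hash_kind, "and the quote mark is", quote_kind)
```

In program A the quote mark in "isn't" is just comment text. In
program B it closes the string opened on the previous line, so
everything after it on that line is code again.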


This might sound unlikely, but Perl programmers often write Perl
programs that generate other Perl programs. So much of what looks like
a sequence of Perl tokens is actually just part of a string, and
people do put comments for the generated program inside that string.

