Re: Problems lexing out tokens

"VBDis" <>
18 Oct 2002 23:09:07 -0400

          From comp.compilers

Related articles
Problems lexing out tokens (JMB) (2002-10-13)
Re: Problems lexing out tokens (VBDis) (2002-10-18)

From: "VBDis" <>
Newsgroups: comp.compilers
Date: 18 Oct 2002 23:09:07 -0400
Organization: AOL Bertelsmann Online GmbH & Co. KG
References: 02-10-023
Keywords: lex
Posted-Date: 18 Oct 2002 23:09:06 EDT

"JMB" <> writes:

>Secondly, string literals, which I consider the job of
>the Scanner to recognize the entire string, can span multiple lines,
>without violating the language rules.

AFAIK, in Pascal string literals cannot span lines; only comments can.

>Anyway, what is possibly an elegant solution to this? I thought about
>having a boolean, isComplete, to let Main know if the Scanner was done
>getting the remainder of the token, and to allow Scanner to pick up
>where it left off. That's fine -- except for when I encounter
>whitespace again.

Give the scanner several states, which it checks whenever the next
token is requested. Since in Pascal only comments can span lines, a
single boolean variable inComment may be sufficient. Other languages
may require more states. The scanner can also deliver corresponding
tokens, so that in your case the parser can recognize a comment
continued on a new line, if that is ever required.
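
A minimal sketch of the inComment idea, in Python rather than Pascal
for brevity. It assumes the scanner is fed one line at a time and that
comments use { ... }. The class name LineScanner, the method scan_line
and the token kinds are hypothetical names chosen for illustration,
not taken from the original code.

```python
class LineScanner:
    """Scans one line at a time and remembers whether a comment is open."""

    def __init__(self):
        # True while a { ... } comment started on an earlier line is open.
        self.in_comment = False

    def scan_line(self, line):
        """Return a list of (kind, text) tokens for one source line."""
        tokens = []
        pos = 0
        while pos < len(line):
            if self.in_comment:
                end = line.find("}", pos)
                if end < 0:
                    # Still inside the comment; keep the state for the next line.
                    tokens.append(("COMMENT_PART", line[pos:]))
                    return tokens
                tokens.append(("COMMENT_END", line[pos:end + 1]))
                self.in_comment = False
                pos = end + 1
            elif line[pos].isspace():
                pos += 1
            elif line[pos] == "{":
                end = line.find("}", pos)
                if end < 0:
                    # Comment opens here and continues on a later line.
                    tokens.append(("COMMENT_START", line[pos:]))
                    self.in_comment = True
                    return tokens
                tokens.append(("COMMENT", line[pos:end + 1]))
                pos = end + 1
            else:
                # Anything else: a run of characters up to whitespace or "{".
                start = pos
                while (pos < len(line) and not line[pos].isspace()
                       and line[pos] != "{"):
                    pos += 1
                tokens.append(("WORD", line[start:pos]))
        return tokens
```

Because the state lives in the scanner object, the caller can keep
handing it lines without knowing whether a comment is open; whitespace
inside an open comment is simply part of the comment text.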

My solution for long tokens is to keep the source text in memory and
have each token refer to its span of that text. This allows tokens of
any size, no matter how long a string or comment is. The scanner stops
only after it has scanned a complete token. Your output generator can
then split lines or tokens as required.
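
The approach above could look roughly like this, again as a Python
sketch. It assumes the whole source is held in one string and that
tokens record offsets into it. Token, scan and the token kinds are
hypothetical names. Note that this sketch does not enforce the rule
that a string literal must end on the same line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str
    start: int  # offset of the first character in the source
    end: int    # offset one past the last character

    def text(self, source):
        """Look up the token text in the retained source."""
        return source[self.start:self.end]


def scan(source):
    """Yield tokens that refer to `source` by offset instead of copying text."""
    pos = 0
    n = len(source)
    while pos < n:
        ch = source[pos]
        if ch.isspace():
            pos += 1
            continue
        start = pos
        if ch == "{":
            # A comment is one token, however many lines it covers.
            end = source.find("}", pos)
            pos = n if end < 0 else end + 1  # unterminated: runs to end of text
            kind = "COMMENT"
        elif ch == "'":
            # Quoted string; a doubled quote ('') stands for one quote inside it.
            pos += 1
            while pos < n:
                if source[pos] == "'":
                    if pos + 1 < n and source[pos + 1] == "'":
                        pos += 2
                        continue
                    pos += 1
                    break
                pos += 1
            kind = "STRING"
        else:
            while (pos < n and not source[pos].isspace()
                   and source[pos] not in "{'"):
                pos += 1
            kind = "WORD"
        yield Token(kind, start, pos)
```

An output stage can call text() on each token and wrap or split the
result however it likes, since the scanner itself never has to buffer
or truncate a long token.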

BTW, I wonder what's the use for HTML files in a DOS environment? Why
don't you switch your development tools, together with the

