Related articles |
---|
Problems lexing out tokens dos_programmer@yahoo.com (JMB) (2002-10-13) |
Re: Problems lexing out tokens vbdis@aol.com (VBDis) (2002-10-18) |
From: | "VBDis" <vbdis@aol.com> |
Newsgroups: | comp.compilers |
Date: | 18 Oct 2002 23:09:07 -0400 |
Organization: | AOL Bertelsmann Online GmbH & Co. KG http://www.germany.aol.com |
References: | 02-10-023 |
Keywords: | lex |
Posted-Date: | 18 Oct 2002 23:09:06 EDT |
"JMB" <dos_programmer@yahoo.com> writes:
>Secondly, string literals, which I consider the job of
>the Scanner to recognize the entire string, can span multiple lines,
>without violating the language rules.
AFAIK, in Pascal string literals cannot span lines; only comments can.
>Anyway, what is possibly an elegant solution to this? I thought about
>having a boolean, isComplete, to let Main know if the Scanner was done
>getting the remainder of the token, and to allow Scanner to pick up
>where it left off. That's fine -- except for when I encounter
>whitespace again.
Let the scanner keep a state, which it checks whenever the next
token is requested. Since in Pascal only comments can span lines, a
boolean variable inComment may be sufficient. In other languages more
states may be required. The scanner can also deliver corresponding
tokens, so that in your case the parser can recognize comments
continued on a new line, if that is ever required.
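A minimal sketch of that idea, in Python rather than Pascal, and only
as an illustration. The names LineScanner, scan_line, and the token
kinds are made up for this example, and only { } comments are handled.
The flag in_comment survives between calls, so a comment opened on one
line is picked up again on the next:

```python
import re

# Anything that is neither whitespace nor the start of a comment.
WORD = re.compile(r"[^\s{]+")


class LineScanner:
    """Hypothetical line-at-a-time scanner with one piece of state."""

    def __init__(self):
        self.in_comment = False  # carried over from line to line

    def scan_line(self, line):
        tokens = []
        pos = 0
        while pos < len(line):
            if self.in_comment:
                end = line.find("}", pos)
                if end == -1:
                    # Comment is not closed on this line: emit the piece
                    # and stay in the comment state for the next call.
                    tokens.append(("COMMENT_OPEN", line[pos:]))
                    pos = len(line)
                else:
                    tokens.append(("COMMENT", line[pos:end + 1]))
                    pos = end + 1
                    self.in_comment = False
            elif line[pos].isspace():
                pos += 1
            elif line[pos] == "{":
                # Switch state; the comment branch consumes the text.
                self.in_comment = True
            else:
                match = WORD.match(line, pos)
                tokens.append(("TEXT", match.group()))
                pos = match.end()
        return tokens


scanner = LineScanner()
first = scanner.scan_line("x := 1; { start of")
second = scanner.scan_line("a comment } y := 2")
```

The parser sees a COMMENT_OPEN token when a comment runs off the end
of a line, and a COMMENT token for the piece that closes it, so no
separate isComplete flag has to be passed back to the main program.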
My solution for long tokens is to retain the source code in memory,
and have references to that text in the tokens. This approach allows
for tokens of any size, regardless of the string length. The scanner
only stops after parsing a full token. Your output creator then can
split lines or tokens as required.
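Roughly like this, again as a Python sketch and not my actual code.
Token, tokenize, and text_of are names invented for the example. A
token holds only two offsets into the source text, which stays in
memory as one string, so a token can be as long as it likes and can
contain line breaks:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str
    start: int  # offset of the first character in the source
    end: int    # offset one past the last character


def tokenize(source):
    """Split the whole source into tokens; each token is scanned in full."""
    tokens = []
    pos = 0
    n = len(source)
    while pos < n:
        ch = source[pos]
        if ch.isspace():
            pos += 1
            continue
        start = pos
        if ch == "{":
            # Comment: runs to the closing brace, across line breaks.
            close = source.find("}", pos)
            pos = n if close == -1 else close + 1
            kind = "COMMENT"
        elif ch == "'":
            # String literal; a doubled quote stands for one quote.
            pos += 1
            while pos < n:
                if source[pos] == "'":
                    if pos + 1 < n and source[pos + 1] == "'":
                        pos += 2
                        continue
                    pos += 1
                    break
                pos += 1
            kind = "STRING"
        else:
            # Everything else, up to whitespace or the next special char.
            while pos < n and not source[pos].isspace() and source[pos] not in "{'":
                pos += 1
            kind = "TEXT"
        tokens.append(Token(kind, start, pos))
    return tokens


def text_of(source, token):
    """Recover the token text from the retained source."""
    return source[token.start:token.end]


source = "write('it''s') { spans\ntwo lines } end"
pieces = [text_of(source, t) for t in tokenize(source)]
```

Since the tokens carry no text of their own, the output stage can
look up each piece with text_of and decide for itself where to break
lines.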
BTW, I wonder what's the use for HTML files in a DOS environment? Why
don't you switch your development tools, together with the
environment?
DoDi