Re: flex scanner too huge, suggestions? (RKRayhawk)
23 Feb 2001 00:03:03 -0500

          From comp.compilers


From: (RKRayhawk)
Newsgroups: comp.compilers
Date: 23 Feb 2001 00:03:03 -0500
Organization: AOL
References: 01-02-097
Keywords: lex
Posted-Date: 23 Feb 2001 00:03:03 EST

Troy Cauble (17 Feb 2001 01:35:30 -0500) posts some stats on a build
of a scanner:


Made with LFLAGS = -Cfe, size gives
42068(.text) + 24(.data) + 60(.bss) + 1687886(.rodata) = 1730038

Made with LFLAGS = -Cem, size gives
42496(.text) + 24(.data) + 60(.bss) + 508326(.rodata) = 550906


If you have a language syntax requirement that is huge, then it is
huge. You seem to suggest this with the comments about trying to
master the 'context sensitivity' with start states. Yet it seems like
the problem _might_ be elsewhere.

The question would be, "Exactly what is introducing the large
.rodata?" Here you may be looking at the libraries pulled into the
executable at link time. You may wish to mention your OS and its
version, and indicate any special libraries you are bringing in.

You are using the word 'protocol' in your brief posts. What are you
telling us here? Are you attaching any special software to the
front-end scanner? For example, are you bringing in anything to get
sockets, or anything to get GUI services?

The problem could be in the one aspect that you identify most clearly,
and I do not mean to ignore it by the preceding. The case
insensitivity _might_ bloat scanner tables; more (flex) experienced
lurkers are encouraged to comment on that. If so, then you might need
to convert the input manually to a single case (upper or lower) and
then compile simpler regular expressions (that is, back off from
case-insensitive matching); but from what you have indicated that
will be a large amount of work for you.
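
To make the single-case idea concrete, here is a minimal sketch in C.
The helper name fold_to_lower is made up for illustration and is not
part of flex. One place it could be called is a redefined YY_INPUT
macro, right after the raw bytes are read, so that the rules only ever
see lower case. Note the assumption: it folds everything, so it is
only suitable if nothing in the input (quoted strings, for example)
needs its original case preserved.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Fold the first n bytes of buf to lower case, in place.
 * Hypothetical helper for illustration; not a flex API. */
void fold_to_lower(char *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* tolower() expects a value representable as unsigned char */
        buf[i] = (char)tolower((unsigned char)buf[i]);
    }
}
```

With the input folded this way, the patterns can be written in lower
case only and the scanner built without case-insensitive matching.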

Let me request that you post an indication as to whether you have
large amounts of error processing in your program. My interest here is
to see if the bloat is the text of the error messages. (All
speculation, so ignore this if it's a distraction.) But if you have
hard coded your error messages internal to the scanner (this applies
conceptually to parsers too), then the linked result will look real
big. ((Engineers do this a lot. Start with a few simple messages ...
later, as the program/system matures, most of the executable image is
the text of diagnostic messages.)) If this is relevant, the solution
is to externalize the text of the diagnostics to a file and call up
the pieces you need, message by message, at execution time with
indexes or hashes (error numbers).
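
A minimal sketch of the externalized-message idea, in C. Everything
here is an assumption made for illustration: the function name
fetch_message, the catalog layout (one "<number> <text>" line per
message), and the fallback wording. It simply rescans the file on each
lookup, which is slow but keeps the message text out of the
executable; a real version might index or hash the file instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Find the line "<code> <text>" in the catalog and copy <text> to out.
 * Returns 1 if found, 0 otherwise (out then holds a fallback string).
 * Hypothetical helper; the catalog format is assumed, not standard. */
int fetch_message(FILE *catalog, long code, char *out, size_t outsz)
{
    char line[512];   /* arbitrary line limit for this sketch */

    assert(catalog != NULL && out != NULL && outsz > 0);
    rewind(catalog);  /* scan from the top on every lookup */
    while (fgets(line, sizeof line, catalog) != NULL) {
        char *text;
        long n = strtol(line, &text, 10);
        if (text == line || n != code)
            continue;                       /* no number, or not ours */
        while (*text == ' ' || *text == '\t')
            text++;
        text[strcspn(text, "\r\n")] = '\0'; /* drop the line terminator */
        snprintf(out, outsz, "%s", text);
        return 1;
    }
    snprintf(out, outsz, "error %ld (no message text found)", code);
    return 0;
}
```

The scanner would then carry only the error numbers, and the text
would be fetched at the moment a diagnostic has to be printed.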

And one more, at the risk of hurting your feelings by possibly being
too simplistic; I am just trying to pull every rabbit I can find out
of the hat. If you have your compiler in debug mode, the executables
can be very large, since they hold symbol tables and all manner of
code point identification for step-wise execution.

Hope that some of that is useful,

Robert Rayhawk
