|Speeding up LEX scanning times email@example.com (1995-02-02)|
|Re: Speeding up LEX scanning times firstname.lastname@example.org (1995-02-02)|
|Re: Speeding up LEX scanning times c1veeru@WATSON.IBM.COM (Virendra K. Mehta) (1995-02-02)|
|Re: Speeding up LEX scanning times email@example.com (Stefan Monnier) (1995-02-03)|
|Re: Speeding up LEX scanning times firstname.lastname@example.org (1995-02-03)|
|Re: Speeding up LEX scanning times email@example.com (1995-02-04)|
|Re: Speeding up LEX scanning times firstname.lastname@example.org (1995-02-07)|
|From:||email@example.com (Bob Mercier)|
|Keywords:||lex, Cobol, performance|
|Organization:||Cinenet Communications,Internet Access,Los Angeles;310-301-4500|
|Date:||Fri, 3 Feb 1995 18:55:02 GMT|
Pieter Hintjens (firstname.lastname@example.org) wrote:
: I'm writing a Cobol parser, using MKS Lex and Yacc. So far so good.
: However, on seriously large programs, it is quite slow. When I profiled
: the code, I noticed that about 80% of the time was in the Lex scanner.
: Now, I found that the standard C functions for file access (fread) are
: a lot slower than the non-standard read functions, so I shaved off
: some time by using these if the compiler supports them.
: However, I still find that the scanner is slow. I don't think I made
: any mistakes; for instance all keywords are identified by looking-up
: a table, rather than as individual scanner tokens.
Are you saying that you use lex to collect an ID and then test
it against some keyword table? If so, this is going to be slower
than just giving lex the list of keywords:
"if" return tIF;
"else" return tELSE;
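For comparison, the table approach I am guessing at would look roughly like the sketch below. This is only my assumption about what such a scanner does, not Pieter's actual code: the names `keywords`, `cmp_keyword` and `classify_identifier`, and the token values, are made up for illustration. The point is that every identifier pays for a second search after lex has already walked it once.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical token codes; tIF and tELSE echo the lex rules above. */
enum { tID = 256, tELSE, tIF };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() requires a sorted array. */
static const struct keyword keywords[] = {
    { "else", tELSE },
    { "if",   tIF   },
};

/* bsearch() comparison: the key is a plain string, the element a struct. */
static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Would be called from the identifier rule's action, e.g. with yytext. */
int classify_identifier(const char *text)
{
    const struct keyword *k = bsearch(text, keywords,
                                      sizeof keywords / sizeof keywords[0],
                                      sizeof keywords[0], cmp_keyword);
    return k ? k->token : tID;
}
```

With the keywords listed as lex rules instead, the DFA has already decided between keyword and identifier by the time the match ends, so no second pass over the text is needed.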
: So my question is: should I consider writing the scanner by hand,
: now that I have a working prototype? If so, are there any techniques
: I should be aware of?
There is a great tool called 're2c'; you should be able to find it
via archie. It's slightly harder to use than lex but about twice
as fast as even flex in its best speed mode. Like lex, it builds
a DFA out of the patterns it tries to match. lex spits out tables
describing the DFA and emits code to walk the tables as it scans
its input; re2c emits C code that directly implements the finite state
machine described by the DFA. It also allows a little more flexibility
in hooking into your I/O system, especially if you can mmap files.
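To give a feel for what "directly coded" means, here is a small hand-written sketch in that style. It is not actual re2c output, and the token names, the `scan` function and its helpers are assumptions chosen for illustration. Each DFA state becomes a label and each transition a comparison plus a goto, so there are no tables to index. It works on an in-memory, NUL-terminated buffer, which is roughly how you would drive it over an mmap'ed file (with your own end-of-buffer handling).

```c
#include <stddef.h>

/* Hypothetical token codes for this sketch. */
enum token { tEOF, tIF, tELSE, tID, tERROR };

static int is_id_start(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static int is_id_char(unsigned char c)
{
    return is_id_start(c) || (c >= '0' && c <= '9');
}

/* Scan one token from the NUL-terminated buffer at *pp and advance *pp
 * past it. Each label below is one DFA state. */
enum token scan(const char **pp)
{
    const char *p = *pp;

    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;                                /* skip whitespace */

    switch (*p) {
    case '\0': *pp = p; return tEOF;
    case 'i':  p++; goto s_i;
    case 'e':  p++; goto s_e;
    default:
        if (is_id_start((unsigned char)*p)) { p++; goto s_id; }
        *pp = p + 1;                        /* skip the offending character */
        return tERROR;
    }

s_i:    if (*p == 'f') { p++; goto s_if; }
        goto s_id;
s_if:   if (is_id_char((unsigned char)*p)) goto s_id;   /* e.g. "iffy" */
        *pp = p; return tIF;

s_e:    if (*p == 'l') { p++; goto s_el; }
        goto s_id;
s_el:   if (*p == 's') { p++; goto s_els; }
        goto s_id;
s_els:  if (*p == 'e') { p++; goto s_else; }
        goto s_id;
s_else: if (is_id_char((unsigned char)*p)) goto s_id;   /* e.g. "elsewhere" */
        *pp = p; return tELSE;

s_id:   while (is_id_char((unsigned char)*p))
            p++;
        *pp = p; return tID;
}
```

A caller just keeps a `const char *` cursor into the buffer and calls `scan(&cursor)` until it returns `tEOF`. A real Cobol scanner would also need case-insensitive matching and many more states, which is exactly the tedium a generator like re2c takes off your hands.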