Re: Why are LR parsers faster than using if conditions

"Paul B Mann" <paulbmann@yahoo.com>
28 Jul 2004 12:07:19 -0400

          From comp.compilers

Related articles
Re: Why are LR parsers faster than using if conditions haberg@matematik.su.se (Hans Aberg) (2004-06-28)
Re: Why are LR parsers faster than using if conditions clint@0lsen.net (Clint Olsen) (2004-06-28)
Re: Why are LR parsers faster than using if conditions tmoog@panix.com (2004-06-30)
Re: Why are LR parsers faster than using if conditions haberg@matematik.su.se (Hans Aberg) (2004-06-30)
Re: Why are LR parsers faster than using if conditions christian.bau@cbau.freeserve.co.uk (Christian Bau) (2004-07-11)
Re: Why are LR parsers faster than using if conditions try_vanevery_at_mycompanyname@yahoo.com (Brandon J. Van Every) (2004-07-13)
Re: Why are LR parsers faster than using if conditions paulbmann@yahoo.com (Paul B Mann) (2004-07-28)
From: "Paul B Mann" <paulbmann@yahoo.com>
Newsgroups: comp.compilers
Date: 28 Jul 2004 12:07:19 -0400
Organization: Compilers Central
References: 04-06-012 04-06-034 04-06-072 04-06-098
Keywords: parse, LALR, performance, comment
Posted-Date: 28 Jul 2004 12:07:19 EDT

> I recall somebody said that typical compilers spend most time
> elsewhere than in the parser, e.g., the actions.


Yes, for compilers, parsing is about 5% or less, depending on
how much optimization is done.


But for searching text, parsing can be about 35-40% of the time (an
educated guess on my part).


> [The numbers I've seen say that the lexer is usually the slowest part
> of a compiler. -John]


Yes, a lexer is slower than a good LALR parser. But the lexer can be
made faster with some easy coding tricks.
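
For example, one such easy trick (a rough illustrative sketch, not
something spelled out above) is to classify characters through a
256-entry lookup table, so the lexer's inner loop does a single array
lookup per character instead of a cascade of if tests or ctype calls:

    /* Illustrative sketch: a 256-entry character-class table.
       The inner loop branches on one array lookup per character. */
    #include <string.h>

    enum char_class { C_OTHER, C_SPACE, C_DIGIT, C_ALPHA };

    static unsigned char class_of[256];

    static void init_classes(void)
    {
        memset(class_of, C_OTHER, sizeof class_of);
        class_of[' '] = class_of['\t'] = class_of['\n'] = C_SPACE;
        for (int c = '0'; c <= '9'; c++) class_of[c] = C_DIGIT;
        for (int c = 'a'; c <= 'z'; c++) class_of[c] = C_ALPHA;
        for (int c = 'A'; c <= 'Z'; c++) class_of[c] = C_ALPHA;
        class_of['_'] = C_ALPHA;
    }

    /* Scan past an identifier: one table lookup per character. */
    static const char *scan_ident(const char *p)
    {
        while (class_of[(unsigned char)*p] == C_ALPHA ||
               class_of[(unsigned char)*p] == C_DIGIT)
            p++;
        return p;
    }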


In the compiler front ends built with a parser generator (PG) that I
wrote many years ago, the percentages were something like this:


Lexing time:              20%
Parsing time:             20%
Symbol-table lookup time: 15%
AST creation time:        15%
Other (I/O, etc.):        30%


This was without any optimization of the AST and without code
generation.


The early textbooks on compiler construction had the lexer call a
function, getc(), for each character in the input stream, which was a
killer for performance. You can avoid this by loading all of the input
source file (or large chunks of it) into memory and incrementing a
pointer to move from one character to the next.
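
A rough sketch of that idea (illustrative only; the helper names are
made up, not from any particular compiler):

    /* Read the whole source file into one buffer, then let the lexer
       advance a pointer instead of calling getc() per character. */
    #include <stdio.h>
    #include <stdlib.h>

    char *load_source(const char *path, long *len_out)
    {
        FILE *f = fopen(path, "rb");
        if (!f) return NULL;

        fseek(f, 0, SEEK_END);
        long len = ftell(f);
        rewind(f);

        char *buf = malloc(len + 1);
        if (buf && fread(buf, 1, len, f) == (size_t)len) {
            buf[len] = '\0';      /* sentinel byte marks end of input */
            *len_out = len;
        } else {
            free(buf);
            buf = NULL;
        }
        fclose(f);
        return buf;
    }

    /* The scanner then just walks the buffer: */
    long count_lines(const char *p)
    {
        long lines = 0;
        for (; *p; p++)           /* one pointer increment per character */
            if (*p == '\n')
                lines++;
        return lines;
    }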


Paul B Mann
paulbmann@yahoo.com
[These days getc() is invariably a macro expanded in line, but I
agree that slurping up a bunch of text into a local buffer like
flex does is likely to be faster. -John]

