Related articles |
---|
[16 earlier articles] |
Re: is lex useful? 72510.2757@CompuServe.COM (Stephen Lindholm) (1996-06-27) |
Re: is lex useful? kanze@lts.sel.alcatel.de (1996-06-27) |
Re: is lex useful? bart@time.cirl.uoregon.edu (1996-06-30) |
Re: is lex useful? Robert.Corbett@Eng.Sun.COM (1996-06-30) |
Re: is lex useful? leichter@smarts.com (1996-06-30) |
Re: is lex useful? trd@murlibobo.cs.mu.OZ.AU (1996-06-30) |
Re: is lex useful? WStreett@shell.monmouth.com (1996-06-30) |
Re: is lex useful? dmr@bell-labs.com (1996-06-30) |
Re: is lex useful? clark@quarry.zk3.dec.com (1996-07-01) |
Re: is lex useful? bromage@cs.mu.OZ.AU (1996-07-02) |
Re: is lex useful? kanze@lts.sel.alcatel.de (1996-07-02) |
Re: is lex useful? colas@aye.inria.fr (1996-07-04) |
Re: is lex useful? trd@lister.cs.mu.OZ.AU (1996-07-05) |
From: | WStreett@shell.monmouth.com (Wilbur Streett) |
Newsgroups: | comp.compilers |
Date: | 30 Jun 1996 16:48:10 -0400 |
Organization: | Monmouth Internet Corporation |
References: | 96-06-073 96-06-094 96-06-116 96-06-125 |
Keywords: | lex, performance |
Scott Stanchfield <scooter@mccabe.com> wrote:
>I'd be interested in how the flex scanner was written here.
>You say you optimized the hand scanner for speed. Did you do the
>same for the flex lexer? I mean things like how you handled keywords
>(flex RE for each, or hash-lookup when something looks like an IDENT?)
I would imagine not.
>I'd tend to agree with your conclusion overall, but your numbers seem to
>show a 60% increase in throughput. Are the two really being compared on
>equal ground?
I wrote a Lex/YACC (FLEX/BISON) based MIB Compiler. (my first compiler..)
The compiler parses a MIB into internal data structures and creates text
files with the data out of the Object Definitions for the sake of
persistant object storage. It also creates an HTML file for front end use
for WWW browsers. The compiler runs as a console application under Windows
95. On my 100 Mghz Pentium, 24 MB RAM, 1 GB hard disk.. (In other words..
nothing fancy about the hardware..) The first cut parsed the RMON MIB,
3,354 lines, in about 20 seconds..
After I figured out where the profiler in MSVC 4.0 was hiding, (it doesn't
install unless you COPY if off the CD) and I figured out that the code
generation problems that I was having were not from by FLEX/BISON port to
windows.. I read all of the internal code in FLEX and BISON.. removed all
suspect file I/O code.. and then discovered a major bug in MSVC 4.0 IDE
file buffering for custom compiles.. (lost a day on that one..). Anyway,
once I got the profiler running and figured out that waiting after the
total rebuild in MSVC would fail, and fail, and fail, unless I waited about
10 seconds and then hit rebuild... I spent some time optimizing the MIB
Compiler. Mostly for fun, since the client had already accepted the
compiler.
Please note that for the sake of being able to run the MS Profiler, the
compiler must be set for full debug information and no-optimizations.
The results of the first profile run are below..
****************************************************
Profile: Function timing, sorted by time
Date: Fri Jan 12 09:20:56 1996
Program Statistics
------------------
Command line at 1996 Jan 12 09:20: mc rfc1271.mib
Total time: 20277.796 millisecond
Time outside of functions: 34.984 millisecond
Call depth: 9
Total functions: 38
Total hits: 53222
Function coverage: 71.1%
Overhead Calculated 15
Overhead Average 15
Module Statistics for mc.exe
----------------------------
Time in module: 20242.812 millisecond
Percent of time in module: 100.0%
Functions in module: 38
Hits in module: 53222
Module function coverage: 71.1%
Func Func+Child Hit
Time % Time % Count Function
---------------------------------------------------------
10698.332 52.9 11409.961 56.4 213 _add_MIB_Object (subr.obj)
2369.121 11.7 2421.368 12.0 203 _process_text (subr.obj)
1727.863 8.5 1902.058 9.4 4428 _yylex (mc3l.obj)
1341.668 6.6 1341.668 6.6 45583 _cPeriods (subr.obj)
1320.350 6.5 20094.819 99.3 1 _yyparse (mc3_tab.obj)
775.032 3.8 775.032 3.8 1 _output_HTML (subr.obj)
427.847 2.1 1769.500 8.7 213 _Print_Sub_OIDs (subr.obj)
426.948 2.1 584.406 2.9 213 _print_OTM (subr.obj)
363.081 1.8 2784.449 13.8 1 _output_OTM (subr.obj)
172.846 0.9 172.868 0.9 19 _yy_get_next_buffer
(mc3l.obj)
157.457 0.8 157.457 0.8 203 _print_ASN (subr.obj)
120.101 0.6 120.101 0.6 1461 _cfree (subr.obj)
104.739 0.5 106.043 0.5 1 _init_MIB_Compiler
(subr.obj)
60.449 0.3 60.449 0.3 10 _add_ENUM_Object (subr.obj)
59.369 0.3 127.223 0.6 213 _OIDValue_free (subr.obj)
47.370 0.2 47.370 0.2 426 _print_oid_part (subr.obj)
41.951 0.2 20242.812 100.0 1 _main (subr.obj)
25.635 0.1 1795.150 8.9 1 _output_OID_Tree (subr.obj)
1.304 0.0 1.304 0.0 1 _initOIDStubTree (subr.obj)
1.146 0.0 1.146 0.0 2 _yy_flex_alloc (mc3l.obj)
0.153 0.0 0.153 0.0 18 _yy_get_previous_state
(mc3l.obj)
0.019 0.0 0.033 0.0 2 _yy_init_buffer (mc3l.obj)
0.010 0.0 0.010 0.0 3 _yy_load_buffer_state
(mc3l.obj)
0.008 0.0 1.168 0.0 1 _yy_create_buffer (mc3l.obj)
0.008 0.0 0.013 0.0 2 _yy_flush_buffer (mc3l.obj)
0.003 0.0 0.023 0.0 1 _yyrestart (mc3l.obj)
0.002 0.0 0.002 0.0 1 _yywrap (subr.obj)
********************************************
So the routines that were taking up the most time were
10698.332 52.9 11409.961 56.4 213 _add_MIB_Object (subr.obj)
2369.121 11.7 2421.368 12.0 203 _process_text (subr.obj)
1727.863 8.5 1902.058 9.4 4428 _yylex (mc3l.obj)
1341.668 6.6 1341.668 6.6 45583 _cPeriods (subr.obj)
1320.350 6.5 20094.819 99.3 1 _yyparse (mc3_tab.obj)
I ran a line by line profile overnight, and discovered that some inner
loops were executing more than 1 million times.. remove those quick!
Anyway..
add_MIB_Object() had a lot of debug code in it, (as did the entire
compiler.. wanted to see the state engine work..) which I wrote into the
compiler with a runtime flag. I was doing sprintf()'s and then fprintf()
ing to the to the output handle. NULL if debug flag had not been set.. but
still doing the printf overhead.. Fine, wrap all of the debug code with
if (debug) { }
process_text() did a lot byte by byte conversion, looking for non-white
characters, left alignment, removing tabs.. and other stuff.. remove all of
the internal loops, remove the sprintf()'s, wrap the debug code.. general
clean up.
yylex() switch to the fast buffered file mode, large tables, etc (set 3
flags..) In short, read the documentation... Don't forget to wrap the
trace code with all of it's printf() sprintf() debug code.. with if (debug)
{ }
cPeriods() Counts the periods in the OID's.. Set up the data structure so
that it counts the periods of the OID's once, and stores the result, rather
than everytime the OID's are compared.
yyparse().. set a couple of BISON flags..
Generally wrap as many printf and sprintf etc, calls inside if (debug) {
}.. as possible.. NEVER AGAIN!
Other small optimizations...
After that it runs in 2.9 seconds..
The profiler results after some optimizations ..
*********************************************
Profile: Function timing, sorted by time
Date: Sat Jan 13 12:26:19 1996
Program Statistics
------------------
Command line at 1996 Jan 13 12:26: mc rfc1271.mib
Total time: 2865.465 millisecond
Time outside of functions: 18.816 millisecond
Call depth: 9
Total functions: 38
Total hits: 7674
Function coverage: 65.8%
Overhead Calculated 13
Overhead Average 13
Module Statistics for mc.exe
----------------------------
Time in module: 2846.649 millisecond
Percent of time in module: 100.0%
Functions in module: 38
Hits in module: 7674
Module function coverage: 65.8%
Func Func+Child Hit
Time % Time % Count Function
---------------------------------------------------------
814.680 28.6 904.983 31.8 4428 _yylex (mc3l.obj)
545.892 19.2 545.892 19.2 1 _output_HTML (subr.obj)
337.486 11.9 564.963 19.8 1 _output_OTM (subr.obj)
218.968 7.7 327.203 11.5 213 _add_MIB_Object (subr.obj)
189.463 6.7 227.477 8.0 203 _process_text (subr.obj)
175.296 6.2 2761.082 97.0 1 _yyparse (mc3_tab.obj)
139.625 4.9 141.736 5.0 213 _Print_Sub_OIDs (subr.obj)
87.461 3.1 87.485 3.1 21 _yy_get_next_buffer
(mc3l.obj)
75.251 2.6 75.251 2.6 1461 _cfree (subr.obj)
68.453 2.4 105.691 3.7 213 _OIDValue_free (subr.obj)
57.915 2.0 2846.649 100.0 1 _main (subr.obj)
48.405 1.7 48.405 1.7 10 _add_ENUM_Object (subr.obj)
39.418 1.4 39.418 1.4 426 _print_oid_part (subr.obj)
25.050 0.9 27.652 1.0 1 _init_MIB_Compiler
(subr.obj)
13.186 0.5 154.922 5.4 1 _output_OID_Tree (subr.obj)
5.634 0.2 5.634 0.2 447 _cPeriods (subr.obj)
1.624 0.1 2.602 0.1 1 _initOIDStubTree (subr.obj)
1.370 0.0 2.500 0.1 1 _yy_create_buffer (mc3l.obj)
1.111 0.0 1.111 0.0 2 _yy_flex_alloc (mc3l.obj)
0.311 0.0 0.311 0.0 20 _yy_get_previous_state
(mc3l.obj)
0.023 0.0 0.035 0.0 2 _yy_init_buffer (mc3l.obj)
0.009 0.0 0.012 0.0 2 _yy_flush_buffer (mc3l.obj)
0.008 0.0 0.008 0.0 3 _yy_load_buffer_state
(mc3l.obj)
0.005 0.0 0.023 0.0 1 _yyrestart (mc3l.obj)
0.003 0.0 0.003 0.0 1 _yywrap (subr.obj)
*********************************************************************
I didn't do any optimizations as far as changing the order of the LEX
parsing, setting up the regexp to find the larger tokens first, or other
other changes.. after all, I was just doing this for fun, the client
thought that 20 seconds was plenty fast.. After all, he had a machine with
lots of RAM running NT, 166 Mghz Pentium.
When full optimization of MSVC is set when the MIB Compiler is complied,
the resultant MIB Compiler compiles the MIB in 1.3 seconds, (Windows timer,
take with a grain of salt..) I would have pumped up the priority,
optimized the LEX definitions so that the larger tokens get handled first..
and a lot of other things.. but, why bother? ( Breaking my own arm while I
pat myself on the back.. 20 seconds to 1.3 seconds .. I'm a God! ) (
Well, OK, I did write some crap in the first place..! )
I didn't do much other than tell FLEX to use the fast tables, use file mode
rather than character mode, replace a few of the inner loops in the code
with better algorithms.. and put if's around the debug code..
Vern Paxson maintains FLEX for free.. (but I bought him lunch!) Of course,
flying out to Berkley was was a bit more than lunch..
The Lex file uses states, and is quite clear in it's design. My MIB
Compiler compiles the RMON MIB - RFC1271, which is 3,354 lines, in 1.3
seconds, in the process generating persistant object storage and HTML
files..? Who am I to complain?
I'd be pretty surprised if someone would be able to get a 60% increase by
not using LEX, after all, the current version of my compiler is only in the
lex routines, (which includes some C code to handle the parsed tokens..),
31.8% of the total processing time. I haven't fully optimized the LEX
definitions, or the C routines that are in called in the LEX code.
So .. is Lex useful? YES .. VERY! I'm currently working out the idea of
writing an HTML parser and adding Object Tags of my own design to create
templates for dynamic web page design. Without Lex I wouldn't be very
interested in attempting that sort of effort.
Now if I can just find a LEX file for HTML!
Wilbur
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.