2 Oct 2005 02:53:05 -0400

Related articles |
---|

[4 earlier articles] |

Re: table compression mickunas@cs.uiuc.edu (Dennis Mickunas) (2001-11-08) |

Table compression leonardo@dcc.ufmg.br (Leonardo Teixeira Passos) (2005-09-27) |

Re: Table compression anton@mips.complang.tuwien.ac.at (2005-09-30) |

Re: Table compression hannah@schlund.de (2005-09-30) |

Re: Table compression cleos@nb.sympatico.ca (Cleo Saulnier) (2005-09-30) |

Re: Table compression Peter_Flass@Yahoo.com (Peter Flass) (2005-10-02) |

Re: Table compression paul@parsetec.com (Paul Mann) (2005-10-02) |

Re: Table compression cfc@shell01.TheWorld.com (Chris F Clark) (2005-10-03) |

Re: Table compression henry@spsystems.net (2005-10-13) |

From: | "Paul Mann" <paul@parsetec.com> |

Newsgroups: | comp.compilers |

Date: | 2 Oct 2005 02:53:05 -0400 |

Organization: | Parsetec |

References: | 05-09-130 |

Keywords: | parse |

Posted-Date: | 02 Oct 2005 02:53:05 EDT |

*> I'm looking foward to developing a study on table compression techniques.*

*> When browsing the Web, I haven't found any recent articles (at least from*

*> the last decade). Does anyone have a clue as where as to begin?*

I found this to be very helpful, but not complete in implementation

details. I had to finish the job when putting it into practice.

"Optimization of Parser Tables for Portable Compilers",

Paper from TOPLAS Oct 1984, by Dencker, Durre and Heuft.

*> [Since computer memories have gotten so big, does anyone care about*

*> table compression any more? Now, a typical Windows program has a*

*> megabyte of unused libraries that aren't worth stripping out, so*

*> why waste time with the tables? -John]*

If you are writing a parser generator for all languages, including

experimental parsing, such as all XML dialects in one parser table,

you can end up with as many as 30,000 rules in a grammar.

The parser-table size should be computed with the formula:

S x (N+T) x 2 bytes/entry

Where S is the number of states,

and N is the number of nonterminals,

and T is the number of terminals.

For COBOL-85 the numbers would be :

1200 x (450+350) x 2 = 1,920,000 bytes = 1,875 K = 1.83 MB.

This is assuming the use of a matrix style array indexed

by the state number and symbol of the grammar to get

either the next state number (+number) or the reduction to

make (-number). This is for uncompressed parser tables.

This also assumes that no shift-reduce parsing actions are

used. Using these will eliminate about 30% of the states.

LRgen uses shift-reduce actions and has about 900 states. After

compressing the tables, they are 67,621 bytes = 66 K.

The parsers are nearly as fast as using uncompressed tables.

The difference in speed would be unnoticeable.

Some other parser generators may give smaller sizes

depending on which table compression technique is used.

Of course, a list style of parser tables would be much

smaller, but not likely smaller than the compressed matrix

tables. And the list format is slower and speed may

decreases with the increase in size of the grammar as the

number of actions per state increases.

I'm in favor of compressing the parser tables. Some users might

want to have one parser and many parser tables in an application.

One of LRgen's users is doing that with dialects of SQL which are

larger than COBOL.

Paul Mann

http://parsetec.com

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.