|Lex state table format email@example.com (Brad Fish) (1998-09-22)|
|Re: Lex state table format firstname.lastname@example.org (1998-09-26)|
|Re: Lex state table format email@example.com (Quinn Tyler Jackson) (1998-09-29)|
|Date:||26 Sep 1998 01:46:29 -0400|
|Organization:||Deja News - The Leader in Internet Discussion|
Brad Fish <firstname.lastname@example.org> wrote:
> Can anyone give me a quick rundown on the Lex state table format? I'm
> looking to create my own scanner generator. I understand how Lex translates
> its input into a series of states, but what's the best way to store those
Dear Brad, The simplest format for storing the Lex state table
(assuming that the Reg expr to NFA to DFA has been done) is in the
form of a two dimensional array of integers. The rows represent the
states generated in the DFA while the columns represent the character
set to be represented. Assume A is that table, then entry A[i, c]
gives the next state of the scanner if in the current state 'i' the
next character in the input is 'c'. But this may turn out to be a bit
memory inefficient. Thats because for many of the characters in the
character set, the values A[i, c] turn out to be same, and the size of
the character set supported is also large (typically, A-Z,a-z,0-9 and
some extra characters). This is also true for the rows in the table.
Instead you can try to run a table compression algorithm where in you
have an extra level of indirection but its very memory efficient. You
can either do row level compression, column level compression, or
both. You have the table as defined earlier but the width is not
equal to the complete character set. Instead you allocate a column of
the table for those columns that are different from others that have
been created as of yet. If a column generated is already present
then, map it to the already existing one in the index table. For row
level compression, you can use a partioning algorithm to eliminate
redundant states in the DFA generated.
I hope this helps.
Return to the
Search the comp.compilers archives again.