Re: Lex state table format

mravirala@my-dejanews.com
26 Sep 1998 01:46:29 -0400

          From comp.compilers

Related articles
Lex state table format bfish@netxs.net (Brad Fish) (1998-09-22)
Re: Lex state table format mravirala@my-dejanews.com (1998-09-26)
Re: Lex state table format qjackson@wave.home.com (Quinn Tyler Jackson) (1998-09-29)
| List of all articles for this month |
From: mravirala@my-dejanews.com
Newsgroups: comp.compilers
Date: 26 Sep 1998 01:46:29 -0400
Organization: Deja News - The Leader in Internet Discussion
References: 98-09-090
Keywords: lex

    Brad Fish <bfish@netxs.net> wrote:
> Can anyone give me a quick rundown on the Lex state table format? I'm
> looking to create my own scanner generator. I understand how Lex translates
> its input into a series of states, but what's the best way to store those
> states?


Dear Brad, The simplest format for storing the Lex state table
(assuming that the Reg expr to NFA to DFA has been done) is in the
form of a two dimensional array of integers. The rows represent the
states generated in the DFA while the columns represent the character
set to be represented. Assume A is that table, then entry A[i, c]
gives the next state of the scanner if in the current state 'i' the
next character in the input is 'c'. But this may turn out to be a bit
memory inefficient. Thats because for many of the characters in the
character set, the values A[i, c] turn out to be same, and the size of
the character set supported is also large (typically, A-Z,a-z,0-9 and
some extra characters). This is also true for the rows in the table.
Instead you can try to run a table compression algorithm where in you
have an extra level of indirection but its very memory efficient. You
can either do row level compression, column level compression, or
both. You have the table as defined earlier but the width is not
equal to the complete character set. Instead you allocate a column of
the table for those columns that are different from others that have
been created as of yet. If a column generated is already present
then, map it to the already existing one in the index table. For row
level compression, you can use a partioning algorithm to eliminate
redundant states in the DFA generated.


I hope this helps.
Murali.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.