Related articles |
---|
Learning only one lexer made me blind to its hidden assumptions costello@mitre.org (Roger L Costello) (2022-07-07) |
Re: Learning only one lexer made me blind to its hidden assumptions luser.droog@gmail.com (luser droog) (2022-07-12) |
Re: Learning only one lexer made me blind to its hidden assumptions jvilar@uji.es (Juan Miguel Vilar Torres) (2022-07-13) |
Re: Learning only one lexer made me blind to its hidden assumptions drikosev@gmail.com (Ev. Drikos) (2022-07-13) |
Re: Learning only one lexer made me blind to its hidden assumptions antispam@math.uni.wroc.pl (2022-07-13) |
Re: Learning only one lexer made me blind to its hidden assumptions gneuner2@comcast.net (George Neuner) (2022-07-14) |
Re: Learning only one lexer made me blind to its hidden assumptions 480-992-1380@kylheku.com (Kaz Kylheku) (2022-07-15) |
Re: Learning only one lexer made me blind to its hidden assumptions antispam@math.uni.wroc.pl (2022-07-15) |
From: | "Ev. Drikos" <drikosev@gmail.com> |
Newsgroups: | comp.compilers |
Date: | Wed, 13 Jul 2022 14:58:50 +0300 |
Organization: | Aioe.org NNTP Server |
References: | 22-07-006 |
Keywords: | lex, history |
Posted-Date: | 13 Jul 2022 11:23:44 EDT |
Content-Language: | en-US |
On 07/07/2022 20:49, Roger L Costello wrote:
> ...
> Difference:
> - Flex allows overlapping regexes. It is up to Flex to use the 'correct'
> regex. Flex has rules for picking the correct one: longest match wins, regex
> listed first wins.
> - ScanGen does not allow overlapping regexes. Instead, you create one regex
> and then, if needed, you create "Except" clauses. E.g., the token is an
> Identifier, except if the token is 'Begin' or 'End' or 'Read' or 'Write'
>
> ...
As you can imagine, there are many such options. A DFA builder may have
options a) to behave as Flex does, b) to treat only some tokens as reserved
and others as non-reserved, and c) to allow you to examine shorter matches.
Who knows what else is out there! (I don't claim to be an expert.)
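To make the two policies from the quoted text concrete, here is a minimal Python sketch. It is not how Flex or ScanGen are implemented; the rule names, the lowercase keywords, and the helper functions `flex_style_match` and `except_style_match` are all assumptions made up for illustration.

```python
import re

# Hypothetical Flex-style rule list: order matters, keywords come before IDENT.
FLEX_RULES = [
    ("BEGIN", r"begin"),
    ("END", r"end"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
]

def flex_style_match(text, pos=0):
    """Overlapping rules: longest match wins, ties go to the rule listed first."""
    best = None
    for name, pattern in FLEX_RULES:
        m = re.compile(pattern).match(text, pos)
        # Strict '>' keeps the earlier rule when two matches have equal length.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

# Hypothetical "Except" table: one identifier regex plus a list of exceptions.
RESERVED = {"begin": "BEGIN", "end": "END"}
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def except_style_match(text, pos=0):
    """Single non-overlapping rule; reserved words are filtered out afterwards."""
    m = IDENT_RE.match(text, pos)
    if m is None:
        return None
    lexeme = m.group()
    return (RESERVED.get(lexeme, "IDENT"), lexeme)

print(flex_style_match("begin"), flex_style_match("beginning"))
print(except_style_match("end"), except_style_match("ending"))
```

Both functions classify these inputs the same way; the difference is only in where the keyword decision is made, inside the automaton or in a lookup after the match.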
> Difference:
> - Flex deals with individual characters
> - ScanGen lumps characters into character classes and deals with classes. Use
> of character classes decreases (quite significantly) the size of the
> transition table
>
FYI, there is also a related controversial issue that may spark flames!
Bison doesn't support character classes either, and this could be one reason
that scannerless parsing sounds weird to many people. Of course, one
may use Bison down to the character level, but with many more states.
Also, if the grammar allows two consecutive identifiers, a lookahead
operator is likely necessary. (Admittedly, a better alternative to
scannerless parsing may be different start states, as supported by Flex.)
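The table-size point in the quoted text can be illustrated with a small Python sketch. It assumes a toy three-state DFA over 7-bit ASCII that recognizes identifiers and numbers; `build_table` and `compress` are hypothetical helpers, not ScanGen's actual algorithm. Characters whose transition-table columns are identical are merged into one class.

```python
def build_table():
    """Toy DFA: state 0 = start, 1 = in identifier, 2 = in number, -1 = no move."""
    table = [[-1] * 128 for _ in range(3)]
    for c in range(128):
        ch = chr(c)
        if ch.isalpha() or ch == "_":
            table[0][c] = 1
            table[1][c] = 1
        elif ch.isdigit():
            table[0][c] = 2
            table[1][c] = 1
            table[2][c] = 2
    return table

def compress(table):
    """Merge characters with identical columns into one character class."""
    class_of = []   # class_of[c] is the class index of character code c
    columns = {}    # distinct column -> class index
    for c in range(len(table[0])):
        column = tuple(row[c] for row in table)
        class_of.append(columns.setdefault(column, len(columns)))
    small = [[None] * len(columns) for _ in table]
    for column, k in columns.items():
        for state, target in enumerate(column):
            small[state][k] = target
    return class_of, small

table = build_table()
class_of, small = compress(table)
full_entries = len(table) * len(table[0])
small_entries = len(small) * len(small[0])
print(full_entries, "entries before,", small_entries, "after, plus a",
      len(class_of), "entry class map")
```

A lookup then becomes `small[state][class_of[ord(ch)]]` instead of `table[state][ord(ch)]`. The class map costs one extra indirection per character, but it is shared by every state, so the saving grows with the number of states.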
When I played in the past with a scannerless GLR parser for SQL, I didn't
see dramatic runtime slowdowns with a few single-line or multi-line commands.
Yet I wouldn't try (or suggest) such an approach for XML processing.
> ...