Re: compiling case insensitive regular expressions

"Armel" <armelasselin@hotmail.com>
Thu, 4 Nov 2010 13:03:36 +0100

          From comp.compilers

From: "Armel" <armelasselin@hotmail.com>
Newsgroups: comp.compilers
Date: Thu, 4 Nov 2010 13:03:36 +0100
Organization: les newsgroups par Orange
References: 10-11-004 10-11-006
Keywords: lex
Posted-Date: 04 Nov 2010 22:11:36 EDT

"glen herrmannsfeldt" <gah@ugcs.caltech.edu> wrote:
> Armel <armelasselin@hotmail.com> wrote:
>> I need to compile regular expressions which are case insensitive,
>> [..]
> Another way [...] is to supply a bit mask for
> each character being compared. Only bits with a '1' in the mask are
> used in the comparison.


I used that technique years ago to build an LL(1)-based Z80
disassembler/decompiler, to decode "structural bits" (vs. value bits) in
instructions, but I don't think it's applicable to a DFA-based regular
expression engine: before it knows which bits can be ignored, the DFA first
needs to determine a class for the character. It will simply amount to a
"tolower(c)" (or toupper(c)) called on each character of the input, just
expressed and coded in another way. The preparation phase still has to verify
that no state has two outgoing paths on 'a' and 'A', for example (as they
would become a single path once the mask is applied).
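
A minimal sketch of the two points above, assuming plain ASCII input. The helper names (is_ascii_letter, mask_for, masked_eq, labels_collide) are hypothetical and come from no particular engine. The mask can only drop the case bit (0x20) for letters, so choosing the mask already requires classifying the character, and two labels leaving the same state have to be checked for collision once masked:

```c
/* ASCII-only assumption: upper- and lower-case letters differ in bit 0x20. */
static int is_ascii_letter(unsigned char c) {
    unsigned char lower = (unsigned char)(c | 0x20);
    return lower >= 'a' && lower <= 'z';
}

/* Hypothetical helper: mask to use when comparing against label p.
 * Letters ignore the case bit; every other character compares exactly,
 * otherwise '@' and '`' (or '[' and '{') would wrongly match. */
static unsigned char mask_for(unsigned char p) {
    return is_ascii_letter(p) ? 0xDF : 0xFF;
}

/* Masked comparison of an input character against a transition label. */
static int masked_eq(unsigned char input, unsigned char p) {
    unsigned char m = mask_for(p);
    return (input & m) == (p & m);
}

/* Preparation-phase check: do two distinct labels leaving the same
 * state become a single path once the mask is applied? */
static int labels_collide(unsigned char a, unsigned char b) {
    return a != b && masked_eq(a, b);
}
```

For example, labels_collide('a', 'A') is true, so a state carrying both labels would have to merge them, while masked_eq('@', '`') stays false because neither character is a letter.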


That argues in favor of the tolower() solution.
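
A rough sketch of what the tolower() approach could look like, assuming ASCII and a table-driven DFA. To keep it short, the automaton here only recognizes a single literal word, standing in for a real compiled regular expression. The names fold, delta, build_literal_dfa and matches are made up for this illustration. The labels are folded once when the table is built, and each input character is folded through the same table before the lookup:

```c
#include <string.h>

#define MAX_STATES 32

static unsigned char fold[256];      /* ASCII-only case-folding table */
static int delta[MAX_STATES][256];   /* transition table, -1 = no move */

static void init_fold(void) {
    for (int c = 0; c < 256; c++)
        fold[c] = (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A'))
                                         : (unsigned char)c;
}

/* Build a DFA accepting exactly `word`, with labels folded at build time.
 * Returns the accepting state, or -1 if the word does not fit. */
static int build_literal_dfa(const char *word) {
    size_t n = strlen(word);
    if (n >= MAX_STATES)
        return -1;
    init_fold();
    for (int s = 0; s < MAX_STATES; s++)
        for (int c = 0; c < 256; c++)
            delta[s][c] = -1;
    for (size_t i = 0; i < n; i++)
        delta[i][fold[(unsigned char)word[i]]] = (int)i + 1;
    return (int)n;
}

/* Run the DFA, folding each input character before the table lookup. */
static int matches(const char *input, int accept) {
    int s = 0;
    for (const unsigned char *p = (const unsigned char *)input; *p; p++) {
        s = delta[s][fold[*p]];
        if (s < 0)
            return 0;
    }
    return s == accept;
}
```

Since only folded labels ever enter the table, 'a' and 'A' cannot end up as two separate paths out of a state, and the run-time cost is one extra table lookup per input character.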


Regards
Armel


