Related articles |
---|
A fast way to match regexp's and extract parts of them roberto@cernvax.cern.ch (1991-10-04) |
Re: A fast way to match regexp's and extract parts of them megatest!djones@decwrl.dec.com (1991-10-07) |
Re: A fast way to match regexp's and extract parts of them sra@ecs.soton.ac.uk (1991-10-08) |
Newsgroups: | comp.compilers |
From: | roberto@cernvax.cern.ch (roberto bagnara) |
Keywords: | lex, question, DFA |
Organization: | CERN, Geneva, Switzerland |
Date: | 4 Oct 91 09:46:43 GMT |
Hi all,
I'am facing the following problem.
DEFINITION (by examples):
------------------------
Let's call `extended regular expressions' a syntactic object of the form:
a{b*c}d{[0-9]*}
_{[A-Za-z_][A-Za-z0-9_]*}
where braces '{' and '}' are used to delimit regular sub-expressions.
THE PROBLEM
-----------
Let's have a set of such `extended regular expressions' (ERE), each of these
being associated to an `action'. That `action' has as many parameters
as there are '{}'-enclosed sub-expression in the associated ERE.
For example:
RE1) a{b*c}d{[0-9]*} action: foo($1, $2)
RE2) _{[A-Za-z_][A-Za-z0-9_]*} action: enter_priv_symbol($1)
I need an `acceptor' that, having recognized the occurrence of a string
matching an ERE, would "call" the associated `action' providing the
parameters extracted from the matching string.
For example, the input
"abbcd12" would result in a `call foo("bbc", "12")'.
For my application the execution speed of the acceptor is *very* important.
Now I'm using a DFA (to recognize patterns) and the GNU regexp package
(to extract the sub-pattern to be passed as parameters). This solution
is not as fast as I'd like it to be.
Finally, THE QUESTIONS ARE:
1) Is it possible to enrich a DFA with enough information to make it do
the job (partially, at least)?
In other words, is it possible to avoid analyzing the regular expressions
twice?
2) How?
3) If the answer to 1) is *no*, other ideas?
Thanks a lot
Roberto
P.S. Please respond by EMail, I'll summarize on request.
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.