Related articles |
---|
tips for writing regular expression interpreter/compiler? derekhaas@gmail.com (Derek Haas) (2005-11-26) |
Re: tips for writing regular expression interpreter/compiler? jburgy@gmail.com (2005-11-30) |
Re: tips for writing regular expression interpreter/compiler? rsc@swtch.com (Russ Cox) (2005-12-02) |
Re: tips for writing regular expression interpreter/compiler? jeffrey.kenton@comcast.net (Jeff Kenton) (2005-12-02) |
Re: tips for writing regular expression interpreter/compiler? mefrill@yandex.ru (mefrill) (2005-12-02) |
Re: tips for writing regular expression interpreter/compiler? markwh04@yahoo.com (2005-12-23) |
Re: tips for writing regular expression interpreter/compiler? markwh04@yahoo.com (2005-12-23) |
Re: tips for writing regular expression interpreter/compiler? rsc@swtch.com (Russ Cox) (2005-12-23) |
From: | "mefrill" <mefrill@yandex.ru> |
Newsgroups: | comp.compilers |
Date: | 2 Dec 2005 20:23:11 -0500 |
Organization: | http://groups.google.com |
References: | 05-11-119 |
Keywords: | lex |
Posted-Date: | 02 Dec 2005 20:23:11 EST |
re --> re_list
re_list --> re_term re_list | re_term
re_term --> re_atom | re_alternation | re_kleene
re_alternation --> re_atom | re_atom
re_kleene --> re_atom '*'
re_atom --> 'one of simbols except '*' symbol and '|' symbol'
By basing on this grammar you may build the parser. Define separate
routine for each grammar's nonterminal. So, you will have re(),
re_list(), re_term(), re_alternation(), re_kleene() and re_atom()
functions. You should call the functions according to the grammar's
rules and only if th next symbol is suited. I do not like to talk about
frst and follow sets here, but you have to understand simple thing:
for example, in re_term function:
bool re_term()
{
bool result = re_atom();
if( result )
{
char symbol = look_next_symbol();
if( symbol == '|' ) // we have alternation
{
result = re_alternation();
}
else if( symbol == '*' ) // kleene construction
{
result = re_kleene();
}
return result;
}
bool re_alternation()
{
char symbol = get_next_symbol();
if( symbol != '|' ) return false;
bool result = re_atom();
return result;
}
bool re_kleene()
{
char symbol = get_next_symbol();
if( symbol != '*' ) return false;
return true;
}
bool re_atom()
{
char symbol = get_next_symbol();
if( symbol == '*' || symbol == '|' ) return false; // working
symbols cannot used as atoms
return true;
}
Of course, you will need to make this grammar more complex, but the
approach is the same.
Vladimir.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.