Re: tips for writing regular expression interpreter/compiler?

"mefrill" <mefrill@yandex.ru>
2 Dec 2005 20:23:11 -0500

From comp.compilers

Related articles
tips for writing regular expression interpreter/compiler? derekhaas@gmail.com (Derek Haas) (2005-11-26)
Re: tips for writing regular expression interpreter/compiler? jburgy@gmail.com (2005-11-30)
Re: tips for writing regular expression interpreter/compiler? rsc@swtch.com (Russ Cox) (2005-12-02)
Re: tips for writing regular expression interpreter/compiler? jeffrey.kenton@comcast.net (Jeff Kenton) (2005-12-02)
*Re: tips for writing regular expression interpreter/compiler? mefrill@yandex.ru (mefrill)* (2005-12-02)**
Re: tips for writing regular expression interpreter/compiler? markwh04@yahoo.com (2005-12-23)
Re: tips for writing regular expression interpreter/compiler? markwh04@yahoo.com (2005-12-23)
Re: tips for writing regular expression interpreter/compiler? rsc@swtch.com (Russ Cox) (2005-12-23)

| List of all articles for this month |

From:	"mefrill" <mefrill@yandex.ru>
Newsgroups:	comp.compilers
Date:	2 Dec 2005 20:23:11 -0500
Organization:	http://groups.google.com
References:	05-11-119
Keywords:	lex
Posted-Date:	02 Dec 2005 20:23:11 EST

re --> re_list
re_list --> re_term re_list | re_term
re_term --> re_atom | re_alternation | re_kleene
re_alternation --> re_atom | re_atom
re_kleene --> re_atom '*'
re_atom --> 'one of simbols except '*' symbol and '|' symbol'

By basing on this grammar you may build the parser. Define separate
routine for each grammar's nonterminal. So, you will have re(),
re_list(), re_term(), re_alternation(), re_kleene() and re_atom()
functions. You should call the functions according to the grammar's
rules and only if th next symbol is suited. I do not like to talk about
frst and follow sets here, but you have to understand simple thing:

for example, in re_term function:

bool re_term()
{
      bool result = re_atom();

      if( result )
      {
            char symbol = look_next_symbol();
            if( symbol == '|' ) // we have alternation
            {
                  result = re_alternation();
            }
            else if( symbol == '*' ) // kleene construction
            {
                  result = re_kleene();
            }

      return result;
}

bool re_alternation()
{
      char symbol = get_next_symbol();
      if( symbol != '|' ) return false;

      bool result = re_atom();
      return result;
}

bool re_kleene()
{
      char symbol = get_next_symbol();
      if( symbol != '*' ) return false;

      return true;
}

bool re_atom()
{
      char symbol = get_next_symbol();
      if( symbol == '*' || symbol == '|' ) return false; // working
symbols cannot used as atoms

      return true;
}

Of course, you will need to make this grammar more complex, but the
approach is the same.

Vladimir.

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: tips for writing regular expression interpreter/compiler?

"mefrill" <mefrill@yandex.ru>2 Dec 2005 20:23:11 -0500

"mefrill" <mefrill@yandex.ru>
2 Dec 2005 20:23:11 -0500