Re: YACC, going the other way

jimad@microsoft.UUCP (Jim ADCOCK)
26 Apr 91 19:35:11 GMT

          From comp.compilers

Related articles
YACC, going the other way elk@cblpn.att.com (1991-04-15)
Re: YACC, going the other way zeil@cs.odu.edu (1991-04-23)
Re: YACC, going the other way carlton@aldebaran.Berkeley.EDU (1991-04-23)
Re: YACC, going the other way wunder@hpsdel.sde.hp.com (Walter Underwood) (1991-04-24)
Re: YACC, going the other way zvr@ntua.gr (1991-04-25)
Re: YACC, going the other way jimad@microsoft.UUCP (1991-04-26)
Re: YACC, going the other way ressler@cs.cornell.edu (1991-05-01)
| List of all articles for this month |
Newsgroups: comp.compilers
From: jimad@microsoft.UUCP (Jim ADCOCK)
Keywords: yacc, testing
Organization: Microsoft Corp., Redmond WA
References: <1991Apr23.140427.5416@iecc.cambridge.ma.us>
Date: 26 Apr 91 19:35:11 GMT

In article <1991Apr23.140427.5416@iecc.cambridge.ma.us> elk@cblpn.att.com (Edwin Lewis King +1 614 860 3394) writes:
>I'm interesting in generating strings that are described by a BNF (OK,
>YACC) grammar.


I don't know aobut "cleanly and efficiently", but, in theory, it's a lot
easier to generate examples of strings from a grammar than to recognize
them. Below is a fragment out of a program I wrote to generate "C++" -
like strings [I wrote one version of the program based on Stroustrup's
published grammar, and one version based on Roskind's]


The only mysterious part of the fragment is the switch statements


switch(W0) and
switch(R5)


"W0" for example is a macro that expands to randomly generate a 1/0 int
-- weighted to more likely choose 0 than 1.


"R5" is a macro that expands to one of the numbers 0-5, equally weighted.


The real problem in randomly generating strings like this is that the
production rules tend to form "feedback loops" such that some productions
are much more likely to be produced repeatedly than others. Hence the
hand generated "tweaks" to try to weight some cases more than others.


Also, remember than generating some strings based on grammars used in C++
compilers is a very very different thing than trying to automatically
generate "C++ code." Only part of the problem relates to the use of
fed-back type info in C/C++. Here's some examples of such productions:


....


/***********************/
struct TSH ;


/***********************/
volatile & Tq ::
  Tm :: operator
  Tm * const ( ) { auto operator
  Tr , ih ; struct Tr extern typedef
; int io ( ( signed )
- :: new float * ++ - ( const volatile
  Td
volatile
) short
( ~ +
( volatile ) -- sizeof ++ --
sizeof
sizeof
sizeof
& TN ( * sizeof
( void short
void


....


----------- a fragment from the producing program:






....


void conditional_expression()
{
switch(W0)
{
case 0: logical_or_expression(); break;
case 1:
logical_or_expression();
Q();
expression();
COLON();
conditional_expression();
break;
}
}


void logical_or_expression()
{
switch(W0)
{
case 0: logical_and_expression(); break;
case 1:
logical_or_expression();
OROR();
logical_and_expression();
break;
}
}


void logical_and_expression()
{
switch(W0)
{
case 0: inclusive_or_expression(); break;
case 1:
logical_and_expression();
ANDAND();
inclusive_or_expression();
break;
}
}


void inclusive_or_expression()
{
switch(W0)
{
case 0: exclusive_or_expression(); break;
case 1:
inclusive_or_expression();
OR();
exclusive_or_expression();
break;
}
}


void exclusive_or_expression()
{
switch(W0)
{
case 0: and_expression(); break;
case 1:
exclusive_or_expression();
XOR();
and_expression();
break;
}
}


void and_expression()
{
switch(W0)
{
case 0: equality_expression(); break;
case 1:
and_expression();
AND();
equality_expression();
break;
}
}


void equality_expression()
{
switch(R5)
{
case 0:
case 3:
case 4:
relational_expression(); break;
case 1:
equality_expression();
EQEQ();
relational_expression();
break;
case 2:
equality_expression();
NOTEQ();
relational_expression();
break;
}
}


....
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.