Re: YACC, going the other way

Walter Underwood <wunder@hpsdel.sde.hp.com>
Wed, 24 Apr 91 12:27:56 pdt

          From comp.compilers

Related articles
YACC, going the other way elk@cblpn.att.com (1991-04-15)
Re: YACC, going the other way zeil@cs.odu.edu (1991-04-23)
Re: YACC, going the other way carlton@aldebaran.Berkeley.EDU (1991-04-23)
Re: YACC, going the other way wunder@hpsdel.sde.hp.com (Walter Underwood) (1991-04-24)
Re: YACC, going the other way zvr@ntua.gr (1991-04-25)
Re: YACC, going the other way jimad@microsoft.UUCP (1991-04-26)
Re: YACC, going the other way ressler@cs.cornell.edu (1991-05-01)
| List of all articles for this month |

Newsgroups: comp.compilers
From: Walter Underwood <wunder@hpsdel.sde.hp.com>
Keywords: parse, testing
Organization: Compilers Central
References: <1991Apr23.140427.5416@iecc.cambridge.ma.us>
Date: Wed, 24 Apr 91 12:27:56 pdt

      I'm interesting in generating strings that are described by a BNF (OK,
      YACC) grammar.


A program to generate random sequences from a BNF grammar is described
in Chapter 17 of "The Icon Programming Language" by Griswold and
Griswold. An implementation of that comes with the Icon distribution.
It is called "rsg" for Random Sentence Generation. I've appended the
documentation from rsg.


Icon is available for anonymous FTP from cs.arizona.edu.


wunder


----------
############################################################################
#
# Name: rsg.icn
#
# Title: Generate randomly selected sentences from a grammar
#
# Author: Ralph E. Griswold
#
# Date: June 10, 1988
#
############################################################################
#
# This program generates randomly selected strings (``sen-
# tences'') from a grammar specified by the user. Grammars are
# basically context-free and resemble BNF in form, although there
# are a number of extensions.
#
# The program works interactively, allowing the user to build,
# test, modify, and save grammars. Input to rsg consists of various
# kinds of specifications, which can be intermixed:
#
# Productions define nonterminal symbols in a syntax similar to
# the rewriting rules of BNF with various alternatives consisting
# of the concatenation of nonterminal and terminal symbols. Gen-
# eration specifications cause the generation of a specified number
# of sentences from the language defined by a given nonterminal
# symbol. Grammar output specifications cause the definition of a
# specified nonterminal or the entire current grammar to be written
# to a given file. Source specifications cause subsequent input to
# be read from a specified file.
#
# In addition, any line beginning with # is considered to be a
# comment, while any line beginning with = causes the rest of that
# line to be used subsequently as a prompt to the user whenever rsg
# is ready for input (there normally is no prompt). A line consist-
# ing of a single = stops prompting.
#
# Productions: Examples of productions are:
#
# <expr>::=<term>|<term>+<expr>
# <term>::=<elem>|<elem>*<term>
# <elem>::=x|y|z|(<expr>)
#
# Productions may occur in any order. The definition for a nonter-
# minal symbol can be changed by specifying a new production for
# it.
#
# There are a number of special devices to facilitate the defin-
# ition of grammars, including eight predefined, built-in nontermi-
# nal symbols:
# symbol definition
# <lb> <
# <rb> >
# <vb> |
# <nl> newline
# <> empty string
# <&lcase> any single lowercase letter
# <&ucase> any single uppercase letter
# <&digit> any single digit
#
# In addition, if the string between a < and a > begins and ends
# with a single quotation mark, it stands for any single character
# between the quotation marks. For example,
#
# <'xyz'>
#
# is equivalent to
#
# x|y|z
#
# Generation Specifications: A generation specification consists of
# a nonterminal symbol followed by a nonnegative integer. An exam-
# ple is
#
# <expr>10
#
# which specifies the generation of 10 <expr>s. If the integer is
# omitted, it is assumed to be 1. Generated sentences are written
# to standard output.
#
# Grammar Output Specifications: A grammar output specification
# consists of a nonterminal symbol, followed by ->, followed by a
# file name. Such a specification causes the current definition of
# the nonterminal symbol to be written to the given file. If the
# file is omitted, standard output is assumed. If the nonterminal
# symbol is omitted, the entire grammar is written out. Thus,
#
# ->
#
# causes the entire grammar to be written to standard output.
#
# Source Specifications: A source specification consists of @ fol-
# lowed by a file name. Subsequent input is read from that file.
# When an end of file is encountered, input reverts to the previous
# file. Input files can be nested.
#
# Options: The following options are available:
#
# -s n Set the seed for random generation to n. The default
# seed is 0.
#
# -l n Terminate generation if the number of symbols remaining
# to be processed exceeds n. The default is limit is 1000.
#
# -t Trace the generation of sentences. Trace output goes to
# standard error output.
#
# Diagnostics: Syntactically erroneous input lines are noted but
# are otherwise ignored. Specifications for a file that cannot be
# opened are noted and treated as erroneous.
#
# If an undefined nonterminal symbol is encountered during gen-
# eration, an error message that identifies the undefined symbol is
# produced, followed by the partial sentence generated to that
# point. Exceeding the limit of symbols remaining to be generated
# as specified by the -l option is handled similarly.
#
# Caveats: Generation may fail to terminate because of a loop in
# the rewriting rules or, more seriously, because of the progres-
# sive accumulation of nonterminal symbols. The latter problem can
# be identified by using the -t option and controlled by using the
# -l option. The problem often can be circumvented by duplicating
# alternatives that lead to fewer rather than more nonterminal sym-
# bols. For example, changing
#
# <term>::=<elem>|<elem>*<term>
#
# to
#
# <term>::=<elem>|<elem>|<elem>*<term>
#
# increases the probability of selecting <elem> from 1/2 to 2/3.
#
# There are many possible extensions to the program. One of the
# most useful would be a way to specify the probability of select-
# ing an alternative.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.