Re: multi-language parsing by using yacc

Cees Visser <ctv@cs.vu.nl>
Mon, 21 Aug 1995 13:22:20 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: Cees Visser <ctv@cs.vu.nl>
Keywords: yacc
Organization: Compilers Central
References: 95-08-097
Date: Mon, 21 Aug 1995 13:22:20 GMT

  :>
  :> In other words, can I create multi-instances of parsers by using yacc?
  :>


Yes. If you've already decided to use lex/yacc or flex/bison, you can
create multiple instances of a scanner or parser without needing to
know the internal mechanisms of these tools; only C++ issues are
involved.


What should be done for a scanner or parser is the following:


  (1) Look at the global variables in the _generated_ *.c output files
that have to be unique for (multiple) parallel running instances
of a scanner or parser, i.e. variables that need to be local to a new
C++ scanner/parser class.


  (2) Write a very simple (f)lex script that automatically removes these
variable definitions from the generated *.c files, and put the
definitions in a new Scanner/Parser class (header file).


  (3) Also, let this script remove the type declarations from the
generated *.c files. These type declarations should be added to the
corresponding new Scanner/Parser class.


  (4) Some global arrays (in the lex/yacc case) should be declared
static by this script.


The lex/yacc versions require a bit more work than the flex/bison
versions.


I've used lex/yacc in the past and currently use flex/bison. This simple
approach works well for both scanner/parser combinations.


The (f)lex script for turning the Bison output automatically into a
re-usable file is something like this (the [ \t]+ character class sits
outside the quotes, because inside a quoted string lex matches it
literally):


  ^"int"[ \t]+"yychar;" { /* remove ... */ }
  ^"YYSTYPE"[ \t]+"yylval;" { /* remove ... */ }
  ^"YYLTYPE"[ \t]+"yylloc;" { /* remove ... */ }
  yylex { printf ("%s::yylex", scannername); }
  ^"int"[ \t]+"yynerrs;" { /* remove ... */ }
  ^"int"[ \t]+"yydebug;" { /* remove ... */ }
  yyparse { printf ("%s::parse", parsername); }
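

For readers who would rather not write the filter in lex, here is a
minimal C++ sketch of the same per-line rewrite. It is an illustration,
not the script I actually used: the function name rewrite_line is
hypothetical, and it assumes the generated file is processed one line at
a time. Like the lex rules, it deletes only the matched definition and
leaves the rest of the line alone.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies the rewrite rules to one line of Bison
// output and returns the rewritten line.
std::string rewrite_line(const std::string &line,
                         const std::string &scannername,
                         const std::string &parsername)
{
    // Global definitions that move into the Scanner/Parser class.
    static const std::regex global_def(
        R"(^(int[ \t]+(yychar|yynerrs|yydebug)|YYSTYPE[ \t]+yylval|YYLTYPE[ \t]+yylloc);)");
    // Entry points that become qualified member names.
    static const std::regex lex_name(R"(\byylex\b)");
    static const std::regex parse_name(R"(\byyparse\b)");

    std::string out = std::regex_replace(line, global_def, "");
    out = std::regex_replace(out, lex_name, scannername + "::yylex");
    out = std::regex_replace(out, parse_name, parsername + "::parse");
    return out;
}
```

One difference from the lex rules: the sketch only renames yylex and
yyparse when they appear as whole identifiers, so a name such as
yylex_destroy is left untouched.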


The corresponding script to turn flex output automatically into a
re-usable class that can run in parallel with other scanners requires
more cut/paste work.


This approach allows you to have multiple scanner/parser combinations
for a single language system as well as for multiple language systems
to run concurrently within a single program context.
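

To illustrate why moving the globals into a class is enough, here is a
toy C++ sketch. SumParser and run_interleaved are hypothetical stand-ins,
not generated code: the class parses digit ('+' digit)* by hand, but it
keeps yychar and yynerrs per instance in the same way, so two parsers
can be advanced alternately without disturbing each other.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a generated parser: everything yacc/bison
// would keep in globals is a data member instead.
class SumParser {
    std::string input;      // text being parsed
    std::size_t pos;        // scanner position
    int yychar;             // current lookahead token
    int yynerrs;            // errors seen by this instance only
    int sum;                // semantic value built so far
    bool want_digit;        // parser state: digit or '+' expected next

    // Returns the next non-blank character, or 0 at end of input.
    int yylex() {
        while (pos < input.size() && input[pos] == ' ') ++pos;
        return pos < input.size()
            ? static_cast<unsigned char>(input[pos++]) : 0;
    }
public:
    explicit SumParser(const std::string &text)
        : input(text), pos(0), yychar(0), yynerrs(0),
          sum(0), want_digit(true) {}

    // Consumes one token; returns false once the input is exhausted.
    bool step() {
        yychar = yylex();
        if (yychar == 0) {
            if (want_digit) ++yynerrs;   // input ended before a digit
            return false;
        }
        if (want_digit && std::isdigit(yychar)) {
            sum += yychar - '0';
            want_digit = false;
        } else if (!want_digit && yychar == '+') {
            want_digit = true;
        } else {
            ++yynerrs;                   // unexpected token, skip it
        }
        return true;
    }
    int result() const { return sum; }
    int errors() const { return yynerrs; }
};

// Advances two parsers alternately, one token at a time.
void run_interleaved(SumParser &a, SumParser &b) {
    bool more_a = true, more_b = true;
    while (more_a || more_b) {
        if (more_a) more_a = a.step();
        if (more_b) more_b = b.step();
    }
}
```

Calling run_interleaved on two SumParser objects built from different
strings leaves each with its own sum and error count, which is exactly
what the global-variable version cannot do.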


To give you an idea of a possible Parser class, I've appended a _sketch_
of a Parser class definition.


In a real program you probably want to have four classes: one for the
aspects that are 100 percent related to the (yacc/bison) parser, one for
the (lex/flex) scanner, one for application specific parser issues, and
one for application specific scanner issues. In this case the
inheritance scheme looks like:


Parser::ApplicParser::Scanner::ApplicScanner


to give you a maximum of flexibility and re-usability.
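

A compact C++ sketch of such a four-class chain follows. It rests on an
assumption: the chain above is read from most derived to base, so that
the generated code can call the application specific helpers it
inherits. All members other than yychar, yynerrs, yylex and parse are
hypothetical, and the yylex/parse bodies are trivial stand-ins for the
generated ones.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Application specific scanner issues (hypothetical counter).
class ApplicScanner {
protected:
    int identifiers_seen;
    ApplicScanner() : identifiers_seen(0) {}
};

// Everything tied to the (f)lex output: scanner state plus yylex().
class Scanner : public ApplicScanner {
    std::string input;
    std::size_t pos;
protected:
    explicit Scanner(const std::string &text) : input(text), pos(0) {}
    int yylex() {            // stand-in for the generated scanner
        return pos < input.size()
            ? static_cast<unsigned char>(input[pos++]) : 0;
    }
};

// Application specific parser issues (hypothetical action helper).
class ApplicParser : public Scanner {
protected:
    explicit ApplicParser(const std::string &text) : Scanner(text) {}
    void note_identifier() { ++identifiers_seen; }
};

// Everything tied to the yacc/bison output: parser state plus parse().
class Parser : public ApplicParser {
    int yychar;
    int yynerrs;
public:
    explicit Parser(const std::string &text)
        : ApplicParser(text), yychar(0), yynerrs(0) {}
    int parse() {            // stand-in for the generated parser
        while ((yychar = yylex()) != 0)
            if (std::isalpha(yychar)) note_identifier();
        return yynerrs;
    }
    int identifiers() const { return identifiers_seen; }
};
```

With this layout only Scanner and Parser have to be regenerated when the
grammar changes; the two Applic classes are written by hand once.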


--ctv


  ========= ========= ========= ========= ========= =========


Parser class example:


#ifndef SYNSCAN_H
#define SYNSCAN_H 1

#include <stdio.h>      // FILE, fprintf
// The Scanner base class must be declared before this point.


class Parser : public Scanner {
private:


      // YYSTYPE yylval; IN LEXICAL SCANNER
      // YYLTYPE yylloc;


      int yychar;
      int yydebug;
      int yynerrs;


public:


      Parser (FILE *infile=0) : Scanner (infile)
      {
            yydebug = 0;
      }


      void params (int argc, char **argv) {
            int argi = argc;


            // ....
      }


      int parse ();


      int start ()
      {
            return parse ();
      }


      void yyerror (const char *msg)
      {
            (void) msg;       // the message text itself is not used
            fprintf (stderr, ">>> line %4d: syntax error\n", yy_line_count);
            fprintf (stderr, ">>> line %4d: near token : >>%s<<\n", yy_line_count, yytext);


            // ....
      }
};


#endif

