Tiger to C parser

the_hill_billy99@yahoo.com (jim)
28 Feb 2002 00:19:36 -0500

          From comp.compilers

Related articles
Tiger to C parser the_hill_billy99@yahoo.com (2002-02-28)
| List of all articles for this month |
From: the_hill_billy99@yahoo.com (jim)
Newsgroups: comp.compilers
Date: 28 Feb 2002 00:19:36 -0500
Organization: http://groups.google.com/
Keywords: translator
Posted-Date: 28 Feb 2002 00:19:36 EST

Need a few pointers on a little project im undertaking
its a tiger to c translator :




The translator takes a Tiger-- program as its input and either output
the equivalent program in C (which should be accepted by the GNU C
compiler - the C parser provided with JavaCC might also help you
here), or report the first instance of an error. The syntax of the
Tiger language has been modified for Tiger-- to make this translation
process more straightforward (in fact, many of the given example Tiger
programs are not syntactically correct according to this modified
syntax). A parser for Tiger-- can be found here*.(see very end) You
should extend this to include semantic actions which will generate the
equivalent C code. In addition to the extra syntactic constraints
which have been placed on the full Tiger language, there are some
additional semantic constraints as detailed below. If the semantic
errors in Tiger-- also constitute semantic errors in the translated
programs in C, then you can leave them for the GNU C compiler to
detect. Otherwise, you should flag these errors within your
translator.


Syntax of Tiger--
Prog ® let OuterDecList in ValuelessExpList end
ValuelessExp ® id(ArgList) | LValue := Exp | if LogicalOrExp then
ValuelessExp | if LogicalOrExp then ValuelessExp else ValuelessExp |
while LogicalOrExp do ValuelessExp | for id := AdditiveExp to
AdditiveExp do ValuelessExp | break | let InnerDecList in
ValuelessExpList end | (ValuelessExpList)
ValueExp ® RValue | let InnerDecList in ValueExpList end |
(ValueExpList)
OuterDecList ® OuterDec OuterDeclist | e
OuterDec ® TyDec | VarDec | FunDec
InnerDecList ® InnerDec InnerDecleist | e
InnerDec ® TyDec | VarDec
TyDec ® type id = Ty
Ty ® id | {FieldList} | array of id
VarDec ® var id:id:=Rvalue
FunDec ® function id(FieldList)=let ValuelessExp | function
id(FieldList):id=ValueExp
LValue ® id | LValue.id | LValue[Exp]
RValue ® id[num] of RValue | id{FieldExpList) | LogicalOrExp
LogicalOrExp ® LogicalAndExp | LogicalAndExp LogicalOrOp LogicalOrExp
LogicalAndExp ® RelationalExp | RelationalExp LogicalAndOp
LogicalAndExp
RelationalExp ® AdditiveExp | AdditiveExp RelationalOp AdditiveExp
AdditiveExp ® MultiplicativeExp | MultiplicativeExp AdditiveOp
AdditiveExp
MultiplicativeExp ® UnaryExp | UnaryExp MultiplicativeOp
MultiplicativeExp
UnaryExp ® PrimaryExp | UnaryExp | UnaryOp UnaryExp
PrimaryExp ® id(ArgList) | LValue | num | string | if LogicalOrExp
then RValue else RValue | (RValue)
ValuelessExpList ® ValuelessExp ValuelessExpList'
ValuelessExpList' ® ;ValuelessExp ValuelessExpList' | e
ValueExpList ® ValuelessExpList ; ValueExp | ValueExp
ArgList ® RValue ArgList' | e
ArgList' ® , RValue ArgList' | e
FieldList ® id:id FieldList' | e
FieldList' ® , id:id FieldList' | e
FieldExpList ® id = RValue FieldExpList' | e
FieldExpList' ® , id = RValue FieldExpList' | e
LogicalOrOp ® |
LogicalAndOp ® &
RelationalOp ® = | <> | < | <= | > | >=
AdditiveOp ® + | -
MultiplicativeOp ® * | /
UnaryOp ® -
Additional Notes on the Language Tiger--
Program Structure
A Tiger-- program consists of a let expression in which global
declarations are made, and the main program which consists of a series
of "valueless" expressions. Unlike in the full Tiger language, we
distinguish between expressions which return a value and those which
do not; we call these value expressions and valueless expressions
(these are normally called expressions and statements). This
facilitates the translation of Tiger-- into C.
A Tiger-- program with the following structure:


let OuterDecList in ValuelessExpList end
would be translated into the following in C:
(Translation of DecList)
int main()
{
      (Translation of ValuelessExpList)
}


Declarations
Tiger-- has three different kinds of declaration; type, variable and
function declaration. Type and variable declarations can be either
global or local, but function declarations can only be global. This is
to facilitate the translation into C, which does not allow nested
function definitions.
Type Declarations
The primitive data types in Tiger-- are int and string . These are
translated into the corresponding types int and char * in C.
User-defined data types in Tiger-- are translated into corresponding
typedef declarations in C. Unlike in the full Tiger language, there
are no dynamic data structures in Tiger--; thus the size of all data
structures must be known at compile-time. This places a few
restrictions on type declarations. Firstly, types cannot be recursive
or mutually recursive. Secondly, array types must be declared as
having a fixed size, unlike in the full Tiger language where the size
of an array is not determined until run-time. For example, the
following type declarations:
type type1 = int
type type2 = string
type type3 = array[3] of int
type type4 = array[10] of string
type type5 = {name:string, address:string, age: int}
type type6 = {person:type5, dates:type3}
type type7 = array[100] of type6
would be translated into the following in C;
typedef int type1;
typedef char * type2;
typedef int type3[3];
typedef char * type4[10];
typedef struct {char * name; char * address; int age;} type5;
typedef struct {type5 person; type3 dates;} type6;
typedef type6 type7[100];
Variable Declarations
Unlike in the full Tiger language, all variables in Tiger-- must be
given a type at declaration. Also, there is no implicit declaration of
loop variables; these must also be explicitly declared. All variables
must also be initialised at declaration. Arrays in Tiger-- are
initialised in the same way as arrays in the full Tiger language,
except that the dimension of the array in the initialisation must be a
constant. Unlike in the full Tiger language, records cannot be
initialised to nil; these must be initialised with an actual record
value. For example, the following variable declarations:
var a:type1 := 0
var b:type2 := ""
var c:type3 := type3[3] of 0;
var d:type5 := type5{name="aname", address="somewhere", age=0}
var e:type6 := type6{person=d,dates=type3[3] of 0}
would be translated into the following in C:
type1 a = 0;
type2 b = "";
type3 c = {0,0,0};
type5 d = {"aname","somewhere",0};
type6 e = {d,{0,0,0}};
Function Declarations
Functions in Tiger-- can be either procedures and functions. A
procedure has no specified return type, and its body is a valueless
expression. When translated into C, you should give procedures a
return type int. A function has a specified return type, and its body
is a value expression. This value expression must be returned using a
return statement when translated into C. You can assume that none of
the built-in functions of Tiger are included in Tiger--. For example,
the following function declarations in Tiger--:


function do_nothing1(a: int, b: string):int=
                (do_nothing2(a+1);0)
function do_nothing2(d: int) =
                do_nothing1(d, "str")


would be translated into the following in C:
int do_nothing1(int a,char * b)
{
      do_nothing2(a+1);
      return 0;
}
int do_nothing2(int d)
{
      do_nothing1(d,"str");
}


Valueless Expressions
Valueless expressions include procedure calls, assignments, if-then,
if-then-else, while and break. These can all be translated quite
straightforwardly into equivalent statements in C. For example, the
following valueless expression in Tiger--:
for r := 0 to N-1
      do if row[r]=0 & diag1[r+c]=0 & diag2[r+7-c]=0
            then (row[r]:=1;
                        diag1[r+c]:=1;
                        diag2[r+7-c]:=1;
                        col[c]:=r;
                        try(c+1);
                        row[r]:=0;
                        diag1[r+c]:=0;
                        diag2[r+7-c]:=0)
would be translated into the following in C:
for(r=0;r<N-1;r++)
      if (row[r]==0 && diag1[r+c]==0 && diag2[r+7-c]==0)
      {
            row[r] = 1;
            diag1[r+c] = 1;
            diag2[r+7-c] = 1;
            col[c] = r;
            try(c+1);
            row[r] = 0;
            diag1[r+c] = 0;
            diag2[r+7-c] = 0;
      }
Value Expressions
Value expressions include function calls, logical expressions,
relational expressions, arithmetic expressions, integers, strings,
record values, array values, lvalues and if-then-else expressions. The
translation of each of these into C is quite straightforward. Note
that within a value expression, if expressions must have an else .
These if expressions should be translated into conditional expressions
of the form e1 ? e2 : e3 in C. For example, the following value
expression in Tiger--:


if n = 0 then 1 else n * nfactor(n-1)


would be translated into the following in C:




n==0 ? 1 : n * nfactor(n-1)




*
options {
    JAVA_UNICODE_ESCAPE = true;
}


/********************************************/
/***** Standard parser class definition *****/
/********************************************/


PARSER_BEGIN(TigerParser)


public class TigerParser {


    public static void main(String args[]) {
        TigerParser parser;
        if (args.length == 0) {
            System.out.println("TigerParser: Reading from standard input .
. .");
            parser = new TigerParser(System.in);
        } else if (args.length == 1) {
            System.out.println("Tiger Parser: Reading from file " + args[0]
+ " . . .");
            try {
                parser = new TigerParser(new
java.io.FileInputStream(args[0]));
            } catch (java.io.FileNotFoundException e) {
                System.out.println("Tiger Parser: File " + args[0] + " not
found.");
                return;
            }


        } else {
            System.out.println("Tiger Parser: Usage is one of:");
            System.out.println(" java TigerParser < inputfile");
            System.out.println("OR");
            System.out.println(" java TigerParser inputfile");
            return;
        }
        try {
              parser.Prog();
              System.out.println("Tiger program parsed successfully.");
        } catch (ParseException e) {
              System.out.println(e.getMessage());
              System.out.println("Tiger parser: Encountered errors during
parse.");
        }
    }
}


PARSER_END(TigerParser)


/******************/
/***** TOKENS *****/
/******************/


TOKEN_MGR_DECLS :
{
    static int commentNesting = 0;
}


SKIP : /* WHITE SPACE */
{
    " "
| "\t"
| "\n"
| "\r"
| "\f"
}


SKIP : /* COMMENTS */
{
        "/*" { commentNesting++; } : IN_COMMENT
}


<IN_COMMENT> SKIP :
{
        "/*" { commentNesting++; }
    | "*/" { commentNesting--;
                      if (commentNesting == 0)
                            SwitchTo(DEFAULT);
                  }
    | <~[]>
}


TOKEN : /* RESERVED WORDS */
{
    < ARRAY : "array" >
| < BREAK : "break" >
| < DO : "do" >
| < ELSE : "else" >
| < END : "end" >
| < FOR : "for" >
| < FUNCTION : "function" >
| < IF : "if" >
| < IN : "in" >
| < LET : "let" >
| < NIL : "nil" >
| < OF : "of" >
| < THEN : "then" >
| < TO : "to" >
| < TYPE : "type" >
| < VAR : "var" >
| < WHILE : "while" >
}


TOKEN : /* OPERATORS */
{
    < AND : "&" >
| < ASSIGN : ":=" >
| < DIV : "/" >
| < EQUALS : "=" >
| < GT : ">" >
| < GTE : ">=" >
| < LT : "<" >
| < LTE : "<=" >
| < MINUS : "-" >
| < MULT : "*" >
| < NEQ : "<>" >
| < OR : "|" >
| < PLUS : "+" >
}


TOKEN : /* PUNCTUATION */
{
    < COLON : ":" >
| < COMMA : "," >
| < DOT : "." >
| < LBR : "(" >
| < LCURL : "{" >
| < LSQ : "[" >
| < RBR : ")" >
| < RCURL : "}" >
| < RSQ : "]" >
| < SEMIC : ";" >
}


TOKEN : /* IDENTIFIERS AND LITERALS */
{
    < ID : <LETTER> (<LETTER>|<DIGIT>|"_")* >
| < INTEGER : (<DIGIT>)+ >
| < #LETTER : ["A"-"Z", "a"-"z"] >
| < #DIGIT : ["0"-"9"] >
| < STRING : "\"" (<ESC>|<PRINTABLE_CHAR>|<SPLIT_LINE>)* "\"" >
| < #ESC : "\\" (
                                    ["n", "t", "\\", "\""]
                                | <DIGIT><DIGIT><DIGIT>
                                | "^"["a"-"z"]
                                ) >
| < #PRINTABLE_CHAR : ~["\\","\""] >
| < #SPLIT_LINE : "\\" ([" ","\t","\n","\r"])+ "\\" >
}


/*************************/
/***** TIGER GRAMMAR *****/
/*************************/


void Prog() : {}
{
    <LET> OuterDecList() <IN> ValuelessExpList() <END> <EOF>
}


void ValueExp() : {}
{
    <LET> InnerDecList() <IN> ValueExpList() <END>
| LOOKAHEAD(<LBR> ValuelessExp()) <LBR> ValueExpList() <RBR>
| RValue()
}


void ValuelessExp() : {}
{
    LOOKAHEAD(<ID> <LBR>) <ID> <LBR> [ArgList()] <RBR>
| LValue() <ASSIGN> RValue()
| <WHILE> LogicalOrExp() <DO> ValuelessExp()
| <IF> LogicalOrExp() <THEN> ValuelessExp() [LOOKAHEAD(1) <ELSE>
ValuelessExp()]
| <FOR> <ID> <ASSIGN> AdditiveExp() <TO> AdditiveExp() <DO>
ValuelessExp()
| <LET> InnerDecList() <IN> ValuelessExpList() <END>
| <LBR> ValuelessExpList() <RBR>
| <BREAK>
}


void RValue() : {}
{
    LOOKAHEAD(<ID> <LSQ> <INTEGER> <RSQ> <OF>) <ID> <LSQ> <INTEGER>
<RSQ> <OF> RValue()
| LOOKAHEAD(<ID> <LCURL>) <ID> <LCURL> FieldExpList() <RCURL>
| LogicalOrExp()
}


void LogicalOrExp() : {}
{
    LogicalAndExp() [LOOKAHEAD(1) <OR> LogicalOrExp()]
}


void LogicalAndExp() : {}
{
    RelationalExp() [LOOKAHEAD(1) <AND> LogicalAndExp()]
}


void RelationalExp() : {}
{
    AdditiveExp() [LOOKAHEAD(1) (<EQUALS> | <NEQ> | <GT> | <LT> | <GTE>
| <LTE>) AdditiveExp()]
}


void AdditiveExp() : {}


{
    MultiplicativeExp() [LOOKAHEAD(1) (<PLUS> | <MINUS>) AdditiveExp()]
}


void MultiplicativeExp() : {}
{
    UnaryExp() [LOOKAHEAD(1) (<MULT> | <DIV>) MultiplicativeExp()]
}


void UnaryExp() : {}
{
    [<MINUS>] PrimaryExp()
}


void PrimaryExp() : {}
{
    LOOKAHEAD(<ID> <LBR>) <ID> <LBR> [ArgList()] <RBR>
| LValue()
| <INTEGER>
| <STRING>
| <IF> LogicalOrExp() <THEN> RValue() <ELSE> RValue()
| <LBR> RValue() <RBR>
}


void LValue() : {}
{
    <ID> (<LSQ> AdditiveExp() <RSQ> | <DOT> <ID>)*
}


void ValueExpList() : {}
{
    (LOOKAHEAD(ValuelessExp() <SEMIC>) ValuelessExp() <SEMIC>)*
ValueExp()
}


void ValuelessExpList() : {}
{
    ValuelessExp() (<SEMIC> ValuelessExp())*
}


void ArgList() : {}
{
    RValue() (<COMMA> RValue())*
}


void FieldExpList() : {}
{
    <ID> <EQUALS> RValue() (<COMMA> <ID> <EQUALS> RValue())*
}


void OuterDecList() : {}
{
    (OuterDec())*
}


void InnerDecList() : {}
{
    (InnerDec())*
}


void OuterDec() : {}
{
    <TYPE> <ID> <EQUALS> Type()
| <VAR> <ID> <COLON> <ID> <ASSIGN> RValue()
| <FUNCTION> <ID> <LBR> FieldList() <RBR> ( <COLON> <ID> <EQUALS>
ValueExp() | <EQUALS> ValuelessExp())
}


void InnerDec() : {}
{
    <TYPE> <ID> <EQUALS> Type()
| <VAR> <ID> <COLON> <ID> <ASSIGN> RValue()
}


void Type() : {}
{
    <ID>
| <LCURL> FieldList() <RCURL>
| <ARRAY> <LSQ> <INTEGER> <RSQ> <OF> <ID>
}


void FieldList() : {}
{
    [<ID> <COLON> <ID> (<COMMA> <ID> <COLON> <ID>)*]
}


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.