Having trouble converting composite variables to intermediate form

noitalmost <noitalmost@cox.net>
Sat, 22 Feb 2014 12:32:01 -0500

          From comp.compilers

From: noitalmost <noitalmost@cox.net>
Newsgroups: comp.compilers
Date: Sat, 22 Feb 2014 12:32:01 -0500
Organization: Compilers Central
Keywords: question
Posted-Date: 22 Feb 2014 14:42:43 EST

Quick overview:
I've been working on a Pascal-like language called Wirl, which includes an ILL
called Mezzo.
WirlSrc --> WirlAST --> MezzoAST --> Tripl --> binary
                           |
                           +--> CSrc
where MezzoAST can produce Tripl (i.e. 3-address code) or low-level C.
Currently, only the C target has been developed to any significant degree.
Mezzo syntax is like a low-level Pascal.


So as not to bore anyone with the details of Mezzo, I'll translate the problem
as going from Wirl to low-level C. That is, a subset of C with no typedefs,
structs, or unions. Also, only very simple expressions are allowed. It's not
as restrictive as 3-addr code, but close. In particular, built-up expressions
with parentheses aren't allowed.


The problem:
var :
    r : record
            n : int;
            a : array[3] of int;
        end;
    x : int;
end;


x := r.a[2] + 5;
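
Since the target has no record type, every field access reduces to a byte offset from the start of r. The following is a minimal sketch of that bookkeeping, assuming 4-byte ints and no padding between fields. FieldLayout, RecordLayout, and layoutOfR are hypothetical names for illustration and are not part of the Wirl or Mezzo sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper types; none of these names come from the Wirl sources.
struct FieldLayout {
    std::string name;
    std::size_t size;    // size of the field in bytes
    std::size_t offset;  // byte offset from the start of the record
};

struct RecordLayout {
    std::vector<FieldLayout> fields;
    std::size_t size = 0;  // running total in bytes (assumes no padding)

    // Place a new field directly after the previous one.
    void add(const std::string& name, std::size_t bytes) {
        fields.push_back({name, bytes, size});
        size += bytes;
    }

    // Return the byte offset of a field; asserts that the field exists.
    std::size_t offsetOf(const std::string& name) const {
        for (const FieldLayout& f : fields)
            if (f.name == name) return f.offset;
        assert(false && "unknown field");
        return 0;
    }
};

// Layout of: record n : int; a : array[3] of int; end
inline RecordLayout layoutOfR() {
    const std::size_t intSz = 4;  // assumption: 4-byte ints
    RecordLayout r;
    r.add("n", intSz);
    r.add("a", 3 * intSz);
    return r;
}
```

With this layout, a starts 4 bytes into r, so r.a[2] sits at byte offset 4 + 2 * 4 = 12.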


The compiler is implemented in C++. Below are some relevant classes. Many
details have been omitted. For brevity, consider all members as public.


class TypeDef {
    // a data type
    string name();
};


class ArrayType : public TypeDef {
    TypeDef* elemType();
    Expr* elemCnt(); // element count
};


class AstNode {
    virtual void genMezzo(MezzoNode *parent);
};


class Assign : public AstNode {
    Expr* lhs();
    Expr* rhs();
};


class Expr : public AstNode {
    virtual MezzoExpr* genMezzoExpr(MezzoNode *parent);
    TypeDef* resultType();
};


class Ident : public Expr {
    VarDef* varDef(); // variable definition that we reference
    // result type is type of the variable
};


class Op : public Expr {
    char optype();
    Expr* lhs(); // left-hand side
    Expr* rhs(); // right-hand side
};


class RecAccess : public Expr {
    Expr* record();
    Ident* field(); // the field we want to access
    // result type is type of field
};


class ArrAccess : public Expr {
    Expr* array();
    Expr* idx(); // index
    // result type is array element type
};




After parsing, the Wirl AST fragment looks like:
Assign
    Ident x
    Op +
        RecAccess
            Ident r
            ArrAccess
                Ident a
                Num 2
        Num 5


The trouble arises in the Mezzo generation, which traverses the Wirl AST
through genMezzo(MezzoNode *parent), which adds Mezzo nodes to parent.


void Assign::genMezzo(MezzoNode *parent) {
    // in the current case, parent is the statement list belonging
    // to the Mezzo program node

    MezzoExpr *mlhs = lhs()->genMezzoExpr(parent);
    MezzoExpr *mrhs = rhs()->genMezzoExpr(parent);

    parent->addStmt( new MezzoAssign(mlhs, mrhs) );
}


MezzoExpr* Op::genMezzoExpr(MezzoNode *parent) {
    MezzoExpr *mlhs = lhs()->genMezzoExpr(parent);
    MezzoExpr *mrhs = rhs()->genMezzoExpr(parent);

    // use the node's own operator rather than a hard-coded '+'
    return new MezzoOp(optype(), mlhs, mrhs);
}


MezzoExpr* RecAccess::genMezzoExpr(MezzoNode *parent) {
    // Mezzo has no record type, so we have to use ptr access

    MezzoExpr *mrec = record()->genMezzoExpr(parent);
    MezzoExpr *mfield = field()->genMezzoExpr(parent);

    // have to return addr of field?
}


MezzoExpr* ArrAccess::genMezzoExpr(MezzoNode *parent) {
    // Mezzo has arrays of fundamental types only,
    // so sometimes we could return a value,
    // but if we have a composite elem type, we have to return an address

    // In the current case, we have to return an address
    // because parent is a record

    // Help, help. I'm in a mess!
}


For brevity, I've omitted the checks for complicated expressions that require
generation of temporaries, sanity checks, and the like.


Using C syntax for the Mezzo code, and assuming 4-byte ints, it should
generate something like:
#define IntSz 4
#define T0 (IntSz)
#define T1 (3 * IntSz) /* array[3] of int */
char r[T0 + T1];
int x;


int main(void) {
    char *T2 = r;
    int T3 = IntSz; /* offset of a from r */
    int T4 = IntSz * 2; /* offset from a */
    int T5 = T3 + T4; /* offset from r */
    char *T6 = T2 + T5; /* addr we're looking for */
    int *T7 = (int*)T6;
    int T8 = *T7;
    x = T8 + 5;


    return 0;
}
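
As a sanity check on the offset arithmetic above, here is a small self-contained C++ sketch that stores a value at the computed offset of r.a[2] and reads it back. The helper names storeInt, loadInt, and evalExample are hypothetical. Note one swapped-in technique: it uses std::memcpy for the load and store instead of the (int*) cast, because a plain char array is not guaranteed to be suitably aligned for int access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t IntSz = sizeof(int);

// Store `value` into the int that lives `offset` bytes into `base`.
inline void storeInt(unsigned char* base, std::size_t offset, int value) {
    std::memcpy(base + offset, &value, sizeof value);
}

// Load the int that lives `offset` bytes into `base`.
inline int loadInt(const unsigned char* base, std::size_t offset) {
    int v;
    std::memcpy(&v, base + offset, sizeof v);
    return v;
}

// Mirrors  x := r.a[2] + 5  on a flat record buffer.
inline int evalExample(int a2) {
    unsigned char r[4 * sizeof(int)] = {};  // n followed by a[0..2]
    std::size_t offA  = IntSz;              // offset of a from r    (T3)
    std::size_t offEl = IntSz * 2;          // offset of a[2] from a (T4)
    std::size_t off   = offA + offEl;       // offset from r         (T5)
    storeInt(r, off, a2);                   // pretend r.a[2] was set earlier
    return loadInt(r, off) + 5;             // T8 + 5
}
```

Calling evalExample(37) yields 42, which is what the Wirl statement should compute when r.a[2] holds 37.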


Now how can I make my compiler do that?
The trouble I'm having has to do with the cast (in this case to int*) and the
following dereference. I don't know where to put them to get the correct
result. My code works okay for situations like
var r : record x,y : int end;
var a : array[7] of float;
The problem is when there's a combination of arrays and records.
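
One common way to settle where the cast and dereference belong is to give every expression node two generators: one that yields the address of the object and one that yields its value. Record and array accesses then only ever compute addresses, and the cast plus load happens exactly once, in the value generator, at the point where a scalar is needed. The sketch below illustrates that idea under several assumptions: Emitter, Node, Var, Field, Index, and lowerExample are hypothetical names rather than Wirl or Mezzo classes, field offsets and the index are compile-time constants, every loaded value is an int, and the tree is shaped as Index(Field(r, a), 2) instead of the nesting shown in the AST fragment.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not the Wirl/Mezzo code.
struct Emitter {
    std::vector<std::string> code;
    int next = 0;
    std::string temp() { return "T" + std::to_string(next++); }
    void emit(const std::string& line) { code.push_back(line); }
};

struct Node {
    virtual ~Node() = default;
    // Emit code computing the ADDRESS of this expression as a char*.
    virtual std::string genAddr(Emitter& e) const = 0;
    // Emit code computing the VALUE: take the address, cast once, load once.
    virtual std::string genValue(Emitter& e) const {
        std::string addr = genAddr(e);
        std::string ptr = e.temp();
        e.emit("int *" + ptr + " = (int*)" + addr + ";");
        std::string val = e.temp();
        e.emit("int " + val + " = *" + ptr + ";");
        return val;
    }
};

struct Var : Node {  // a composite variable stored as a char array
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
    std::string genAddr(Emitter& e) const override {
        std::string t = e.temp();
        e.emit("char *" + t + " = " + name + ";");
        return t;
    }
};

struct Field : Node {  // record field at a fixed byte offset
    std::unique_ptr<Node> rec;
    int offset;
    Field(std::unique_ptr<Node> r, int off) : rec(std::move(r)), offset(off) {}
    std::string genAddr(Emitter& e) const override {
        std::string base = rec->genAddr(e);
        std::string t = e.temp();
        e.emit("char *" + t + " = " + base + " + " + std::to_string(offset) + ";");
        return t;
    }
};

struct Index : Node {  // array element with a constant index
    std::unique_ptr<Node> arr;
    int index, elemSize;
    Index(std::unique_ptr<Node> a, int i, int sz)
        : arr(std::move(a)), index(i), elemSize(sz) {}
    std::string genAddr(Emitter& e) const override {
        std::string base = arr->genAddr(e);
        std::string t = e.temp();
        e.emit("char *" + t + " = " + base + " + " +
               std::to_string(index * elemSize) + ";");
        return t;
    }
};

// Lowers  x := r.a[2] + 5  assuming 4-byte ints and a at offset 4.
inline std::vector<std::string> lowerExample() {
    std::unique_ptr<Node> r(new Var("r"));
    std::unique_ptr<Node> ra(new Field(std::move(r), 4));
    Index ra2(std::move(ra), 2, 4);
    Emitter e;
    std::string v = ra2.genValue(e);
    e.emit("x = " + v + " + 5;");
    return e.code;
}
```

In this arrangement the right-hand side of an assignment asks for genValue, while an enclosing record or array access asks its child for genAddr, so no node has to guess whether its parent wants a pointer or a loaded value.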

