Having trouble converting composite variables to intermediate form

noitalmost <noitalmost@cox.net>
Sat, 22 Feb 2014 12:32:01 -0500

          From comp.compilers


From: noitalmost <noitalmost@cox.net>
Newsgroups: comp.compilers
Date: Sat, 22 Feb 2014 12:32:01 -0500
Organization: Compilers Central
Keywords: question
Posted-Date: 22 Feb 2014 14:42:43 EST

Quick overview:
I've been working on a Pascal-like language called Wirl, which includes an ILL
called Mezzo.
WirlSrc --> WirlAST --> MezzoAST --> Tripl --> binary
                            |
                            +--> CSrc
where MezzoAST can produce Tripl (i.e. 3-address code) or low-level C.
Currently, only the C target has been developed to any significant degree.
Mezzo syntax is like a low-level Pascal.

So as not to bore anyone with the details of Mezzo, I'll translate the problem
as going from Wirl to low-level C. That is, a subset of C with no typedefs,
structs, or unions. Also, only very simple expressions are allowed. It's not
as restrictive as 3-addr code, but close. In particular, built-up expressions
with parentheses aren't allowed.

The problem:
var :
        r : record
                        n : int;
                        a : array[3] of int;
                end;
        x : int;

x := r.a[2] + 5;

The compiler is implemented in C++. Below are some relevant classes. Many
details have been omitted. For brevity, consider all members as public.

class TypeDef {
public:
    // a data type
    string name();
};

class ArrayType : public TypeDef {
public:
    TypeDef* elemType();
    Expr* elemCnt(); // element count
};

class AstNode {
public:
    virtual void genMezzo(MezzoNode *parent);
};

class Assign : public AstNode {
public:
    Expr* lhs();
    Expr* rhs();
};

class Expr : public AstNode {
public:
    virtual MezzoExpr* genMezzoExpr(MezzoNode *parent);
    TypeDef* resultType();
};

class Ident : public Expr {
public:
    VarDef* varDef(); // variable definition that we reference
    // result type is type of the variable
};

class Op : public Expr {
public:
    char optype();
    Expr* lhs(); // left-hand side
    Expr* rhs(); // right-hand side
};

class RecAccess : public Expr {
public:
    Expr* record();
    Ident* field(); // the field we want to access
    // result type is type of field
};

class ArrAccess : public Expr {
public:
    Expr* array();
    Expr* idx(); // index
    // result type is array element type
};

After parsing, the Wirl AST fragment looks like:
    Assign
        Ident x
        Op +
            ArrAccess
                RecAccess
                    Ident r
                    Ident a
                Num 2
            Num 5

The trouble arises in the Mezzo generation, which traverses the Wirl AST
through genMezzo(MezzoNode *parent), which adds Mezzo nodes to parent.

void Assign::genMezzo(MezzoNode *parent) {
    // in the current case, parent is the statement list belonging
    // to the Mezzo program node

    MezzoExpr *mlhs = lhs()->genMezzoExpr(parent);
    MezzoExpr *mrhs = rhs()->genMezzoExpr(parent);

    parent->addStmt( new MezzoAssign(mlhs, mrhs) );
}

MezzoExpr* Op::genMezzoExpr(MezzoNode *parent) {
    MezzoExpr *mlhs = lhs()->genMezzoExpr(parent);
    MezzoExpr *mrhs = rhs()->genMezzoExpr(parent);

    return new MezzoOp(optype(), mlhs, mrhs);
}

MezzoExpr* RecAccess::genMezzoExpr(MezzoNode *parent) {
    // Mezzo has no record type, so we have to use ptr access

    MezzoExpr *mrec = record()->genMezzoExpr(parent);
    MezzoExpr *mfield = field()->genMezzoExpr(parent);

    // have to return addr of field?
}

MezzoExpr* ArrAccess::genMezzoExpr(MezzoNode *parent) {
    // Mezzo has array of fundamental types only
    // so sometimes, we could return a value
    // but if we have composite elem type, we have to return an address

    // In the current case, we have to return an address
    // because the array is a field of a record

    // Help, help. I'm in a mess!
}

For brevity, I've omitted the checks for complicated expressions that require
generation of temporaries, sanity checks, and the like.

Using C syntax for the Mezzo code, and assuming 4-byte ints, it should
generate something like:
#define IntSz 4
#define T0 (IntSz)
#define T1 (4 * 4)
char r[T0 + T1];
int x;

int main(void) {
    char *T2 = r;
    int T3 = IntSz; /* offset of a from r */
    int T4 = IntSz * 2; /* offset of a[2] from a */
    int T5 = T3 + T4; /* offset of a[2] from r */
    char *T6 = T2 + T5; /* addr we're looking for */
    int *T7 = (int*)T6;
    int T8 = *T7;
    x = T8 + 5;

    return 0;
}

Now how can I make my compiler do that?
The trouble I'm having has to do with the cast (in this case to int*) and the
following dereference. I don't know where to put them to get the correct
result. My code works okay for situations like
var r : record x,y : int end;
var a : array[7] of float;
The problem is when there's a combination of arrays and records.
