Re: ASTs and Type Check Analysis

"Bartc" <>
Mon, 25 Aug 2008 22:19:32 GMT

          From comp.compilers

Related articles
ASTs and Type Check Analysis (CranCran77) (2008-08-20)
Re: ASTs and Type Check Analysis (Hans-Peter Diettrich) (2008-08-24)
Re: ASTs and Type Check Analysis (2008-08-24)
Re: ASTs and Type Check Analysis (CranCran77) (2008-08-25)
Re: ASTs and Type Check Analysis (Bartc) (2008-08-25)
Re: ASTs and Type Check Analysis (Hans-Peter Diettrich) (2008-08-26)
| List of all articles for this month |

From: "Bartc" <>
Newsgroups: comp.compilers
Date: Mon, 25 Aug 2008 22:19:32 GMT
Organization: Compilers Central
References: 08-08-044 08-08-054 08-08-064
Keywords: types, analysis
Posted-Date: 26 Aug 2008 23:24:39 EDT

"CranCran77" <> wrote in message
> On Aug 23, 6:11 pm, Hans-Peter Diettrich <> wrote:
> In the simple example I provided, I know my end result (without
> optimizations) would result in the following:
> Had the code been optimized by the compiler at compile time, the
> resulting code would have been:
> PUSHI 20
> But I'm not at the "optimization" stage just yet; that is still to
> come.

This 'optimisation' is actually fairly easy and is worth doing
anyway. You need to do your type analysis first, then call something
like Evaluate() on each node, which for a binary op:

Evaluate(left operand)
Evaluate(right operand)
If left and right operands are now immediate values, do the calculation and
convert this node to a terminal.

> The goal presently is to understand how I should handle type
> assignments to nodes within the AST tree.
> As you see in my example, I had several nodes that were
> CASTIntegerLiteral, but I question if that is the right approach or
> not. In my opinion, the parser which is creating the AST tree should
> not be responsible for determining the number type of Integer, Real,
> or Long. A simple CASTNumberLiteral node seems more appropriate to me
> and allow the type analysis check phase when traversing the tree
> determine whether its Integer, Real, or Long right?

Depends on your implementation language, which may store all these
differently (but all my parsers differentiate between integer and real
literals). How would you know otherwise whether the value is integer or

> If my thought process is correct, then when my type check visitor
> traverses the nodes of the tree, certain AST nodes need to be
> decorated with type information, such as Integer, Real, or Long.

Is this Basic language dynamically typed with regard to numeric values?
Because if you're using tagged variables that makes things awkward, unless
you forget about integer and real and just have a Numeric type.

> When
> examining expressions such as binary ones, both sides to the operation
> need to be examined to make sure they both conform to the same type,
> and if not handle that appropriately.

This can get tricky for complex expressions. For binary ops, traverse each
operand subtree, leaving the required type open. Then choose the dominant
type of the two, and insert a cast if needed to one operand (or generate a
type error if required). This would also have been done at lower levels. But
it means that in:

3.5*4 + 5*6

which ends up as real, the 5*6 term uses integer arithmetic, in this case
not a problem, but you may prefer that all subops are treated as real.
Perhaps then re-traverse the right-hand subtree and require the type to be

> For example, "PRINT 2.5*3" would throw an exception with type mismatch

I don't think your users would be happy with that.

> because 3 would be evaluated to integer where-as 2.5 would be viewed
> as a real. A different method could be that it would turn the
> statement into "PRINT 2.5*3.0" so that both sides are viewed as real
> values for the computation to happen correctly.
> In this case during the binary expression type check, the right side
> of the operation would become the child of another another AST node
> that handles converting the result into a given type by an internal
> function call. So the tree would resemble the following:
> CASTFunctionCall
> +- CASTSimpleName
> +- Name: PRINT
> +- CASTExprList
> +- List Size: 1
> +- CASTBinaryExpr
> +- Operation: '*'
> +- CASTNumberInteger
> +- Number: 2.5
> +- CASTFunctionCall
> +- CASTSimpleName
> +- Name: INT2REAL
> +- CASTExprList
> +- List Size: 1
> +- CASTNumberLiteral
> +- Number: 3
> I suppose what I see is that I could use a class object called
> CASTType with derived classes for Integer, Long, Real, and String.
> Each derived class would contain information about the size of the
> type for storage purposes and/or reference pointer for strings in the
> constant pool.
> So each CASTNumberLiteral and CASTBinaryExpr objects would be assigned
> a CASTType object during the type check phase. Each CASTFunctionCall
> I also believe would get assigned a type as well in keeping with the
> assembly example above.
> Am I on the right track?

I put your PRINT 2.5*3 into one of my (rather ancient) compilers, and it
produced this equivalent to ASTs:

After parsing:
0005: Sysfn <> #: 1 [This is PRINT]
0005: Opc <> Mul
0005: Value <Real*8> [Val:Real*8] 2.500000
0005: Value <Int*1> [Val:Int*4] 3

After type analysis:
0005: Sysfn <> #: 1
0005: Opc <Real*8> Mul
0005: Value <Real*8> [Val:Real*8] 2.500000
0005: Value <Real*8> [Val:Real*8] 3.000000

After evaluating constants:
0005: Sysfn <> #: 1
0005: Value <Real*8> [Val:Real*8] 7.500000

No casts are used here, but using instead PRINT X*I (X is real, I is
integer) it gives:

After parsing:
0007: Sysfn <> #: 1
0007: Opc <> Mul
0007: Value <Real*8> [Val:Var](x 3C520AH)
0007: Value <Int*4> [Val:Var](i 3C5256H)

After type analysis:
0007: Sysfn <> #: 1
0007: Opc <Real*8> Mul
0007: Mem <Real*8> [Val:Var](x 3C520AH)
0007: Softconv <Real*8> [ie. a Cast]
0007: Mem <Int*4> [Val:Var](i 3C5256H)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.