From: cliffc@ami.sps.mot.com (Cliff Click)
Newsgroups: comp.compilers
Date: 27 Aug 1996 23:52:35 -0400
Organization: none
References: 96-08-078
Keywords: optimize
gclind01@starbase.spd.louisville.edu (George C. Lindauer) writes:
> Another thing is, one day I coded a bunch of routines to make dags out of the
> icode for basic blocks. Then I was aghast at the code it took, and I was
> aghast at the run-time memory requirements. The only nice thing I could
> find about it was once the dag is constructed you don't have to go
> node-searching. Unfortunately to construct the dag you DO have to go node
> searching.
Building a DAG takes linear time and is very fast. It takes very
little code as well. I suspect your implementation could be improved.
> I eventually gave up on dags and settled on a linear search
> through the basic block to detect common subexpressions. Less code, less
> memory, and hopefully only a little more time. Does anyone have a feel for
> the relative time for a linear search for reuse versus a search through the
> DAG during dag creation?
"Linear searching" will take O(n^2) time (where n is the size of the
basic block). DAG building gives you use-def chains, which removes
the extra factor of n. This lets local common subexpression
elimination run in linear time as well. Like DAG building, it can be
made to run very fast. You can collapse the two passes over your
basic block, and do DAG building, CSE, constant folding & algebraic
identity handling in a single pass.
You need a hash table mapping names to nodes. As we make our 1 pass
through the block, this hash table will map a name to the most recent
definition of the name. Return NULL on a miss, or the node pointer on
a hit. Make the table about 25% larger than the number of instructions in the
the block, then round up to a power of two. Probe with any odd
number. You should have no trouble tuning this to get an average
number of probes per lookup less than 1.2 (you always need 1).
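As a minimal C sketch of such a name table (assuming names are small integer
symbol indices; all identifiers, sizes, and helpers here are illustrative, not
from any particular compiler):

#include <stdlib.h>

typedef struct Node Node;           /* DAG node; layout sketched further below */

/* Open-addressed table mapping a name to its most recent defining node.    */
/* Size is a power of two; the probe stride is forced odd, so the probe     */
/* sequence visits every slot before repeating.                             */
typedef struct {
  int      *name;                   /* name (symbol index) in each slot, -1 if empty */
  Node    **node;                   /* node that most recently defined that name     */
  unsigned  mask;                   /* table size - 1, size a power of two           */
} NameTab;

static unsigned name_hash( int name ) { return (unsigned)name * 2654435761u; }

NameTab *new_name_tab( int ninsts ) {
  unsigned size = 2;
  while( size < (unsigned)(ninsts + ninsts/4) ) size <<= 1;  /* ~25% slack, power of 2 */
  NameTab *t = malloc( sizeof(NameTab) );
  t->name = malloc( size * sizeof(int) );
  t->node = calloc( size, sizeof(Node*) );
  for( unsigned i = 0; i < size; i++ ) t->name[i] = -1;
  t->mask = size - 1;
  return t;
}

Node *lookup_names_to_nodes( NameTab *t, int name ) {
  unsigned idx    = name_hash(name) & t->mask;
  unsigned stride = (name_hash(name) | 1) & t->mask;  /* any odd number works */
  while( t->name[idx] != -1 ) {             /* first empty slot means a miss */
    if( t->name[idx] == name ) return t->node[idx];
    idx = (idx + stride) & t->mask;
  }
  return NULL;                              /* miss: name not yet defined in this block */
}

void insert_names_to_nodes( NameTab *t, int name, Node *n ) {
  unsigned idx    = name_hash(name) & t->mask;
  unsigned stride = (name_hash(name) | 1) & t->mask;
  while( t->name[idx] != -1 && t->name[idx] != name )
    idx = (idx + stride) & t->mask;
  t->name[idx] = name;                      /* most recent definition wins */
  t->node[idx] = n;
}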
You need a hash table mapping "value number" to nodes. The key in
this case is opcode+input-pointers and the result is a node with that
opcode and input pointers. This table lets you find CSE's quickly.
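A sketch of the key computation, assuming the operand slots have already been
rewritten to node pointers (again, illustrative code only):

#include <stdint.h>

typedef struct Node Node;    /* DAG node; layout sketched further below */

/* Key for the value-number table: opcode plus the operand node pointers.  */
/* Two instructions with the same opcode and the same input nodes compute  */
/* the same value, so the later one is a common subexpression.             */
unsigned value_hash( int op, Node *const *in, int cnt ) {
  unsigned h = (unsigned)op;
  for( int i = 0; i < cnt; i++ )
    h = h*31 + (unsigned)((uintptr_t)in[i] >> 3);  /* low pointer bits are mostly zero */
  return h;
}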
The single-pass requirement puts strong restrictions on how we
order the steps done on each instruction. If I is an instruction then
I has I.cnt operands, each referenced as I[0], I[1], ..., and I has
opcode I.op. I's name is I.def. Constants are treated as nil-ary
functions, so each individual constant is its own instruction and
opcode. Thus "3+x+5" is: "t1 = 3; t2 = t1+x; t3 = 5; t4 = t2+t3;".
Feel free to fold the constant instructions into machine-dependent
forms with the constants in the instructions.
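As a concrete sketch of that instruction/node layout in C (field names follow
the conventions above; the two-operand limit and the folded constant field are
my own simplifications):

/* One node per instruction; after operand rewriting the in[] pointers form */
/* the DAG and double as use-def chains.                                    */
typedef struct Node Node;
struct Node {
  int   op;       /* opcode; each constant gets a nil-ary constant opcode */
  int   def;      /* the name this instruction defines (I.def)            */
  int   cnt;      /* number of operands (I.cnt)                           */
  Node *in[2];    /* operand nodes I[0], I[1]; two is enough for a sketch */
  long  con;      /* constant value, used only by constant opcodes        */
};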
forall ptrs to instructions I do:
  name = I->def;             /* remember the name this instruction defines */
  for i = 0 to I->cnt-1 do:
    /* Find defining node for each operand name */
    nodep = lookup_names_to_nodes( I[i] );
    if( !nodep ) {           /* not in name table? */
      /* value is live-in, make a place-holder */
      nodep = create_bogus_live_in_node( I[i] );
      insert_names_to_nodes( I[i], nodep );  /* so later uses find the same node */
    }
    I[i] = nodep;            /* replace names with nodes */
  enddo
  if( all I[x] operands have constant opcodes )
    I->op = constant_fold(I);  /* replace with a constant opcode */
  switch( I->op ) {          /* handle special-case algebraic identities */
  case op_add:
    if( I[0]->op == op_zero )       /* add of a zero? */
      I = I[1];                     /* I becomes pointer to I[1] */
    else if( I[1]->op == op_zero )  /* add of a zero other way? */
      I = I[0];                     /* I becomes pointer to I[0] */
    break;
  case op_sub:
    if( I[0] == I[1] )              /* subtract of equivalents? */
      I->op = op_zero;              /* always zero */
    break;
  case ...                          /* skillion other interesting cases */
  default: break;                   /* do nothing special */
  }
  /* value-number I */
  tmp = lookup_value_number( I->op, I[0], I[1], ... );
  if( tmp )                  /* hit in table! */
    I = tmp;                 /* use prior instruction already in table */
  else                       /* else missed, insert new value number */
    insert_value_number( I->op, I[0], I[1], ..., I );
  /* map the defined name to the node now producing its value */
  insert_names_to_nodes( name, I );
enddo;
Here the DAG is built and we've done all manner of optimizations.
If we want names back we follow the use-def chains and read the
def field of the input instruction.
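For example, reusing the Node sketch from above (the printed format is
illustrative):

#include <stdio.h>

/* Recover an operand's name: follow the use-def edge to the defining node */
/* and read its def field.                                                  */
void print_operand( const Node *use, int i ) {
  const Node *d = use->in[i];          /* operand i's defining node */
  printf( "t%d", d->def );             /* the name that node defines */
}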
There are endless further opportunities here; this is a real
bare-bones pass. But it should be extremely fast for you.
Cliff
--
Cliff Click, Ph.D. Compiler Researcher & Designer
RISC Software, Motorola PowerPC Compilers
cliffc@risc.sps.mot.com (512) 891-7240
http://members.aol.com/mjclick1