Re: Graph coloring register allocation via iterative scan

Vladimir Makarov <vmakarov@redhat.com>
14 Feb 2006 10:23:13 -0500

          From comp.compilers

Related articles
Graph coloring register allocation via iterative scan avayvod@gmail.com (2006-02-03)
Re: Graph coloring register allocation via iterative scan robert.thorpe@antenova.com (Rob Thorpe) (2006-02-06)
Re: Graph coloring register allocation via iterative scan vmakarov@redhat.com (Vladimir Makarov) (2006-02-07)
Re: Graph coloring register allocation via iterative scan avayvod@gmail.com (Whywhat) (2006-02-11)
Re: Graph coloring register allocation via iterative scan vmakarov@redhat.com (Vladimir Makarov) (2006-02-14)
Re: Graph coloring register allocation via iterative scan sylerugby@yahoo.com (2006-02-17)
Re: Graph coloring register allocation via iterative scan Sid.Touati@inria.fr (Sid Touati) (2006-02-20)
Re: Graph coloring register allocation via iterative scan u.hobelmann@web.de (Ulrich Hobelmann) (2006-02-24)
Re: Graph coloring register allocation via iterative scan koes@cmu.edu (dkoes) (2006-03-11)
| List of all articles for this month |

From: Vladimir Makarov <vmakarov@redhat.com>
Newsgroups: comp.theory,comp.compilers,comp.programming
Date: 14 Feb 2006 10:23:13 -0500
Organization: Red Hat, Inc.
References: 06-02-01806-02-051 06-02-054 06-02-073
Keywords: registers
Posted-Date: 14 Feb 2006 10:23:13 EST

Whywhat wrote:
> Do you know some scientific researches on this comparison? Or there's
> only a common opinion?




That is a common opinion what I heard from specialists working in this
area. You can also find such opinion (about bin packing approach and
approach based on colouring conflict graph) in Bob Morgan's book
"Building optimizing compiler". As I understand iterative scan is
just improvement of linear scan to deal with irregular file
architectures (like x86). Comparison of linear scan and George Appel
approach can be find in original article about linear scan register
allocator.


IMHO all this comparison should be taken with a grain of salt. For
example, few years ago I read articles where one method
(Callahan-Koblenz approach) was shown better than Briggs approach, and
another article where it was shown as worse. About ten years ago, I
read very educated discussion with Briggs participation what is better
his approach or priority based colouring (I guess you can find in
comp.compilers archive). Personally I still don't know what is
better. The priority based colouring and Briggs optimistic colouring
can colour successfully classical diamond graph with two colours.
Briggs biased register assigning can be easily implemented in the
priority based colouring too (it could be a small project for GCC --
its global register allocator can not have biased colouring).


As I wrote there are a few reasons for comparison difficulty
(execution, compiler environment, target processor specifics).
Execution means how good and accurate the implementation of the
register was (a good implementation needs a lot of time which
researchers don't have in many cases).


Compiler environment means how aggressive optimizations in the
compiler itself (e.g. aggressive inlining creates more functions with
complex CFG and in this case graph colouring works better), does the
compiler have preliminary passes (like register pressure relief an
others mentioned in Bob Morgan's book).


Target processor is also important. Some transformations are easy to
implement following one approach. E.g. modern x86 processors are OOO
processors and implementation of chameleon intervals (the important
part of FAT algorithm which was missed in Bob Morgan's book) is very
important for them because x86 has irregular register file and has
xchg (swap registers) which is very fast in Intel processor (and a bit
slower in AMD ones). Swap is practically free because OOO processor
has register renaming hidden in hardware. Imho the chameleon
intervals is easier to implement in bin packing approach. For other
processors which have no swap insn (and you need 3 xor insns to
implement swap), implementation of the chameleon intervals is less
important. Irregular register files creates also a big problem for
the register allocator and this problem can be solved easier in one
approach than in another.


Because the comparison of linear-scan and George's graph colouring in
the original article about linear-scan algorithm has shown that it is
worse that George's approach, I believe the article. It corresponds
my belief that a good register allocator (for modern state-of-the art
compilers) should have some form of graph colouring. Bin packing
worked good for DEC compiler (which was not so aggressive) and Alpha
as a target processor had big regular register files.


I think a perspective approach is to combine solution different
allocation tasks (like assigning, splitting, coalescing, and
rematerialization) no to solve them as as separate tasks in some order
with iterations. Colouring based on graph fusion is one of them (as I
understand it is used in Intel compiler). As a research work, it
would be interesting to use some metaheuristic (like taboo search) to
get a good combinations of small register allocation transformation
(assigning to one pseudo-register, coalescing pair of pseudos,
splitting one pseudo live range etc) because it will work faster than
usage ILP or PQBQ solvers (but still not fast enough for a commercial
compiler).


It is also interesting too try other colouring algorithms. I remember
early articles of Ershow and Starkman (unfortunately that was
published on russian only in 50s and early 60s) about interesting
algorithms of graph colouring different from Chaintin's approach. The
algorithms were based on graph node cluing for memory cells economy
(memory was precious that time).


If you want write a good register allocator, you should try many
approaches and a good infrastructure (framework) is the most important
part of this.


And last thing, register allocation area has probably more patents
than any other compiler optimization area. Linear scan (as many graph
coloring algorithms, and bin packing) is patented. So one should
think about commercial usage of LLVM whose register allocator is based
on linear scan and imho far way from the state-of-the art register
allocator.


> Where can I find these test sources with complicated CFG? Didn't manage
> to find them on SPECInt2000 :( May be, went to the wrong page.




You should have SPEC2000 testsuite for this which is not free. If you
have no access to the SPEC, you could look at perl regexec function
(half time of perlbmk test from SPEC2000 is spent there). It is a
regular expression matcher. The function has a big switch and a lot
of its cases contain loops. There are lot of long and short live
variables (and PRE creates much more pseudos).


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.