Related articles |
---|
[3 earlier articles] |
Re: Modern compilers for ye olde architectures derek@NOSPAM-knosof.co.uk (Derek Jones) (2021-10-06) |
Re: Modern compilers for ye olde architectures laguest@archeia.com (Luke A. Guest) (2021-10-06) |
Re: Modern compilers for ye olde architectures anton@mips.complang.tuwien.ac.at (2021-10-06) |
Re: Modern compilers for ye olde architectures theom+news@chiark.greenend.org.uk (Theo) (2021-10-06) |
Re: Modern compilers for ye olde architectures laguest@archeia.com (Luke A. Guest) (2021-10-06) |
Re: Modern compilers for ye olde architectures pkk@spth.de (Philipp Klaus Krause) (2021-10-06) |
Re: Modern compilers for ye olde architectures anton@mips.complang.tuwien.ac.at (2021-10-15) |
Re: Modern compilers for ye olde architectures pkk@spth.de (Philipp Klaus Krause) (2021-10-18) |
Re: Modern compilers for ye olde architectures pkk@spth.de (Philipp Klaus Krause) (2021-10-18) |
Re: Modern compilers for ye olde architectures pkk@spth.de (Philipp Klaus Krause) (2021-10-18) |
Re: Modern compilers for ye olde architectures gah4@u.washington.edu (gah4) (2021-10-21) |
Re: Modern compilers for ye olde architectures 480-992-1380@kylheku.com (Kaz Kylheku) (2021-10-22) |
Re: Modern compilers for ye olde architectures dave_thompson_2@comcast.net (2021-11-14) |
From: | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
Newsgroups: | comp.compilers |
Date: | Fri, 15 Oct 2021 07:37:30 GMT |
Organization: | Institut fuer Computersprachen, Technische Universitaet Wien |
References: | 21-10-007 21-10-012 21-10-015 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="63942"; mail-complaints-to="abuse@iecc.com" |
Keywords: | architecture, code |
Posted-Date: | 16 Oct 2021 13:45:08 EDT |
Philipp Klaus Krause <pkk@spth.de> writes:
>See "Optimal Register Allocation in Polynomial Time". A graph-coloring
>approach that can handle irregularities well (as long as there are not
>too many registers). SDCC uses such a register allocator for some
>backends, including z80.
Interesting paper. I had trouble following the theoretical part, but
looking at the practical part and taking the part that I understood
about the theory, you try out all possible assignments of all
registers to the maximum clique (live ranges that are alive at the
same time), and because the number of registers is a constant (for a
given architecture), this is a polynomial.
You discuss splitting live ranges at control-flow-graph boundaries,
but it seems to me that in some cases you want to split live ranges
between instructions within a control-flow node; this does not change
the complexity-theoretic result, but of course the implementation.
If only the maximum clique size plays a role, it's as if the
assignments to the cliques are independent; but then you have to
reconcile the assignments for different cliques by introducing reg-reg
move instructions, and take these costs into account, and optimize
them, too; so I don't see that the cliques can be treated as
independent. And you probably don't, because there are additional
factors in the complexity. In any case, I am missing something, and
maybe you have an idea what it is.
I am wondering about one thing in the empirical results in your paper:
Why is the code size not monotonously falling with increased numbers
of assignments? Are these independent runs with different
(pseudo-random) assignments?
I had some questions which were mostly answered by the paper, but
maybe you can offer additional insights:
* Am I right that earlier register allocators were bad for irregular
register sets, and that's why general-purpose registers won once
compilers became dominant? Why did general-purpose registers become
dominant?
The advantage of general-purpose registers is given in that paper as
reducing the complexity of the algorithm by a factor of:
(2*(tree-width(G)+1)*#registers)!
E.g., for tree-width 2 and 9 registers, it's 54! = 2.30843697*10^71
Which poses the question: In your empirical work you stopped at 10^8
assignments (in some cases, less). How did you get provably optimal
assignments on the Z80 with its 9 registers?
* What are the key points why your work can deal with irregular
register sets, and earlier approaches are pretty bad at that?
It seems to me that you use the CPU power available now to try out
many different assignments, while earlier work has balked at that.
* Do you have any idea why no good approach for dealing with irregular
instruction sets has been found in, say, the 1970s and 1980s when
irregular register sets were more common (e.g. on the Z80 and the
8086).
Your approach is an (ideally exhaustive) search that uses more CPU
power (and memory?) than was available then. At the time, one would
have resorted to heuristics, but apparently no general effective
heuristics have been found.
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
Return to the
comp.compilers page.
Search the
comp.compilers archives again.