|Vax span-dependent jumps email@example.com (1990-12-23)|
|Re: Vax span-dependent jumps firstname.lastname@example.org (1990-12-30)|
|Re: Vax span-dependent jumps email@example.com (1991-01-02)|
|Re: Vax span-dependent jumps firstname.lastname@example.org (1991-01-03)|
|From:||email@example.com (Henry Spencer)|
|Keywords:||optimize, code, assembler|
|Organization:||U of Toronto Zoology|
|References:||<9012111913.AA07310@cs-sun-fsa.cpsc.ucalgary.ca> <firstname.lastname@example.org.UUCP> <email@example.com>|
|Date:||Sun, 30 Dec 90 00:55:07 GMT|
In article <firstname.lastname@example.org> email@example.com (Chris Torek) writes:
>a special optimization, the assembler will sometimes% use something called
Properly known as branch chaining; it's been around since at least the
Bliss-11 compiler. It wins on code size but typically loses on speed,
especially on modern machines where extra pipeline breaks really hurt.
Our moderator writes:
>[... I wrote the original AIX assembler for the RT PC, another machine with
>span-dependent jumps, by mutating the System III Vax assembler. The code to
>do span-dependent jumps was pretty straightforward, in pass 1 it assumed
>they'd all be the long form and made a table of all of them, then between
>passes 1 and 2 iterated over that table looking for branches that could be
>converted to the short form...
General comment: this is theoretically the wrong approach, since there are
sequences where the decisions interact so that branch X can be short only
if branch Y is also short, and a one-at-a-time algorithm that starts with
all of them long won't shorten them.
In practice, almost all branches are short and *any* reasonable algorithm
will work well enough. Things like iterative scanning of tables are
overkill; a simple FIFO buffer in the output stage, with branches made
long if if they are pushed out the end before their target is known,
will get almost all of them.
Henry Spencer at U of Toronto Zoology, firstname.lastname@example.org utzoo!henry
[It's true, the scheme that I used in the AIX assembler won't produce optimal
results, but since the code already was there and worked, I went with the
flow. On the other hand, perfect branch optimization is NP complete so
barring major positive news in the P=NP department, any perfect method is
very slow. A paper by Tom Szymanski in the CACM ten or so years ago tells
more than you'd ever want to know about the topic, I can dig up the reference
if there's interest. -John]
Return to the
Search the comp.compilers archives again.