|[13 earlier articles]|
|Re: Optimizing in assembly language firstname.lastname@example.org (Tom Payne) (2001-03-22)|
|Re: Optimizing in assembly language Eric.Boon@ICT.nl (Eric Boon) (2001-03-22)|
|Re: Optimizing in assembly language email@example.com (Barry Watson) (2001-03-26)|
|Re: Optimizing in assembly language Martin.Ward@durham.ac.uk (2001-03-26)|
|Re: Optimizing in assembly language firstname.lastname@example.org (Joachim Durchholz) (2001-03-26)|
|Re: Optimizing in assembly language email@example.com (Shankar Unni) (2001-03-26)|
|Re: Optimizing in assembly language firstname.lastname@example.org (Randall Hyde) (2001-03-27)|
|From:||"Randall Hyde" <email@example.com>|
|Date:||27 Mar 2001 23:47:02 -0500|
|Organization:||Posted via Supernews, http://www.supernews.com|
|References:||01-03-006 01-03-085 01-03-092 01-03-107 01-03-124|
|Posted-Date:||27 Mar 2001 23:47:02 EST|
"Joachim Durchholz" <firstname.lastname@example.org> wrote
> Assembly is too powerful; there are too many ways to change the
> control flow in unorthodox manners. For example:
> The result of tail call optimization replaces a subroutine call with a
> jump. The disassembler has no way of determining that a jump
> instruction is really a subroutine call other than by simulating stack
> evolution, and that's damn hard, in particular if the original
> assembly was already optimized.
Not that it matters, but presumably the optimizer would be working
with source code, not disassembling machine instructions.
As for tail recursion, a good data flow analyzer could figure this out
in most cases. It's not "damn hard" to do at all. In those cases
where the DFA cannot determine the target address, it can still figure
out what's going on and revert to "safe" optimization mode.
> It's easy to change the return address by twiddling with stack contents.
Again, a DFA can figure this out. I did this kind of work over 15
years ago when I wrote a static code analyzer for 6502 code (which
makes considerable use of this trick since the indirect jump
instruction was weak).
> The output of an OO compiler will generate lots of jump tables. An
> assembly optimizer will have to establish that these tables are never
> overwritten to do any cross-subroutine optimizations. If the OO
> software places these tables in the heap (maybe because it's loading
> the code dynamically), and the heap is managed by a copying garbage
> collector, the disassembler will have a rather hard time determining
> that the addresses in the jump table will stay unchanged. (In fact any
> optimizer that sees the code of a moving garbage collector will have a
> hard time establishing that the GC doesn't change the observable
> system state, so this problem isn't limited to determining the flow
> graph but to optimization in general. The usual solution is to
> withhold the GC mechanism from the optimizer. Of course, this leaves
> the combination of GC and optimization prone to bugs - the optimizer
> may think that an address is fixed while it really isn't.)
(1) We are talking about optimizing assembly source code here,
not optimizing binary machine code. Therefore, worrying about
the output of a compiler isn't an issue.
(2) Of course, we *are* talking about assembly language here
and it's certainly possible to write in assembly anything that
can be done in a HLL. So you still have a valid point.
I'll address that point next.
> And I didn't even start to think about self-modifying code, or code
> that uses some writable data both for display and as a code address or
> machine instruction (well, the latter techniques are luckily not very
> common anymore, but you get the idea).
A common saying in Computer Science is "optimize for the most common
case." Very few people use self-modifying code (in various forms)
these days. As long as the optimizer can recognize such code and not
do anything stupid with it, that's great. It's not like HLL
optimizers are successful at optimizing all code either (it is easy to
trip up most optimizers, particularly if the language supports an
indirect jump, gotos, or in-line assembly).
I've heard a ton of complaints that "an assembly based optimizer can't
do a perfect job." or "an assembly-based optimizer will fail under
these circumstances." So what? If it produces better code most of
the time and less than optimal code some of the time, I'd be real
happy with it. I could even live with an assembly optimizer that
choked on some code once in a great while as long as the optimizer
could be turned off around the offending code (or, less desirable, for
the entire assembly process).
The truth is, 99% of the assembly code I've ever written wouldn't
present any special problems to a traditional optimizer. Perhaps I'm
better than most (or worse than most, depending on how you want to
look at it), but I suspect that if an assembly based optimizer
provided an 80% solution, that would be great. If 19.9% of the time
it couldn't produce better code, I could live with that. If 0.09% of
the time it threw up its hands and said "I can't deal with this code."
then that would be fine. I could even live with it generating bad
code with no indication of error a tiny fraction of the time if there
was some way I could manually verify the output (e.g., emission of
assembly code rather than a binary object file).
Whether or not the resulting code is faster than the code emitted by C
is irrelevant. Lots of code is written in assembly for many different
reasons. Efficiency is only one of them. Most of the code I write in
assembly I don't take any special pains to make it fast. Clearly,
there are some very simple transformations that could double or triple
the speed of my code (e.g., better instruction scheduling). Tools
already exist to help you locate such problems with your code (e.g.,
Intel's VTune). The next step is to actually make the modifications
for you during the compile/assembly process. Even if the "assembly
optimizer" limited itself to doing the obvious and easy (or trivial)
things, I argue this would be a boon to those who use assembly.
Return to the
Search the comp.compilers archives again.