Re: object code vs. assembler code

Newsgroups: comp.compilers
From: henry@zoo.toronto.edu (Henry Spencer)
Keywords: assembler
Organization: U of Toronto Zoology
References: 93-02-105
Date: Sun, 21 Feb 1993 00:35:18 GMT

John_Burton@gec-epl.co.uk writes:
>Is there any good reason why many compilers produce assembly language?
>This seems to be a big performance hit with no gain. I can't
>see that producing object code is *that* much harder.


Depends on what machine you're on. Back when the 68000 was new, a friend
of mine -- in the thick of 68k compiler work at the time -- said that his
experience was that it was harder to write the assembler than the compiler
code generator. (The 68000 instruction encoding is a bit ugly.)


The advantages of producing assembler output are basically:


1. It's less work for the compiler writer. Emitting instructions is
typically no big deal -- writing `emitw(010213)' is not much
harder than writing `emit("mov r2, *r3")' -- but getting the
symbol tables right can be complicated. (A short sketch of the
two styles of emission follows this list.)


2. ASCII output makes debugging simpler. You need *some* way to read
the final output...


3. All the knowledge of how to generate actual object modules can be
centralized in one place. This is a non-trivial issue if the
format is complex or if it ever has to change, and it means
that bug fixes can be applied in one and only one place.


4. The assembler may perform non-trivial optimizations, e.g. filling
delay slots, that strengthen point #3. Not uncommon nowadays.
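

As an illustration of point #1, here is a rough sketch of the two styles
of instruction emission. It is my own toy example, not taken from any real
compiler, and the buffer and surrounding scaffolding are invented; only the
names emitw() and emit() come from the text above. The direct route writes
encoded words (and, in real life, symbol and relocation records) into an
object-code buffer, while the text route just prints mnemonics and leaves
all of that bookkeeping to the assembler.

    /*
     * Illustrative only: a toy back end that can emit the same
     * instruction either as a raw object-code word or as assembler text.
     * 010213 is the PDP-11 encoding of "mov r2, *r3" mentioned above.
     */
    #include <stdio.h>
    #include <stdint.h>

    static uint16_t objbuf[1024];   /* pretend object-code buffer */
    static int objlen;

    /* Direct object output: the compiler must know the encoding, and a
     * real version would also have to write symbol and relocation
     * records alongside the instruction words. */
    static void emitw(uint16_t word)
    {
        objbuf[objlen++] = word;
    }

    /* Assembler-text output: just print the mnemonic; the assembler
     * worries about encodings, symbols, and relocation. */
    static void emit(const char *insn)
    {
        printf("\t%s\n", insn);
    }

    int main(void)
    {
        emitw(010213);          /* mov r2, *r3, already encoded (octal) */
        emit("mov r2, *r3");    /* same instruction, left to the assembler */
        fprintf(stderr, "emitted %d object word(s)\n", objlen);
        return 0;
    }

The point, of course, is that emitw() itself is the easy part; the
symbol-table and relocation machinery it implies is where the extra work
goes.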


The main disadvantage is that an incompetently implemented assembler can
make the overall process seriously slower. And even with a good
implementation, you necessarily take some efficiency hit in having the
assembler rediscover things like symbol-table contents that the compiler
already knows.


An interesting angle on this issue is what Plan Nine did (take a look at
research.att.com/dist/plan9doc/8.Z [it's PostScript] for details). The
compiler's output is essentially what you'd get out of the first pass of
an assembler -- a binary form of assembler. (There is an assembler, a
small program that turns a suitable text form into the binary form.) The
loader incorporates most of what's normally in the second pass of an
assembler, plus the usual linker functions, plus assorted optimizations.
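

To give a rough idea of what such a "binary form of assembler" might look
like, here is a sketch of one possible record layout. This is purely my
own illustration -- it is not the actual Plan Nine object format, whose
details are in the paper cited above -- but it shows the general shape:
one compact record per instruction, with operands still symbolic, so that
the loader can pick the final encodings and resolve symbols itself.

    /*
     * NOT the real Plan Nine format -- just an invented example of what
     * a "binary assembly" record might contain.  The compiler would
     * write a stream of these; the loader would do the assembler's
     * second pass (final instruction encoding, symbol resolution,
     * layout) plus the usual linking and some optimization.
     */
    #include <stdint.h>

    enum opcode { OP_MOV, OP_ADD, OP_JMP, OP_CALL, OP_RET };

    struct operand {
        enum { O_NONE, O_REG, O_CONST, O_SYM } kind;
        int      reg;       /* register number, for O_REG */
        long     value;     /* literal value, for O_CONST */
        uint32_t sym;       /* symbol-table index, for O_SYM */
    };

    struct binasm {                 /* one instruction, pre-encoding */
        uint8_t        op;          /* abstract opcode, not machine bits */
        uint32_t       line;        /* source line, for diagnostics */
        struct operand from, to;
    };

Nothing in such a record is a machine encoding yet; that, and all the
address arithmetic, is the loader's job.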


The paper does mention one unhappiness with this setup: compilation of
multiple source files is trivially parallelizable, but the final loading
step isn't, so there is interest in moving some of the loader's work back
into the compiler. This would presumably result in something closer to
the traditional model.


(One note of caution: as was recently discussed here, the paper's
code-efficiency comparison to GCC is suspect because it probably used an
old and poorly-tuned GCC.)
--
Henry Spencer @ U of Toronto Zoology | henry@zoo.toronto.edu utzoo!henry