Re: Chicken or the Egg?
10 Oct 2000 00:57:42 -0400

          From comp.compilers

Related articles
Re: Chicken or the Egg? (2000-10-10)
Re: variable length instructions, Chicken or the Egg? (Joachim Durchholz) (2000-10-12)
Re: variable length instructions, was Chicken or the Egg? debray@CS.Arizona.EDU (2000-10-12)
Re: variable length instructions, Chicken or the Egg? (Randall Hyde) (2000-10-15)
Re: variable length instructions, Chicken or the Egg? (2000-10-18)
Re: variable length instructions, Chicken or the Egg? (Walter Banks) (2000-10-19)
| List of all articles for this month |

Newsgroups: comp.compilers
Date: 10 Oct 2000 00:57:42 -0400
Organization: Compilers Central
Keywords: linker, comment

In article <> you write:
> Now this would be really simple however some instructions can be either
> 2 words (64 bits) OR 1 word (32 bits). What makes this the "chicken or the
> egg" type of problem is that the opcode size changes depending on the size
> of the jmp.
> For example in the above instructions, jmp label1 would only be
> 1 word long, but if label1 was further than 4095 words, jmp label1
> becomes 2 words and would then become jmp -41.

So you have instructions whose size depends on their operands.
A processor that utilised this extremely is the inmos transputer, so
that may be a valuable hint for you. I have no idea how commercial
compilers worked, but I wrote my own assembler/linker for transputers
and followed the basic algorithm from Inmos' "Compiler Writer's Guide".

Instead of a two pass linker, I needed an n-pass linker. It starts out
by assuming that all instructions could be coded with the minimal size, so
after the first pass, you have an optimistic guess. The following passes
enlarge instruction sizes, if needed. Repeat until no instruction needs
to be enlarged. It may be necessary to encode an jump/call instruction in
a larger size than its operand needs, because its own size may influence
the operand:

o If you encode it using a short format, the operand may enforce to
      use a long format.

o If you encode it in the long format, the operand will change so that
      a short format would suffice.

That's you must never shrink the instruction size and go with a long format,
once you determined that a short one would not suffice.

While this sounds horrible, the most I ever saw were around 16 passes for
real programs, like a statically linked version of lcc under Minix.

[Generating instructions with variable length addresses is a well understood
problem. The PDP-11 Unix assembler did it in about 1973, and there was
a paper by Tom Szymanski in the CACM about that time. I always preferred
the other approach, start with everything long, then shorten until there's
nothing left to shorten, which has the advantage of being fail-safe, all
of the intermediate stages being valid code. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.