Re: Compiler bootstrapping and the standard header files

Christopher F Clark <>
Fri, 20 Mar 2020 06:21:56 -0400

          From comp.compilers


From: Christopher F Clark <>
Newsgroups: comp.compilers
Date: Fri, 20 Mar 2020 06:21:56 -0400
Organization: Compilers Central
References: 20-03-018 20-03-019
Keywords: practice
Posted-Date: 20 Mar 2020 11:45:40 EDT

Dodi gets the answer basically right. I'm going to say something
similar in slightly different words. The good news is that you are
doing this for C, which was designed to be a relatively simple language
to port, although you may want to use a restricted dialect of C to
make it simpler still. The simpler the dialect, the less runtime
library you need (to get the bootstrap working).

First, there are 3 interacting parts. They are all interconnected,
but still separate:

1. The compiler itself
2. The header files
3. The supporting runtime library

There are also two bits of terminology you need to learn from
cross-compiling: host and target.

So, the machine you are compiling on and the compiler you are
compiling with are considered the host. The machine that the program
will run on and the runtime library that supports it are considered the
target. There are diagrams that illustrate this. They are called
T-diagrams. Here is an ASCII rendition (excuse my drawing skills).

+-------------------------+
| host   headers   target |
+-----+             +-----+-------------------+
      |  compiler   | host   headers   target |
      +-------------+-----+             +-----+
                          |  compiler   |
                          +-------------+

Here the target in the first T is the host in the second T.
Everything else can be different. You can nest this diagram as many
times as you like. The typical bootstrapping process nests 3 Ts. I
will explain why later.
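
As a rough sketch of the chaining rule, a T-diagram can be modelled as
a small record, with a check that the target of one step is the host of
the next. The class, the function names, and the environment labels
("vendor-cc", "linux-x86", and so on) are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TDiagram:
    """One compilation step: a compiler running on `host`, emitting code for `target`."""
    compiler: str  # label for the compiler doing the work (hypothetical)
    host: str      # environment the compiler runs in
    headers: str   # header set; must match both the compiler and the target runtime
    target: str    # environment the generated program runs in


def can_chain(first: TDiagram, second: TDiagram) -> bool:
    """The program produced by `first` can act as the compiler in `second`
    only if it runs in the environment `second` uses as its host."""
    return first.target == second.host


def is_cross(step: TDiagram) -> bool:
    """A step is a cross-compilation when host and target differ."""
    return step.host != step.target


# Three nested Ts on a single machine (all labels are hypothetical).
stage1 = TDiagram("vendor-cc", "linux-x86", "vendor headers", "linux-x86")
stage2 = TDiagram("new-cc built by vendor-cc", "linux-x86", "new-cc headers", "linux-x86")
stage3 = TDiagram("new-cc built by itself", "linux-x86", "new-cc headers", "linux-x86")
```

Here `can_chain(stage1, stage2)` and `can_chain(stage2, stage3)` both
hold, and none of the three steps is a cross-compilation.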

From that diagram you can see that the headers must match both the
host (compiler) and the target (runtime).

Let's now illustrate that with a couple of different scenarios.

The first, simplest scenario is that you want to run the resulting
program in the same environment (same target machine, same target
runtime library) as the host environment. This is the way you
bootstrap a new version of the compiler using the same runtime
library. This "new version" might be the new compiler you are
building from scratch.

So, you take the program you want to compile (this will be the new
version of the compiler) and plug it into the host box of the first
T. The host compiler takes this program and the header files which
match that compiler and target runtime library and compiles it to a
target program that uses the target runtime library. You now have a
new executable program (after linking) that you can run on the target
machine. This new executable program just happens to be your new
[version of the] compiler. So, now you can take the source code of
the program again and compile it with the new compiler (using the
headers which match that new compiler and target runtime library).
If you repeat this process twice (that is, 3 T boxes), the code
generated by the last two steps should be the same. There may be
timestamps or similar artifacts that differ, which you have to filter
out, but otherwise any differences are bugs in the compiler.
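
The comparison step can be sketched roughly as follows, assuming you
already know which byte ranges of the output hold timestamps or similar
artifacts. The function names and the idea of passing explicit ranges
are assumptions for illustration; real object formats need their own
way of locating such fields.

```python
from typing import Iterable, Tuple


def mask(data: bytes, ignore: Iterable[Tuple[int, int]]) -> bytes:
    """Zero out the half-open byte ranges in `ignore` (e.g. embedded timestamps)."""
    buf = bytearray(data)
    for start, end in ignore:
        # Replace the slice with the same number of zero bytes.
        buf[start:end] = bytes(len(buf[start:end]))
    return bytes(buf)


def outputs_match(earlier: bytes, later: bytes,
                  ignore: Iterable[Tuple[int, int]]) -> bool:
    """True when two compiler outputs are identical outside the ignored ranges."""
    ranges = list(ignore)
    return mask(earlier, ranges) == mask(later, ranges)


# Two made-up "binaries" that differ only in a 10-byte date at offset 4.
build_a = b"CODE" + b"2020-03-20" + b"MORE"
build_b = b"CODE" + b"2020-03-21" + b"MORE"
same_after_filtering = outputs_match(build_a, build_b, [(4, 14)])
same_unfiltered = outputs_match(build_a, build_b, [])
```

With the date range masked the two builds compare equal; without the
mask they do not, and a difference anywhere else would point at a bug.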

The scenario gets a bit more complicated if you are building a
cross-compiler (targeting a different machine than the host machine,
or even just a different (and incompatible) runtime library on the
host machine). In that case, your first host and target are the same
machine, but your second target is a different machine (or a different
runtime library). You may or may not be able to build a native (host)
compiler on that second machine. It is quite common for embedded
machines to lack the facilities you need (e.g. file systems) to
run a compiler on them. You don't need a compiler to run on the chip
that runs your car engine or toaster. You just need a compiler that
can generate code for that chip. However, if you are building a
native compiler for that chip, then you need the 3-step T diagram.

Hopefully, you can figure out from this that:

1. When compiling your compiler with some other compiler, you use the
   header files from that compiler (the ones that go with its runtime
   library). You will note that cross-compilers (e.g. compilers that
   run on an x86 but compile code for an ARM machine) may use
   different header files than the compiler from the same vendor that
   targets the host machine. The header files must match both the
   compiler and the target runtime, and target runtimes for different
   machines (even for the "same" compiler) can differ due to linker
   and OS dependencies.

2. When compiling your compiler with your own compiler, you must use
   the header files for your compiler. You may even have two different
   copies, if you are developing your own runtime library: one that
   matches the original compiler's runtime library, so you can use
   that, and one that matches your own runtime library, so you have
   something to
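
One way to picture the rule that headers must match both the compiler
and the target runtime is a lookup keyed on the pair. This is only a
sketch: the compiler names, runtime names, and directory paths are all
hypothetical.

```python
# Hypothetical header directories, keyed by (compiler, target runtime library).
HEADER_SETS = {
    ("vendor-cc", "vendor-libc"): "/opt/vendor-cc/include",
    # Two copies for your own compiler: one per runtime library it may link against.
    ("my-cc", "vendor-libc"): "/opt/my-cc/include-vendor-libc",
    ("my-cc", "my-libc"): "/opt/my-cc/include-my-libc",
}


def headers_for(compiler: str, runtime: str) -> str:
    """Return the header directory matching BOTH the compiler and the runtime."""
    try:
        return HEADER_SETS[(compiler, runtime)]
    except KeyError:
        raise LookupError(
            f"no header set matches compiler {compiler!r} with runtime {runtime!r}"
        ) from None
```

For instance, `headers_for("my-cc", "vendor-libc")` and
`headers_for("my-cc", "my-libc")` return different directories, while
a pair with no matching headers, such as `("vendor-cc", "my-libc")`,
is rejected rather than silently mixed.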


Finally, I am going to illustrate this process with one of the first
compilers I worked on. In 1978 I worked at Softech and we had a
contract to build a cross-compiler from Multics to the Interdata 8/32
for the Jovial language (a new dialect called J73/C). We wanted our
compiler to be written in Jovial, but there was no Jovial compiler on
the Honeywell Multics machines. So, Carl Martin, my mentor at that
time, wrote a translator (effectively a macro package) that translated
a subset of Jovial into PL/I, with PL/I semantics, so you could only
use the subset of Jovial whose semantics aligned with those of PL/I.
But that was ok, because you don't [shouldn't] need a lot of
sophisticated semantics to write a compiler.

So, then we wrote our first compiler in that subset. We then ran it
through the translator (1st T diagram) and got out an equivalent PL/I
program which we could compile with the Multics PL/I compiler. Then
we ran that compiler through the Multics PL/I compiler (2nd T diagram)
and got out a native Multics executable. Now, we had a program that
you could run on the Multics machine that would compile Jovial (and
output Interdata 8/32 code--3rd T diagram). We also had a version
that generated code for the Multics machine. I don't know if we ever
built a native Jovial compiler on the Interdata machine (that would
have been a 4th T diagram). In theory we could have, but the compiler
was targeting embedded applications, so I don't know what OS support
there was.

Chris Clark email:
Compiler Resources, Inc. Web Site:
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
