Re: Best "simple" C Compiler I've ever seen

federation2005@netzero.com
Mon, 30 Jun 2014 12:42:04 -0700 (PDT)

          From comp.compilers


From: federation2005@netzero.com
Newsgroups: comp.compilers
Date: Mon, 30 Jun 2014 12:42:04 -0700 (PDT)
Organization: Compilers Central
References: 14-05-013
Injection-Date: Mon, 30 Jun 2014 19:42:05 +0000
Keywords: C, standards, library
Posted-Date: 30 Jun 2014 23:22:35 EDT

On Sunday, May 4, 2014 8:23:05 PM UTC-5, andrewc...@gmail.com wrote:
> https://github.com/rui314/8cc
> Self hosting, extremely well written, and seems geared towards simple
> but complete code rather than being a complicated monolith. In my
> opinion this compiler hasn't gotten enough attention so I'm posting it
> here for people who haven't seen it.


Simple indeed it is. However, one of the issues I've always had with HLLs --
until recently -- was the neglect of (by far) the most important part of a
compiler: how it defines the enclosing environment, i.e. the "run-time"
system.


In particular, by "neglect" I mean neglect of the pressing need in many
applications to build in concurrency at the language level. In other words,
what's been missing is a run-time system that is multi-threaded at the core.


With the advent of the latest ISO standards (C11 and C++11), a lot of this has
begun to be addressed.


This is something I've been going through for a while now. The last open draft
available, in the case of C, I think is n1570, as indicated in the 8cc link
you provided. For C++ I think n3242 is the last open draft. With the way they
were written, I ended up reformatting and cleaning up my copy (of the earlier
version n1516 however) -- 22 pages for the first 5 parts alone (out of 7) and
225 pages or so for parts 1-12 of the C++ standard. I wish I could post the
cleaner version; it's easier to read and understand.


But anyway: the change is not a minor tweak in an evolving language, but an
entire paradigm shift. In particular, Section 5.1.2.4 of the C standard
("Multi-threaded executions and data races") is ground zero of this shift.
The same wording was also used in the revised C++ standard, in section 1.10.


So, here is the challenge for anyone wishing to keep that same simplicity and
perhaps even to experiment with the compiler you cited (because it's already
so simple to begin with and thus easy to work on): build a new run-time system
that is sufficiently powerful to handle the needs of contemporary CPUs and to
embody the latest ISO standard.


A long time ago, in the context of embedded systems, I devised a very simple
run-time framework I called "The World's Smallest Multitasking Kernel" (WSMK).
The source of the name is that it comprises just 4 system calls: a "Pause"
routine to get a thread to wait on an event, a "Resume" to resume whoever is
waiting on the event, a "Spawn" to create a new thread and "Exit" to commit
suicide.


The simplicity in its conception lay in four key places:


(1) Exit was put on the bottom of the stack of a newly spawned thread, so that
a thread returning from its top-level routine falls straight into Exit.


(2) No task control blocks: everything is put on the stack when pausing (not
an essential feature of WSMK, though).


(3) Integration with interrupts and exceptions: all of them are encapsulated
within the run-time system so that each produces a "resume" on its own event.
Thus, an application would never need to write any interrupt handlers or
callbacks, but could write straight-line threads.


(4) The ability to nonetheless go underneath the run-time system with direct
implementations of interrupt handlers.
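

To make the four calls concrete, here is a minimal sketch of a cooperative
kernel in that spirit. It is not the original WSMK source: it assumes a
Linux/glibc system and uses the <ucontext.h> routines, and it does keep a small
per-thread record (so it does not illustrate point (2)). The Run scheduler, the
entry trampoline, the event numbers and the producer/consumer demo are all
hypothetical names chosen for illustration. The trampoline stands in for
"Exit on the bottom of the stack": a thread that returns simply lands in Exit.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

enum { MAX_THREADS = 8, STACK_SIZE = 64 * 1024, NO_EVENT = -1 };

typedef struct {
    ucontext_t ctx;      /* saved registers and stack pointer */
    char *stack;         /* heap block used as this thread's stack */
    void (*fn)(void);    /* top-level routine of the thread */
    int waiting_on;      /* event id, or NO_EVENT when runnable */
    int alive;
} Thread;

static Thread threads[MAX_THREADS];
static ucontext_t scheduler;   /* context of the Run loop */
static int current = -1;       /* index of the running thread */

/* Exit: end the calling thread and return to the scheduler for good. */
void Exit(void) {
    assert(current >= 0);
    threads[current].alive = 0;
    setcontext(&scheduler);
}

/* Every thread starts here, so returning from fn falls into Exit. */
static void entry(void) {
    threads[current].fn();
    Exit();
}

/* Spawn: create a runnable thread; returns its index, or -1 on failure. */
int Spawn(void (*fn)(void)) {
    for (int i = 0; i < MAX_THREADS; i++) {
        Thread *t = &threads[i];
        if (t->alive) continue;
        if (getcontext(&t->ctx) != 0) return -1;
        if ((t->stack = malloc(STACK_SIZE)) == NULL) return -1;
        t->ctx.uc_stack.ss_sp = t->stack;
        t->ctx.uc_stack.ss_size = STACK_SIZE;
        t->ctx.uc_link = &scheduler;
        makecontext(&t->ctx, entry, 0);
        t->fn = fn;
        t->waiting_on = NO_EVENT;
        t->alive = 1;
        return i;
    }
    return -1;
}

/* Pause: block the calling thread until someone Resumes the event. */
void Pause(int event) {
    assert(current >= 0);
    threads[current].waiting_on = event;
    swapcontext(&threads[current].ctx, &scheduler);
}

/* Resume: make every thread waiting on the event runnable again. */
void Resume(int event) {
    for (int i = 0; i < MAX_THREADS; i++)
        if (threads[i].alive && threads[i].waiting_on == event)
            threads[i].waiting_on = NO_EVENT;
}

/* Run: round-robin over runnable threads until none is left. */
void Run(void) {
    for (;;) {
        int next = -1;
        for (int k = 1; k <= MAX_THREADS && next < 0; k++) {
            int i = (current + k + MAX_THREADS) % MAX_THREADS;
            if (threads[i].alive && threads[i].waiting_on == NO_EVENT) next = i;
        }
        if (next < 0) break;
        current = next;
        swapcontext(&scheduler, &threads[next].ctx);
        if (!threads[next].alive) free(threads[next].stack);
    }
    current = -1;
}

/* Demo: two straight-line threads exchange values through a mailbox. */
enum { EV_DATA = 1, EV_SPACE = 2 };
static int mailbox, has_data, received_sum;

static void producer(void) {
    for (int i = 1; i <= 3; i++) {
        while (has_data) Pause(EV_SPACE);
        mailbox = i; has_data = 1;
        Resume(EV_DATA);
    }
}

static void consumer(void) {
    for (int n = 0; n < 3; n++) {
        while (!has_data) Pause(EV_DATA);
        received_sum += mailbox; has_data = 0;
        Resume(EV_SPACE);
    }
}

int demo(void) {
    Spawn(producer);
    Spawn(consumer);
    Run();
    return received_sum;
}
```

Calling demo() runs both threads to completion and returns the sum of the
values the consumer received. In a real kernel along the lines of point (3),
an interrupt would do nothing more than call Resume on its own event; that part
is left out here because it is hardware-specific.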


The pressing question was always this: could something this simple actually
work for handling more complex multi-threaded applications? As it turns out,
what's described here is essentially equivalent to what's known as the "Pi
Calculus" -- something I learned of only a few months ago. The Pi Calculus is
complete. (That means anything any concurrency formalism is able to express
can be expressed within the framework of the Pi Calculus.)


But what's missing is the question of how to define scoping, linkage,
persistence, etc. for variables; and more importantly, a *standard* that
mandates how these things are to be treated. That's what the latest ISO
standards have started to address.


Also missing is a question of what language constructs are to be atomic, where
and how basic processes (e.g. printf) could be interspersed with other
threads' actions, etc. That too has been partially resolved with the somewhat
detailed concurrency formalism spelled out in the above-mentioned section.
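

As a small illustration of what the standard now pins down, the sketch below
uses only C11 facilities: _Thread_local for a variable with one copy per
thread, an atomic object for a variable shared by all threads, and <threads.h>
to create the threads. It assumes a C library that actually ships <threads.h>
(the header is optional in C11); the names worker, run_workers, shared_total
and private_count are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <threads.h>

enum { WORKERS = 4, ITERATIONS = 10000 };

/* Static storage duration: one object shared by every thread. */
static atomic_long shared_total;

/* Thread storage duration: each thread gets its own private copy. */
static _Thread_local long private_count;

static int worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        private_count++;                     /* no race: private to this thread */
        atomic_fetch_add(&shared_total, 1);  /* atomic read-modify-write */
    }
    return (int)private_count;               /* this thread's own tally */
}

/* Returns the shared total, or -1 if a thread failed or miscounted. */
long run_workers(void) {
    thrd_t ids[WORKERS];
    int started = 0, ok = 1;

    atomic_store(&shared_total, 0);
    while (started < WORKERS &&
           thrd_create(&ids[started], worker, NULL) == thrd_success)
        started++;

    for (int i = 0; i < started; i++) {
        int counted = 0;
        if (thrd_join(ids[i], &counted) != thrd_success || counted != ITERATIONS)
            ok = 0;
    }

    /* The calling thread's copy was never touched by the workers. */
    assert(private_count == 0);
    return (ok && started == WORKERS) ? atomic_load(&shared_total) : -1;
}
```

With a plain long in place of atomic_long the increments would be a data race
in the sense of section 5.1.2.4, and the behavior would be undefined.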


So, this gets us back to the central point: the most important part of the
compiler is, and has always been, how everything is to be hung together and
orchestrated, i.e. the run-time system.


I think that much of this can be simplified yet further by replacing the
traditional call-return semantics by continuation semantics. And so, what I'm
seriously considering doing is devising such a base-level formalism that has
the multi-threaded run-time system, along with compilation that is entirely
based on continuation semantics (i.e. no run-time stacks or frames per se.)
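

For readers who haven't seen the idea, here is a rough sketch of continuation
semantics written by hand in C; it is only an illustration, not the formalism
I have in mind. No function ever returns a result to its caller. Instead, each
step names the continuation that should receive its value, and a driver loop
(a "trampoline") keeps invoking the next one. The pending work lives in
heap-allocated records rather than in stack frames. The names Cont, Step, fact,
multiply and cps_factorial are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Cont Cont;

/* A step says: "pass value to continuation k" (k == NULL means halt). */
typedef struct { Cont *k; long value; } Step;

/* A continuation: code to run, one captured value, and what comes after. */
struct Cont {
    Step (*code)(Cont *self, long value);
    long n;
    Cont *next;
};

static Cont *make_cont(Step (*code)(Cont *, long), long n, Cont *next) {
    Cont *c = malloc(sizeof *c);
    if (c == NULL) abort();
    c->code = code; c->n = n; c->next = next;
    return c;
}

/* The continuation "given v, pass n * v on to next". */
static Step multiply(Cont *self, long value) {
    Step s = { self->next, self->n * value };
    free(self);
    return s;
}

/* Factorial in continuation style: fact(n, k) = n <= 1 ? k(1)
   : fact(n - 1, v -> k(n * v)). Here self->next plays the role of k. */
static Step fact(Cont *self, long n) {
    if (n <= 1) {
        Step s = { self->next, 1 };
        free(self);
        return s;
    }
    self->next = make_cont(multiply, n, self->next);
    return (Step){ self, n - 1 };   /* re-enter fact with n - 1 */
}

/* Trampoline: run steps until the halt continuation (NULL) is reached. */
long cps_factorial(long n) {
    assert(n >= 0);
    Step s = { make_cont(fact, 0, NULL), n };
    while (s.k != NULL)
        s = s.k->code(s.k, s.value);
    return s.value;
}
```

The C stack never grows beyond the driver loop plus one step, whatever n is.
A compiler built on this model would generate such records itself, and a
thread switch becomes nothing more than choosing which pending step to run
next.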


Interestingly, in the process of reviewing different software for numeric
processing (much originally in Fortran) I notice that one library -- AlgLib --
appears to have put in its own multi-threaded runtime system ... but one that
only looks like the next to last stage in the evolution that WSMK underwent in
1990-1991 before it reached its final form. The difference is that AlgLib's
runtime system runs on Windows and Linux, whereas WSMK never (yet) went beyond
DOS.

