Re: Compiler project needed

"Charles E. Bortle, Jr." <>
22 Feb 2000 09:31:36 -0500

          From comp.compilers


From: "Charles E. Bortle, Jr." <>
Newsgroups: comp.compilers
Date: 22 Feb 2000 09:31:36 -0500
Organization: MindSpring Enterprises
References: 00-02-112
Keywords: design

Hello Per,

This is a little long...please bear with me :-)

This is just my opinion (and I don't know if our moderator, John, would
agree :-) but to my thinking FORTRAN IV would be a good place to start
(or maybe, if the world is not sick unto death of the endless
variations we have now, a more traditional version of BASIC).

My reasoning is this:

FORTRAN IV (and the original BASIC) were/are line oriented, which
means that you don't have to deal with sentence structures that begin
on one line and end on another. The control structures are minimal
(and fairly simple). For all that, FORTRAN IV is a fairly rich
language, offering you a non-trivial implementation task. (FORTRAN IV
actually provides built-in handling of Complex numbers! Which,
incidentally, turned out to be fairly simple to implement :-)
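
To make the line-oriented point concrete, here is a minimal sketch of
the first thing a scanner for such a language might do with each source
line. It assumes the conventional fixed-form card layout (C in column 1
for a comment, columns 1-5 for a statement label, column 6 for the
continuation mark, columns 7-72 for the statement text). The names
Card and split_card are made up for illustration.

```python
from typing import NamedTuple, Optional


class Card(NamedTuple):
    """One fixed-form source line, split into its fields (hypothetical type)."""
    is_comment: bool
    label: Optional[int]
    is_continuation: bool
    statement: str


def split_card(line: str) -> Card:
    """Split one source line by column position; no tokenizing is done here."""
    # Pad short lines and drop anything past column 72.
    line = line.rstrip("\n").ljust(72)[:72]
    if line[0] in "Cc":
        return Card(True, None, False, line[1:].strip())
    label_text = line[0:5].strip()
    label = int(label_text) if label_text else None
    # Any character other than blank or zero in column 6 marks a continuation.
    is_continuation = line[5] not in " 0"
    return Card(False, label, is_continuation, line[6:72].rstrip())


if __name__ == "__main__":
    for text in ["C  SAMPLE", "   10 CONTINUE", "      X = 1.0", "     1   + Y"]:
        print(split_card(text))
```

Because every field sits at a fixed column, this step needs no lookahead
at all. The sketch does not validate the label field, so a non-numeric
label simply raises ValueError.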

FORTRAN IV does have a few tricky features vis-a-vis the compilation
task, but all the better in helping you learn. (The DO loop is the
only FORTRAN IV structure that is multi-line. The rule is that you
are not allowed to jump into one of these loops from outside the loop;
however, many compilers do not check this. One wrinkle of this rule
is that you can jump out from inside the sequence of statements of the
loop, then jump back into the loop. This is considered an extension
of the loop and is permitted, which makes for some fun with the
implementation of this rule, which mixes syntax and semantics ;-)
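
As a rough illustration of the basic check (not how any particular
compiler does it), the sketch below flags a GOTO whose target lies
inside a DO range while the GOTO itself lies outside it. The
representation is assumed for the example: statements are numbered by
position, each DO range is a (first_body, last_body) pair, and each
GOTO is a (source, target) pair. The function name illegal_jumps is
hypothetical. It deliberately ignores the extended-range wrinkle, so it
would wrongly reject a jump back in that follows a jump out.

```python
from typing import List, Tuple

Range = Tuple[int, int]   # (first body statement, last body statement), inclusive
Jump = Tuple[int, int]    # (statement holding the GOTO, target statement)


def illegal_jumps(do_ranges: List[Range], gotos: List[Jump]) -> List[Jump]:
    """Return the jumps that enter some DO range from outside that range."""
    bad = []
    for src, dst in gotos:
        for first, last in do_ranges:
            target_inside = first <= dst <= last
            source_inside = first <= src <= last
            if target_inside and not source_inside:
                bad.append((src, dst))
                break  # one offending range is enough to report this jump
    return bad


if __name__ == "__main__":
    ranges = [(3, 6)]
    jumps = [(1, 4), (5, 3), (4, 9)]
    print(illegal_jumps(ranges, jumps))
```

Handling the extended range would need flow information on top of this:
the checker has to know that the statements outside the loop are only
reachable by a jump out of that same loop.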

I wrote a FORTRAN IV compiler some years ago and learned a lot, both
from the accomplishment itself, and also from the mistakes I made along
the way. I originally wrote the scanner pass on an old S100 8080
based machine under CP/M, and then ported that to an Osborne 1 where I
began to write the recursive descent parser pass. I eventually ported
this to an IBM PC where I actually finished writing the parser, wrote
the code generator, the lister pass, and the run-time package (taking
advantage of an 8087 coprocessor for the math, so that a large part of
the generated code is native floating point machine code). On top of
this, all the porting involved not just the machines but also the
several Pascal compilers I had to port the code to. (I wrote the
compiler in Pascal.) The code generator generated assembly code, as
this made it easier for me to see if things were working (as well, of
course, as taking some of the burden off of me for things such as
assignment of memory addresses to variables and code).

Some advice:

Before choosing what language to compile, see what language processors
you have available to you on the machine(s) you will work on. This is
useful because if you choose to compile a language that you already
have a compiler for, you can debug your test programs on this compiler
and compare the results to what your compiler produces, as an aid to
debugging your compiler and your test cases.

Also, this may be important for determining how you will implement
your compiler, if you are not told by your instructor what tools to
use, that is. Anyway, if you don't have compiler generator tools such
as Lex and Yacc available, or you don't want to or are not permitted
to use them, you may want to consider the implementation language in
terms of the parsing method. Pascal, IMHO, is an excellent language
for implementing top-down parsers, whether recursive descent or
table-driven LL(1). If you want to be "politically correct" you may opt
to implement your compiler in C or C++.
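
For readers who have not seen the recursive descent style, here is a
minimal sketch. It is written in Python rather than Pascal purely for
brevity, and it evaluates arithmetic expressions instead of generating
code. The grammar and all names (tokenize, Parser, evaluate) are
assumptions made for the example: expr is term {(+|-) term}, term is
factor {(*|/) factor}, and factor is a number or a parenthesized expr.

```python
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")


def tokenize(text):
    """Turn the input string into a list of number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """One method per grammar rule; each consumes the tokens it recognizes."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or tok in "+-*/)":
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value


if __name__ == "__main__":
    print(evaluate("(1 + 2) * 3"))
```

The shape of the code mirrors the grammar one rule at a time, which is
what makes the method pleasant to write by hand in a language with
recursive procedures.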

The most important bit of advice I can give is: Whatever language you
choose to compile, *be sure you have a fairly thorough knowledge of
the language and its various features before you begin to implement
the compiler*. For instance, FORTRAN IV has FORMAT statements that
describe the input and output format of variables to be input or
output. FORTRAN IV allows FORMAT specifications to be read in at run
time, so FORMAT statements are not hard coded but rather are passed to
the run-time I/O routines which interpret them dynamically at
run-time. This will affect how the compiler must handle FORMAT statements.
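
A small sketch of what "interpreted at run time" means in practice: the
compiled program hands the FORMAT text to a run-time routine as plain
data, and that routine walks the edit descriptors while consuming the
values. The routine below is illustrative only. The name format_record
is hypothetical, and it supports just three output descriptors (Iw,
Fw.d and nX) with no repeat counts or nested groups.

```python
import re

# Iw (integer), Fw.d (fixed-point real), nX (n blanks); nothing else is handled.
EDIT = re.compile(r"(?:I(\d+)|F(\d+)\.(\d+)|(\d+)X)$")


def format_record(spec, values):
    """Format values according to a FORMAT string supplied at run time."""
    inner = spec.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError("FORMAT specification must be parenthesized")
    values = iter(values)
    out = []
    for item in inner[1:-1].split(","):
        m = EDIT.match(item.strip().upper())
        if not m:
            raise ValueError(f"unsupported edit descriptor: {item!r}")
        i_w, f_w, f_d, x_n = m.groups()
        if x_n:
            out.append(" " * int(x_n))
            continue
        if i_w:
            width, text = int(i_w), str(int(next(values)))
        else:
            width, text = int(f_w), f"{float(next(values)):.{int(f_d)}f}"
        # A value too wide for its field is shown as a field of asterisks.
        out.append(text.rjust(width) if len(text) <= width else "*" * width)
    return "".join(out)


if __name__ == "__main__":
    print(format_record("(I5,F8.2)", [42, 3.14159]))
```

Since spec is an ordinary string, it makes no difference to the routine
whether the text came from a FORMAT statement in the source or from
data read in while the program was running.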

You have an advantage in that you are in a class with an instructor,
who can provide advice/assistance. When I wrote my FORTRAN IV
compiler (and the PASCAL compiler I am currently writing), I was on my
own, with a stack of compiler writing textbooks, and, to start with
for the FORTRAN IV, no working FORTRAN compiler to check my test cases
against. (I knew FORTRAN IV fairly well, having learned it at Cal
State, Long Beach, but still made a few mistakes with its features
along the way.)

All the best, :-)

Per Olesen <> wrote in message
> I'm studying computer science on a Danish university and I'm going to
> write a compiler as a project in a course I'm taking.
[I've written a Fortran 77 compiler. Lexical analysis is miserable,
because spaces don't matter, and requires lots of feedback from the
parser that more modern languages don't need, but other than that it's
not hard to write a straightforward Fortran compiler. The runtime
package, particularly the I/O, was more work than the compiler. But I'd
look for something more modern, probably with inheritance or other more
recent compiler issues. -John]
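
The classic illustration of the blanks problem is the pair
"DO 10 I = 1,10" (a DO statement) and "DO 10 I = 1.10" (an assignment
to a variable named DO10I). The sketch below shows one common way to
tell them apart: strip the blanks, then look for a comma outside
parentheses after the equals sign. The function names are made up for
the example, and it ignores Hollerith and character constants, which a
real scanner would have to skip over.

```python
import re

# After blanks are removed: DO, a label, a variable name, '=', and the rest.
DO_HEAD = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=(.+)$")


def has_top_level_comma(text):
    """True if text contains a comma that is not nested inside parentheses."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return True
    return False


def classify(statement):
    """Classify a statement as a DO, an assignment, or something else."""
    text = statement.replace(" ", "").upper()
    m = DO_HEAD.match(text)
    if m and has_top_level_comma(m.group(3)):
        return ("DO", int(m.group(1)), m.group(2))
    if "=" in text:
        return ("ASSIGN", text.split("=", 1)[0])
    return ("OTHER", text)


if __name__ == "__main__":
    for line in ["DO 10 I = 1,10", "DO 10 I = 1.10", "DO 10 I = A(1,2)"]:
        print(line, "->", classify(line))
```

The decision cannot be made until the whole statement has been looked
at, which is one reason the scanner ends up needing help from later
stages.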
