Re: using C as intermediate code, was Writing a compiler

thomas.mertes@gmx.at
Sat, 8 Nov 2008 05:46:21 -0800 (PST)

          From comp.compilers

Related articles
Writing a compiler andresjriofrio@gmail.com (andresj) (2008-10-20)
Re: Writing a compiler m.collado@lml.ls.fi.upm.es (Manuel Collado) (2008-10-24)
Re: Writing a compiler Ibeam2000@gmail.com (Nick) (2008-10-26)
Re: Writing a compiler andresjriofrio@gmail.com (andresj) (2008-10-29)
Re: using C as intermediate code, was Writing a compiler thomas.mertes@gmx.at (2008-11-08)

From: thomas.mertes@gmx.at
Newsgroups: comp.compilers
Date: Sat, 8 Nov 2008 05:46:21 -0800 (PST)
Organization: Compilers Central
References: 08-10-037 08-10-046 08-10-047 08-10-056
Keywords: code, C
Posted-Date: 08 Nov 2008 11:21:15 EST

On 30 Oct., 05:47, andresj <andresjriof...@gmail.com> wrote:
> On Oct 26, 4:57 am, Nick <Ibeam2...@gmail.com> wrote:
>
> > If I can make a suggestion, use C or C++ as target language. Here
> > you don't have to reinvent subroutine calling and the like, and you
> > maintain compatibility with other things on the OS. Not to mention
> > ease of moving around different OSes. And troubleshooting. Much
> > easier.
> > [Quite a reasonable idea unless your plan was to learn about code
> > generation. -John]
>
> That is a reasonable idea, thanks. :-) The only problem is that there
> would be a dependency on a complicated C or C++ compiler, which might
> make the language more difficult to manage.


This depends on how many compiler-specific (as opposed to
language-defined) features are used. If you restrict yourself to
e.g. C89, there are not many differences between the C compilers.


I found the following differences:
    - Empty structs are not allowed by some C compilers (e.g. the
        MSVC C compiler).
    - The maximum nesting depth of parentheses (64 for older and 256
        for newer MSVC C compilers). Such limits are naturally very
        bad when C programs are generated.
    - The use of floating point NaN and Infinity values, as opposed
        to raising an exception or terminating the whole program.
    - The handling of integer exceptions (POSIX uses the SIGFPE signal
        for integer division by zero, while MSVC uses its unportable
        structured exceptions).
    - Implicit conversions between integers (long) and pointers (most
        Unix compilers, including GCC, and also MSVC issue warnings
        for such conversions but allow them. Borland's bcc32 issues
        errors and forbids such implicit conversions).
    - The number of significant characters in an identifier may cause
        problems. According to the C89 standard, 31 characters are
        significant in internal identifiers. But for identifiers with
        external linkage as few as 6 characters are significant and
        even case distinctions may be ignored (so far I have not
        found a linker which enforces this 6 character limit).
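
A minimal sketch of how generated C code might sidestep the integer
exception difference from the list above: the divisor is checked
explicitly, so the program relies on neither SIGFPE nor structured
exceptions. The helper name checked_div is hypothetical and chosen
for illustration only. It is not taken from any particular compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper for generated code.
   Returns 1 and stores the quotient in *result on success.
   Returns 0, leaving *result untouched, when the division
   would trap or overflow on common platforms. */
static int checked_div(long dividend, long divisor, long *result)
{
    assert(result != NULL);
    if (divisor == 0) {
        /* Would be SIGFPE under POSIX or a structured exception
           under MSVC. */
        return 0;
    }
    if (dividend == LONG_MIN && divisor == -1) {
        /* On two's complement machines the quotient does not fit
           in a long. */
        return 0;
    }
    *result = dividend / divisor;
    return 1;
}
```

The generated code can then turn a return value of 0 into whatever
error handling the source language defines, in the same way on every
platform.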


Additionally there are other differences you may have to deal with:
    - Little or big endian representation. Note that it is possible
        to write C code which is independent of the endianness of the
        machine.
    - The representation of negative integers (ones' or two's
        complement).
    - Integer division is, according to the C89 standard, allowed to
        truncate towards zero or towards minus infinity (all C
        compilers that I know about truncate integer divisions
        towards zero).
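
To illustrate two of these points, here is a small sketch in C89
style. The functions put_u32_le and get_u32_le use only shifts and
masks, so they produce the same byte sequence on little and big
endian machines. The function div_trunc returns a quotient truncated
towards zero whichever way the native '/' rounds. All three names are
hypothetical, and div_trunc assumes a nonzero divisor and no overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Store the low 32 bits of value in buf, least significant byte
   first, independent of the byte order of the machine. */
static void put_u32_le(unsigned char *buf, unsigned long value)
{
    assert(buf != NULL);
    buf[0] = (unsigned char) (value & 0xFF);
    buf[1] = (unsigned char) ((value >> 8) & 0xFF);
    buf[2] = (unsigned char) ((value >> 16) & 0xFF);
    buf[3] = (unsigned char) ((value >> 24) & 0xFF);
}

/* Read back a value written by put_u32_le. */
static unsigned long get_u32_le(const unsigned char *buf)
{
    assert(buf != NULL);
    return  (unsigned long) buf[0]
         | ((unsigned long) buf[1] << 8)
         | ((unsigned long) buf[2] << 16)
         | ((unsigned long) buf[3] << 24);
}

/* Quotient truncated towards zero. Assumes b != 0 and that the
   quotient fits in a long. */
static long div_trunc(long a, long b)
{
    long q = a / b;
    long r = a % b;   /* C89 guarantees (a / b) * b + a % b == a */

    /* A nonzero remainder whose sign differs from the sign of the
       dividend means '/' rounded towards minus infinity, so the
       quotient is one too small. */
    if (r != 0 && ((r < 0) != (a < 0))) {
        q += 1;
    }
    return q;
}
```

On a compiler that already truncates towards zero the correction in
div_trunc never fires, so the only cost is one remainder and one
comparison.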


The differences between the libraries provided by different
compilers / operating systems are much bigger. Not every programming
language tries to deal with library / OS differences.


A list of some library / OS differences is:
    - Unicode support with wide chars or with UTF-8 (this has an
        influence on many library / OS interfaces).
    - Reading directory contents with opendir() and readdir() or with
        findfirst() and findnext().
    - mkdir() with one or two parameters.
    - Path delimiter / or \ .
    - File permissions (Unix and Windows permissions are different).
    - Console access (terminfo, termcap, curses, Windows console).
    - File seeks for files with more than 2 (or 4) gigabytes.
    - Winsockets or real Unix sockets.
    - Graphics libraries (X11, OpenGL, GDI, DirectX).


Usually C uses macros to handle compiler / library differences.
In the Seed7 interpreter, the Seed7 to C compiler and the Seed7
libraries, several defines and driver libraries are used to handle
C compiler and library differences. These defines are explained
in the file seed7/src/read_me.txt (in the Seed7 release).
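
As a rough sketch of the macro approach (this is not the set of
defines used by Seed7, which is documented in read_me.txt), the
mkdir() and path delimiter differences could be hidden as shown
below. The names PATH_DELIMITER, make_dir and join_path are
hypothetical. The sketch assumes that the compiler predefines _WIN32
on Windows and provides _mkdir() in <direct.h> there, and that
mkdir() with a mode parameter is available elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <direct.h>
#define PATH_DELIMITER '\\'
#define make_dir(path) _mkdir(path)           /* one parameter */
#else
#include <sys/types.h>
#include <sys/stat.h>
#define PATH_DELIMITER '/'
#define make_dir(path) mkdir((path), 0777)    /* two parameters */
#endif

/* Hypothetical helper: writes dir, the platform delimiter and name
   to dest. Returns 1 on success and 0 if dest (of the given size)
   is too small. */
static int join_path(char *dest, size_t size,
                     const char *dir, const char *name)
{
    size_t dir_len;
    size_t name_len;

    assert(dest != NULL && dir != NULL && name != NULL);
    dir_len = strlen(dir);
    name_len = strlen(name);
    /* Room is needed for the delimiter and the terminating '\0'. */
    if (dir_len + name_len + 2 > size) {
        return 0;
    }
    memcpy(dest, dir, dir_len);
    dest[dir_len] = PATH_DELIMITER;
    memcpy(dest + dir_len + 1, name, name_len + 1);
    return 1;
}
```

The rest of the program then calls make_dir() and join_path() only,
so the platform test is confined to one place.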


Greetings Thomas Mertes


Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.


