RFC: project directions...

"BGB / cr88192" <cr88192@hotmail.com>
Mon, 14 Sep 2009 22:24:38 -0700

From comp.compilers

From:	"BGB / cr88192" <cr88192@hotmail.com>
Newsgroups:	comp.compilers
Date:	Mon, 14 Sep 2009 22:24:38 -0700
Organization:	albasani.net
Keywords:	question
Posted-Date:	18 Sep 2009 11:47:31 EDT

mostly I am looking for comments on any of these ideas.

well, here are a few things I am considering for the possible future of my
compiler project:
I am considering using the Win64 calling convention on Linux x86-64, as
opposed to either the SysV calling convention or my current customized
calling convention, which I call XCall.

as such, I could make a spec I would call "XCall-2", where the calling
convention for x86-64 is (simply) defined as being the Win64 calling
convention (in this case, 'XCall' would become primarily a name mangling
scheme and assorted ABI details, sort of like the IA-64 ABI).

major reasons:
to save effort, as I am already using the Win64 calling convention on Win64;
to reduce complexity, as it would not require (as many) separate Win64 and
SysV versions of everything;
Win64 is a much simpler calling convention and, more to the point, I
personally prefer its design;
I expect it to perform better on common-case code (this is only my
intuition, and would need testing to confirm);
...
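
as a rough illustration of the "simpler" point, here is a minimal sketch (in
Python, purely as a model) of how the first few scalar arguments are assigned
to registers under each convention. the function names are made up for this
example, and only plain integer/pointer and float/double arguments are
modelled (no structs, no varargs, no stack layout).

```python
# Sketch: which register each scalar argument lands in.
# "int" means integer or pointer, "float" means float or double.

WIN64_INT = ["rcx", "rdx", "r8", "r9"]
WIN64_FLT = ["xmm0", "xmm1", "xmm2", "xmm3"]
SYSV_INT = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SYSV_FLT = ["xmm%d" % i for i in range(8)]


def assign_win64(kinds):
    """Win64: one positional slot per argument, shared by both register files."""
    out = []
    for slot, kind in enumerate(kinds):
        if slot < 4:
            out.append(WIN64_FLT[slot] if kind == "float" else WIN64_INT[slot])
        else:
            out.append("stack")
    return out


def assign_sysv(kinds):
    """SysV: integer and SSE registers are counted independently."""
    out = []
    n_int = 0
    n_flt = 0
    for kind in kinds:
        if kind == "float" and n_flt < len(SYSV_FLT):
            out.append(SYSV_FLT[n_flt])
            n_flt += 1
        elif kind == "int" and n_int < len(SYSV_INT):
            out.append(SYSV_INT[n_int])
            n_int += 1
        else:
            out.append("stack")
    return out
```

for a mixed signature such as (int, float, int, float, int), the Win64 model
only needs the argument position, whereas the SysV model has to track two
separate counters. SysV passes more arguments in registers, though, so
which one is faster in practice is still something to measure.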

however, as a cost:
since it would be the non-native calling convention, there would be
imperfect interfacing with the natively compiled code, in particular WRT
function pointers and 'va_list' and similar.
although automatic stub-generation is likely to glue the calling conventions
together fairly well, the glue is likely to be imperfect in various cases.
...
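
to make the stub idea a little more concrete, here is a hedged sketch of the
register shuffling such a stub would need for the simplest case: a thunk that
is called with Win64 conventions and forwards up to four integer/pointer
arguments to a SysV function. `thunk_moves` and `run_moves` are hypothetical
helpers written for this example, not part of any existing tool.

```python
# Sketch: register moves for a Win64 -> SysV forwarding thunk.
# Only the four integer/pointer register arguments are modelled.

WIN64_INT = ["rcx", "rdx", "r8", "r9"]
SYSV_INT = ["rdi", "rsi", "rdx", "rcx"]


def thunk_moves(n_args):
    """Return (dst, src) pairs in an order that is safe to execute one by one.

    The plain positional order works here because rdx and rcx are each read
    as a source before a later move overwrites them as a destination.
    """
    if not 0 <= n_args <= 4:
        raise ValueError("only the four register arguments are modelled")
    return list(zip(SYSV_INT[:n_args], WIN64_INT[:n_args]))


def run_moves(moves, regs):
    """Simulate the moves sequentially on a dict of register values."""
    regs = dict(regs)
    for dst, src in moves:
        regs[dst] = regs[src]
    return regs
```

a real thunk has more to do than this: Win64 treats RDI, RSI, and XMM6-XMM15
as callee-saved while SysV does not, so the thunk must preserve them, and
the Win64 caller's 32 bytes of shadow space have no SysV counterpart. cases
like 'va_list', whose layout differs between the two ABIs, cannot be fixed
by register moves alone, which is where the glue becomes imperfect.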

I am considering making a few minor adjustments to Win64, although these
should not interfere with binary compatibility.

the most notable is that I am considering placing a special form of a
multi-byte NOP opcode after the epilogue as a means of identifying an
exception-handling table (this would generally replace SEH; however,
prologues and epilogues would remain and serve a similar purpose). the same
NOP opcodes could also be used to point to metadata mid-code (for example,
at the function start, such an opcode could refer to data about the
function, debugging info, ...).

since it is a multi-byte NOP, however, the processor will generally ignore
the opcode (and the opcodes would "just happen" to point to the relevant
metadata).
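
one way this could look, as a sketch only: the x86 multi-byte NOP (0F 1F /0)
takes a ModRM operand, so in 64-bit mode the 7-byte form with a RIP-relative
disp32 can carry an offset to a table without the CPU ever touching that
memory. the choice of this particular form, and the helper names below, are
assumptions for illustration; the post does not fix an encoding.

```python
import struct

# 0F 1F /0 with ModRM = 05h: "nop dword ptr [rip+disp32]" in 64-bit mode.
NOP_RIP_REL = b"\x0f\x1f\x05"
NOP_LEN = 7  # 2 opcode bytes + 1 ModRM byte + 4 displacement bytes


def encode_meta_nop(insn_addr, meta_addr):
    """Encode the NOP placed at insn_addr so that it refers to meta_addr."""
    # RIP-relative displacements are measured from the next instruction.
    disp = meta_addr - (insn_addr + NOP_LEN)
    return NOP_RIP_REL + struct.pack("<i", disp)


def decode_meta_nop(code, offset, base_addr):
    """Return the metadata address if the marker NOP sits at code[offset:].

    base_addr is the address at which code[0] is loaded. Returns None when
    the bytes at offset are not a complete marker NOP.
    """
    if len(code) < offset + NOP_LEN:
        return None
    if code[offset:offset + 3] != NOP_RIP_REL:
        return None
    (disp,) = struct.unpack_from("<i", code, offset + 3)
    return base_addr + offset + NOP_LEN + disp
```

an unwinder would presumably look for this pattern right after a recognized
epilogue. ordinary alignment padding could in principle use the same bytes,
so a real scheme would need some extra rule to tell the two apart.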

another consideration:
I could make use of PE/COFF on Linux (as well as on Windows) as a format for
statically-compiled libraries. combined with the previous, this could allow
using much of the same precompiled code on both Windows and Linux (at least
for the same CPU arch).

on Linux, I would likely make use of a custom loader in order to load these
libraries (let's just call them, errm... DLLs...). ELF-based Shared Objects
would still be used, but mostly for Linux-specific stuff. similarly, these
libraries may also be subtly distinguished from Windows-specific DLLs (by
some means or another...).
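
as a sketch of the very first step such a loader would take, the following
locates the COFF file header inside a PE image and unpacks it, so the loader
can check the target machine before going any further. `read_coff_header` is
a name invented for this example; section mapping, relocations, and import
resolution are all left out.

```python
import struct

IMAGE_FILE_MACHINE_AMD64 = 0x8664
# Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes).
COFF_HEADER = struct.Struct("<HHIIIHH")


def read_coff_header(image):
    """Locate and unpack the COFF file header of a PE image held in bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew field
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine, n_sections, _timestamp, _symtab, _n_syms,
     opt_size, characteristics) = COFF_HEADER.unpack_from(image, pe_off + 4)
    return {
        "machine": machine,
        "sections": n_sections,
        "optional_header_size": opt_size,
        "characteristics": characteristics,
    }
```

a loader along these lines could refuse anything whose "machine" field does
not match the host CPU. how to mark a library as platform-neutral rather
than Windows-specific is left open here, as it is in the post.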

I could even take this further:

I have been looking a bit into EBC (the EFI Byte Code) as a potential
starting point for an eventual "portable" bytecode. the main "extensions" I
am considering at present would be adding support for floating point and
SIMD (errm, because the bytecode would be fairly lame otherwise...).

I would probably do this via a special prefix opcode, which would serve to
extend the available register set and types (vaguely similar to the VEX
prefix), so probably:
16 GPRs; 16 FPU/SIMD regs; ...

opcode types:
Int (32/64) / Float (32/64) / Packed Float (4 floats, 2 doubles) / ...
"Wide Pointers";
...

granted, such EBC would not be compatible with other EBC, but that is OK.
I "could" just define my own bytecode, but EBC seems like a "good enough"
basis (and should not be too difficult to get my codegen to produce).

(I have yet to decide on an opcode number for this prefix).
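
to show roughly how a VEX-like prefix could widen the register fields and
add a type selector, here is a small decoder sketch. every constant and bit
layout in it is hypothetical: the prefix value is a placeholder, the
two-byte base instruction format is invented for the example, and none of it
is taken from the EBC specification.

```python
# Sketch of a VEX-like prefix for a two-operand bytecode instruction.
# Base form:     [opcode] [regs]          regs: dst in bits 0-2, src in bits 4-6
# Prefixed form: [PREFIX] [ext] [opcode] [regs]
#   ext bit 0: high bit of dst       ext bit 1: high bit of src
#   ext bit 2: use FPU/SIMD regs     ext bits 4-6: operand type

PREFIX = 0xF0  # placeholder value, not a real opcode assignment
TYPES = ["i32", "i64", "f32", "f64", "4xf32", "2xf64"]


def decode(code, pos=0):
    """Decode one instruction; return (fields, next_pos)."""
    prefixed = code[pos] == PREFIX
    ext = 0
    if prefixed:
        ext = code[pos + 1]
        pos += 2
    opcode = code[pos]
    regs = code[pos + 1]
    type_index = (ext >> 4) & 0x07
    if type_index >= len(TYPES):
        raise ValueError("reserved type code")
    fields = {
        "opcode": opcode,
        "dst": (regs & 0x07) | ((ext & 0x01) << 3),         # 3 bits + 1 from prefix
        "src": ((regs >> 4) & 0x07) | ((ext & 0x02) << 2),  # 3 bits + 1 from prefix
        "simd_regs": bool(ext & 0x04),
        # Unprefixed instructions default to 64-bit integers in this sketch.
        "type": TYPES[type_index] if prefixed else "i64",
    }
    return fields, pos + 2
```

with this layout an unprefixed instruction can only name registers 0-7 of
the integer file, while a prefixed one reaches all 16 registers of either
file, which matches the "16 GPRs; 16 FPU/SIMD regs" target above.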

another related question is whether, if this is added, I should add support
for it to my x86 assembler or write a custom assembler (there are pros and
cons either way).

the bytecode would likely be either interpreted, or handled via a kind of
mini-JIT.

note that this bytecode would likely be considered as an alternative to
machine code, rather than as any kind of high-level bytecode. however, this
could be part of its merit as well.

any comments on any of this?...
