Re: Bison =?UTF-8?B?ZGV0ZXJtaW5pc+KAi3RpYyBMQUxSKDEpIHBhcnNlciBm?= =?UTF-8?B?b3IgSmF2YS9DKysgKGtpbmQgb2YgY29tcGxleCBsYW5nYXVnZSkgd2l0aG91dCA=?= =?UTF-8?B?J2xleGFyIGhhY2snIHN1cHBvcnQ=?=

BGB <>
Sun, 26 Aug 2012 19:37:05 -0500

          From comp.compilers

Related articles
=?UTF-8?Q?Bison_determinis=E2=80=8Btic_LALR=281=29_parser_for_Java=2FC (2012-08-17)
Re: Bison =?UTF-8?B?ZGV0ZXJtaW5pc+KAi3RpYyBMQUxSKDEpIHBhcnNlciBm?= =?U (Hans-Peter Diettrich) (2012-08-18)
Re: Bison =?UTF-8?B?ZGV0ZXJtaW5pc+KAi3RpYyBMQUxSKDEpIHBhcnNlciBm?= =?U (2012-08-20)
Re: Bison =?UTF-8?B?ZGV0ZXJtaW5pc+KAi3RpYyBMQUxSKDEpIHBhcnNlciBm?= =?U (BGB) (2012-08-21)
=?UTF-8?Q?Re:_Bison_determinis=E2=80=8Btic_LALR=281=29?= =?UTF-8?Q?_pa (BartC) (2012-08-22)
Re: Bison =?UTF-8?B?ZGV0ZXJtaW5pc+KAi3RpYyBMQUxSKDEpIHBhcnNlciBm?= =?U (BGB) (2012-08-26)
Bison deterministic LALR parser for Java/C++ (BartC) (2012-08-29)
speeding up C recompilation, was Re: Bison deterministic LALR (BGB) (2012-09-04)
Re: C include handling, was Bison deterministic LALR (Marco van de Voort) (2012-09-05)
| List of all articles for this month |

From: BGB <>
Newsgroups: comp.compilers
Date: Sun, 26 Aug 2012 19:37:05 -0500
References: 12-08-005 12-08-006 12-08-009 12-08-014 12-08-015
Keywords: C, performance
Posted-Date: 29 Aug 2012 00:27:26 EDT

(this post had basically been sitting around several days unfinished and
unsent, mostly as I was more busy with other stuff...).

On 8/22/2012 8:04 AM, BartC wrote:
> "BGB" <> wrote in message
>> On 8/20/2012 8:35 AM, Anton Ertl wrote:
>>> Code generation and optimization do not change the relation between
>>> the time spent in scanning and in parsing. Moreover, if the compiler
>>> spends most of the time in code generation, optimization and/or
>>> "more", there is even less reason to worry about parsing speed.
>> (sorry in advance for sort of wandering off on a tangent...).
>> a major thing that I think skews things so much in the case of C is just
>> how much stuff can get pulled in by the preprocessor, the bulk of which
>> ends up quickly being discarded by subsequent stages (and much of the
>> rest is only minimally processed by later stages).
> I think much of that is the fault of the language and over-reliance on the
> pre-processor. Instead of relying on neat language features to minimise the
> sizes of headers, everything has to be spelled out in a long-winded way.
> Lack of proper namespaces (in lists of enumerations for example) means
> identifiers are long and unwieldy, which doesn't help.

well, C is C.
other languages also have things like import, but C doesn't, so alas.

>> if we spend 1000ms preprocessing and parsing the code, and 10ms or 100ms
>> compiling it, where has most of the time gone?...
> But even given all that, there are ways of dealing with huge header
> files so
> that it is not necessary to repeatedly tokenise and parse the same headers
> over and over again (for recompiling the same module, or compiling many
> modules all sharing the same headers).
> I've no idea whether many C compilers actually bother though; perhaps it's
> easier to just recommend a faster computer..

the problem here is that, although it isn't too hard to figure out
possible optimizations, it is much harder to make them work in ways
which don't violate the C standard.

another issue is that things like precompiled headers are non-standard,
and there is no real agreed-on convention for "hey, compiler, feel free
to use precompiled headers here".

like, say, "#pragma precompiled_header" or similar (say, put before the
include guard to indicate to the compiler that this header may be used
as a precompiled header).

even if supported, it wouldn't generally be found in OS headers, meaning
it would be necessary to wrap the headers to use it.

a partial solution is to allow an option of just lumping together a
bunch of code and compiling it all at once, which can often be faster,
but isn't really applicable in many cases.

>> OTOH, in my scripting language (which doesn't use headers), I have
>> generally been looking at millisecond-range compile times.
> Well, that's more like it. Compilation time simply shouldn't be an issue,
> not for compiling a single module anyway.
> And has never been a problem for me, ever, no matter what hardware I was
> using, partly thanks to using my own tools.
> I only get a delay when there are interdependencies and I just recompile
> everything for simplicity. Then I might have to twiddle my thumbs for 5-10
> seconds. But then, I now have to rely on some external tools..

this is part of the issue.

my language instead just uses importing.
so, it is possible to "import" a module, and see whatever is declared in it.

currently, the language uses a mix of a Java and C# style model, where
importing gives the qualified name for the module, which is interpreted
as giving its path (relative to the VM's search path).

within each module, a "package" may be declared, which basically puts
the contents of the module into a given package. functionally, this is
more similar to the use of "namespace" in a language like C#. otherwise,
whatever is declared in the module is placed in the toplevel (it is also
possible to put contents in other packages as well).

there are edge cases here, so modifiers end up being needed both for
package and import directives, mostly to "fine-tune" the behavior.

the current default behavior though is "load the module if-needed, using
its qname, and then try to import the package named by the qname".

currently, there is no direct analogue of Java's "import everything in a
directory" semantics, but I have considered adding something like this,
only I have not yet gotten around to it.

but, in any case, since loading modules doesn't require processing a
small mountain of text in the process, it tends to go a lot faster (and
puts a lot less strain on the compiler/VM as well).

>> although admittedly most of my compilers have tended to be pretty dumb,
>> typically working like (for the upper end):
>> split into tokens and parse the code directly into an AST;
>> perform basic simplifications on the AST (evaluating constant
>> expressions);
>> perform basic type-analysis / inference (generally forward propagation
>> driven by declarations and assignment operations);
>> flatten this out into a bytecode-style format (binary serialization is
>> optional).
>> this is then currently followed by a backend which translates the
>> bytecode into threaded-code and runs this (generally also pretty fast,
>> and generally functions/methods are translated on call).
> I have two current compilers, one native code, and this one for bytecode
> for
> a dynamically typed, non-type-inferred language:
> source -> lexer/parser -> types -> code generator -> optim -> binary
> bytecode file
> The type pass does very little here, mainly checking l-values and reducing
> constant expressions. The optim pass does very little too, just reducing
> some common bytecode combinations into one. Nevertheless, the resulting
> bytecode, even with a straightforward, non JIT-ing interpreter, can make a
> basic lexer run at some 3M tokens/second as I mentioned in another post.
> The native code compiler is more involved (I hate these ones! But I need
> one
> to implement the interpreter):
> source -> lexer/parser -> names -> types -> intermediate code generator ->
> target code generator -> optim -> asm source file
> The optim stage does some peephole stuff, but haven't gone as far as having
> some variables allocated to registers. Last time I checked, it was
> perhaps 40-50% slower than gcc-O1, averaged over ~20
> integer/floating-point-intensive benchmarks. That might do me.

for my C compiler, I had a bytecode intermediate stage, but no bytecode
interpreter (apart from a minor oddity that the optimizer logic could be
used itself as a limited interpreter).

this bytecode was then translated into native code.

however, this was not well maintained, and the bytecode format in
question is no longer in use (though the bytecode used by my scripting
language is a "close sibling", it has gone in a different direction in
several ways, and currently only ever goes into being threaded-code).

then I am basically stuck with things as-is, as I am sadly too busy and
raw performance isn't a big enough priority to justify either compiling
my current live bytecode fully into native code, or for that matter to
revive my C compiler.

so, for the C compiler it was basically like:

( preprocessor -> tokenizer+parser -> AST reducer -> bytecode-compiler
-> (RPNIL) )
( (RPNIL) -> codegen passes -> (ASM) -> assembler -> (COFF) ->
runtime-linker )

the codegen basically was a multi-pass stack machine.
any values created were pushed to the stack, and operations would pop
values and push results (internally mapped to register or memory
operations). in the compiler, the compiler would "contain" any registers
or variables, rather than representing a physical memory region. entries
on the main value-stack were regarded as immutable. type-analysis was
also performed on the stack.

if values turned out to be constant here, then they would be evaluated
directly (and the literal value being placed on the stack instead).

for the scripting VM, the model is a little different, and type analysis
has been moved largely before emitting bytecode, so in this case it is a
little more like with the JVM or similar.

so, for my scripting VM:
( tokenizer+parser -> AST reducer -> type-inference -> bytecode-compiler
-> (BSIL) )
( (BSIL) -> threaded-converter -> (linked-list indirect threaded-code) )

what differences do exist make interfacing my C frontend to my scripting
backend a bit problematic though. a big issue at the moment is that
naive translation of C ASTs to the bytecode would produce giant modules
(with all the header contents as well) and there is no "good" way to
eliminate this after-the-fact.

as-is, in the script VM bytecode, any structures/typedefs/prototypes/...
end up directly in the bytecode, and are "constructed" by executing the
toplevel, which is a bad scenario for C.

otherwise, my VM currently handles C metadata by storing it in a
database-format, which would likely need to be used along with the
bytecode in this case (and omitting all of these declarations from the
toplevel). normally, these databases are used by the VM to annotate
natively-compiled code, and are currently inserted into the DLL
post-compile via a tool (technically, my C compiler front-end is used to
perform analysis and build these databases, from headers, but a native
compiler produces the actual machine-code).

this would likely mean producing object modules containing both bytecode
and a database, and likely having a "link" stage which merges the
database contents.

not that any of this can't be done though, but it is harder to justify
doing so...

> --
> Bartc
> [I've seen C compilers that keep preparsed versions of headers. Dunno
> what they do with #if. Also see Microsoft's C# and other .NET languages,
> that put all of the type info in the objects, so you can use the object
> as a compiled include file. -John]

AFAIK, the preparsed/precompiled headers for C generally handle #if and
#ifdef and similar during the preprocessor as usual. this seems to be a
large part of why there are many restrictions on the use of precompiled
headers in those compilers which support them.

AFAICT, languages like C# delay commands like #if or #ifdef until later
(and impose restrictions on how they may be used). IIRC, they are
generally handled at linking or at JIT.

although it "could" be possible to move #if and friends into a later
compilation stage, it is not really possible to do this without
violating the C standard in the process.

one possible compromise could be to introduce another compiler
extension, such as:
#define_target /name/

which would basically say to the compiler "this particular define will
be handled on a per-target basis" and thus delay certain defines to a
later compilation stage.

#define_target FOO
#ifdef FOO

could have the preprocessor spit out something like:
__target_ifdef(FOO) {

but would retain the normal behavior for use of conventional #define's
(and thus generally avoid breaking C standard conformance, or breaking
code which uses macros in "clever" ways to alter the syntax, 1).

1: consider examples like this:
#define SOMEFEATURE_END *##/
or something as mundane as:
#ifndef __cplusplus
#define EXTERN_C extern "C"
#define END_EXTERN_C }
#define EXTERN_C
#define END_EXTERN_C

my script-VM actually does something vaguely along these lines, and
folds the contents of any ifdef or ifndef statements into their own
bytecode-blocks ("ifdef" and "ifndef" also have their own special
bytecode instructions). these are handled when code is converted into
threaded code (the instruction either becomes a special-call, or a no-op).

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.