From: "cr88192" <cr88192@hotmail.com>
Newsgroups: comp.compilers
Date: Sun, 1 Mar 2009 09:18:02 +1000
Organization: albasani.net
References: 09-02-132 09-02-142 09-02-148
Keywords: UNCOL, optimize
Posted-Date: 02 Mar 2009 08:32:47 EST

"Bartc" <bartc@freeuk.com> wrote in
> "cr88192" <cr88192@hotmail.com> wrote in message
>> "Tony" <tony@my.net> wrote in message news:09-02-132@comp.compilers...
>>> Or maybe I'm making the problem too hard (?). Maybe the way to go is to
>>> bite the bullet and generate assembly instructions and stop worrying
>>> about it. Then all I'd need is a good book like the Ron Mak book was
>>> back in its day. (I haven't paged thru the latest Dragon edition,
>>> but surely I'd be more able to assess what's there now than the last
>>> time I looked at it in the bookstore). It would appear that the newer
>>> texts are too enamoured with GC and exceptions rather than locking
>>> down the most needed basics. OK, so my question really is...
>>>
>>> On modern desktop hardware, would anyone even notice the reduction of
>>> program performance because of the rather stark non-optimised back end
>>> code generation? (My guess is not, for 80% of software).
>>>
>>
>> probably not...
>>
>> actually, even with rather poor assembler, one can still get within 2x
>> or 3x of decently good compiler output (AKA: gcc with default
>> settings...).
>
> I used to find this amusing: When I first wrote compilers in the 80's
> (in my spare time) code efficiency was the last thing I was worried
> about. Yet programs still ran faster than compiled C code.
>
> A couple of tests I've just done with a 10-year-old compiler (with
> little change on the code front) against gcc -O3 showed one program
> twice as fast with gcc, but another 20% slower on gcc.

yes, sometimes one can wonder if the O is really for Obfuscate...

but, yes, the -O option is weird, in that sometimes it makes code a
good deal faster, and other times slower than if nothing were done at
all... so, I tend not to worry about it too much in general.

however, on gcc, the -pg option makes things a good deal slower, so it
is a question of whether one wants faster code or the ability to more
readily run the profiler. I have generally opted for the profiler,
although it would be nicer sometimes if it had less of a performance hit
(or was a tool one runs on the running program, like the debugger, rather
than something controlled with build options).

it also reminds me of an annoyance I discovered, which is that any use of
chdir() can change where the profiler output goes (gmon.out gets written to
whatever the current directory is at exit)...

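a minimal sketch of one workaround for the chdir() annoyance, assuming a glibc-style runtime where the profile is written by an exit-time handler that the startup code registered before main(). the helper names (remember_start_dir, restore_start_dir) are made up for illustration and are not part of any profiler API:

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* directory the process started in (empty until remembered) */
static char start_dir[PATH_MAX];

/* exit handler: go back to the starting directory. atexit() handlers run
   in reverse order of registration, so a handler registered from main()
   runs before anything the startup code registered earlier (assumed here
   to include the profiler's output routine). */
static void restore_start_dir(void)
{
    if (start_dir[0] != '\0' && chdir(start_dir) != 0)
        perror("chdir");
}

/* call early in main(), before any chdir(); returns 0 on success */
int remember_start_dir(void)
{
    if (getcwd(start_dir, sizeof start_dir) == NULL) {
        start_dir[0] = '\0';
        return -1;
    }
    return atexit(restore_start_dir);
}
```

with this in place, a program can chdir() wherever it likes, and the working directory is put back just before the remaining exit handlers run. whether this actually redirects the profile on a given platform depends on that ordering assumption holding there.
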
it also annoys me some that there is no good common build system
(namely, that IMO makefiles are still the best option, and they still
suck). configure can, IMO, go back to the fires from whence it came,
and cease its torment to any that actually hope for the app to build
(FFS at least we can easily edit makefiles...).

cmake seems in theory better, but in practice not so much...
...

these kinds of tools seem to almost universally try to run and spit out
specialized makefiles, rather than manage the build process themselves.

> This wasn't bad considering how little effort I put into my compilers
> and how much work and expertise goes into a mainstream compiler
> (although I haven't tried Intel C compiler which is supposed to be
> good).
>
> (I make up the difference by having really nice inline assembler in my
> language...)
>

yeah...

I suspect that the issue is that most of the optimization in compilers is
done according to "optimization wisdom", which often does not match well
with the typical processors (compilers go to great lengths to do things
which may well actually make things slower), whereas crudely produced code
generally works well.

it is funny that on modern processors, registers and memory are not much
different WRT performance (apart from the cost of the occasional mov, it is
fast if it is in cache, ...). yet calling conventions like SysV/AMD64 go
to great effort to pass bunches of arguments in registers, even though one
of the first things to be done if the function is non-leaf (most common...)
is to store all of these values back into memory to free up these registers
again, and no space is provided for this (say, they could have at least
left space where these arguments would have gone on the stack...).

my personal suspicion is that this calling convention will be slower than
an upgraded version of x86 cdecl would have been...

I am left wondering if anyone actually benchmarked any of their design
decisions...

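to make the non-leaf case concrete, here is a small sketch in C. the function names (scale, sum_scaled) are hypothetical, and what a given compiler actually emits depends on the compiler and the optimization level, so the comments describe the constraint rather than any particular output:

```c
/* hypothetical callee; any call that is not inlined would do */
long scale(long x)
{
    return x * 3;
}

/* non-leaf function with six integer arguments. under SysV/AMD64 these
   arrive in rdi, rsi, rdx, rcx, r8 and r9, all of which are caller-saved,
   so the call below is allowed to clobber every one of them. */
long sum_scaled(long a, long b, long c, long d, long e, long f)
{
    long t = scale(a);

    /* b..f are still needed after the call, so they have to live somewhere
       else across it: either stack slots, or callee-saved registers which
       themselves have to be saved and restored by this function. */
    return t + b + c + d + e + f;
}
```

compiling this with gcc -S at different -O levels shows which of the two choices was made (or whether scale got inlined, in which case the issue disappears for this particular example).
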
>> early on this was actually an encouragement:
>> I was able to more or less match gcc's performance on some tasks with a
>> crude JIT, which basically converted a good number of specialized
>> bytecodes for a language of mine into machine code.
>>
>> of course, this sent me head-first into compiler writing (me thinking,
>> "well, it can't be too much more work to compile C...", but I was
>> wrong...
>> an interpreter, assembler, and straightforward JIT are easy, but a full
>> fledged compiler is a PAIN...). sadly, it is sort of like with geometry
>> or physics code, namely that the code is not large, but it is a pain to
>> write and work on...
>
>> so, from starting out, it took about 6 months before I was compiling
>> C code with it, and 1.5 years later, it is not doing a whole lot
>> more (no working C# or Java frontends, or even a complete x86-64
>> backend, ...). it is like a huge effort sink...
>
> It used to take me 2-3 months to create a compiler, just to get it
> working enough to use it for my real work. Probably half the features
> I'd put in the language got left out. (This was not for C language
> which I think helped a lot, although my language did the same sorts of
> things.)
>
> On the other hand, I've been tinkering on and off for years now on a
> new language/compiler project which is going nowhere at present. But I
> just think of it as alternative therapy to Sudoku.
>

the great problem with C is C's type system.
one is regularly jumping through hoops around this one...

there are many subtle issues that probably even an experienced C coder would
not be aware of, but which show up in all their glory when one faces them.

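the post does not say which issues are meant, so the following are only a few well-known examples picked for illustration: the usual arithmetic conversions, integer promotion, and the type of character constants. a compiler has to get each of these right, and the function names here are made up:

```c
#include <stddef.h>

/* returns 1 if i < u. under the usual arithmetic conversions i is
   converted to unsigned first, so a negative i becomes a huge value. */
int int_less_than_unsigned(int i, unsigned u)
{
    return i < u;
}

/* both operands are promoted to int before the addition, so the sum is
   computed in int and does not wrap around at 255. */
int add_uchar(unsigned char a, unsigned char b)
{
    return a + b;
}

/* in C (unlike C++) a character constant such as 'a' has type int. */
size_t char_constant_size(void)
{
    return sizeof 'a';
}
```

for instance, int_less_than_unsigned(-1, 1u) is 0, add_uchar(200, 100) is 300, and char_constant_size() equals sizeof(int).
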
>> basically, my compiler is more or less split into an upper-compiler,
>> a middle-compiler, and a lower-compiler, although the exact border
>> between the middle and lower compilers is not well defined (the
>> lower compiler is mostly whole lots of specific code-generation
>> functions, ... whereas the middle compiler is mostly all of the
>> "executive logic", ...).
>
> Have you ever thought of... starting again? Sometimes you can use the
> experience you've gained to produce a simpler and more streamlined
> version.
>

for a pure C compiler, maybe...

however, I end up trying for more ambitious goals... (supporting more than
just C, ...).

to these ends I have considered redesigning my IL, as noted, making
it Scheme based.

however, I have come to doubt my choice of GC. I had the idea this time of
using a reworked version of my precise GC, but coding with it and trying to
do things "properly" (making sure all roots are accounted for, that
ref-counts are kept correct, ...) quickly reminds me of the terrible pain
of using this sort of GC (and why I had not been using it, despite the
somewhat inferior performance of my more general-purpose one...). one is
left with lots of uncertainty as to whether all the root handling is
correct and all the ref-counts are kept correct.

after a short run of this, I am considering just starting this effort over
(partly), and instead using my generic (conservative, non-ref-counting...)
GC for the Scheme portion, and just living with the produced garbage...

if anything, at least it would have far fewer
code-generating-macro-invocations (I am left thinking that making precise GC
more usable would likely require implementing a specialized source-to-source
compiler, and I am not inclined towards this at present, the biggie issue
being not just that I have to write an interpreter, but also that all my
other compiler code would have to deal with all this as well...).

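to give a feel for the bookkeeping involved, here is a generic sketch of explicit root registration plus reference counting in C. everything in it (Obj, GC_PUSH_ROOT, GC_POP_ROOTS, obj_retain, obj_release) is hypothetical and only stands in for the general technique; it is not the GC described in this post, and there is no collector or overflow check:

```c
#include <stddef.h>

/* hypothetical heap object with a reference count */
typedef struct Obj {
    int refcount;
    struct Obj *next;
} Obj;

/* shadow stack of addresses of local variables that hold references */
#define MAX_ROOTS 256
static Obj **root_stack[MAX_ROOTS];
static size_t root_top = 0;

#define GC_PUSH_ROOT(var) (root_stack[root_top++] = &(var))
#define GC_POP_ROOTS(n)   (root_top -= (n))

size_t gc_root_count(void) { return root_top; }

Obj *obj_retain(Obj *o) { if (o) o->refcount++; return o; }
void obj_release(Obj *o) { if (o) o->refcount--; }

/* every function that holds references across a possible collection has
   to repeat this pattern by hand */
int example_work(Obj *arg)
{
    Obj *tmp = NULL;
    int seen;

    GC_PUSH_ROOT(arg);
    GC_PUSH_ROOT(tmp);

    tmp = obj_retain(arg);   /* a second reference to the same object */
    seen = tmp->refcount;
    obj_release(tmp);        /* a missed release leaks the object */
    tmp = NULL;

    GC_POP_ROOTS(2);         /* a wrong count here corrupts the root stack */
    return seen;
}
```

each push, pop, retain and release is a place where a mistake goes unnoticed until much later, which is the kind of uncertainty described above.
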
as for the interpreter, I decided against using bytecode, instead making use
of a direct interpreter. the advantage of this, beyond a simpler
interpreter, is that it makes it much easier to provide powerful
macro-processing abilities and maintain a high level of flexibility in
terms of code structure, which is useful for a compiler (one can have things
like first-class macros, ...).

in the initial, partly-written interpreter, many of the special forms were
provided by builtins.

likewise, macro-expansion will be performed inline with interpretation, and
both will share the same binding environments (something similar is likely
to also be the case for the IL functions being compiled).

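a minimal sketch of that arrangement, assuming a tiny list-based evaluator: macros and primitives are ordinary values bound in the same environment as variables, and a macro call is expanded at the moment the evaluator reaches it. all names here (Val, eval, bind, make_env, the "double" macro) are invented for illustration and say nothing about how SXIL itself is written; nothing is ever freed:

```c
#include <stdlib.h>
#include <string.h>

typedef struct Val Val;
typedef Val *(*Fn)(Val *args, Val *env);
typedef enum { T_NUM, T_SYM, T_CONS, T_PRIM, T_MACRO } Tag;
struct Val { Tag tag; long num; const char *sym; Val *car, *cdr; Fn fn; };

static Val *mk(Tag tag)
{
    Val *v = calloc(1, sizeof *v);
    if (!v) abort();
    v->tag = tag;
    return v;
}
Val *num(long n) { Val *v = mk(T_NUM); v->num = n; return v; }
Val *sym(const char *s) { Val *v = mk(T_SYM); v->sym = s; return v; }
Val *cons(Val *a, Val *d) { Val *v = mk(T_CONS); v->car = a; v->cdr = d; return v; }
Val *fnval(Tag tag, Fn fn) { Val *v = mk(tag); v->fn = fn; return v; }

/* the environment is a list of (symbol . value) pairs */
Val *bind(const char *name, Val *value, Val *env)
{
    return cons(cons(sym(name), value), env);
}

static Val *lookup(const char *name, Val *env)
{
    for (; env; env = env->cdr)
        if (strcmp(env->car->car->sym, name) == 0)
            return env->car->cdr;
    abort();   /* unbound symbol */
}

Val *eval(Val *x, Val *env);

/* evaluate each element of an argument list, left to right */
static Val *eval_list(Val *xs, Val *env)
{
    Val *head;
    if (!xs) return NULL;
    head = eval(xs->car, env);
    return cons(head, eval_list(xs->cdr, env));
}

Val *eval(Val *x, Val *env)
{
    Val *head;
    if (!x) return NULL;
    if (x->tag == T_SYM) return lookup(x->sym, env);
    if (x->tag != T_CONS) return x;   /* numbers, primitives, macros */
    head = eval(x->car, env);         /* the operator is an ordinary value */
    if (head->tag == T_MACRO)         /* expand inline, then keep evaluating */
        return eval(head->fn(x->cdr, env), env);
    if (head->tag == T_PRIM)
        return head->fn(eval_list(x->cdr, env), env);
    abort();   /* not callable */
}

/* primitive: sum of already-evaluated numeric arguments */
static Val *prim_add(Val *args, Val *env)
{
    long total = 0;
    (void)env;
    for (; args; args = args->cdr) total += args->car->num;
    return num(total);
}

/* macro: rewrites (double e) into (+ e e) without evaluating e */
static Val *macro_double(Val *args, Val *env)
{
    (void)env;
    return cons(sym("+"), cons(args->car, cons(args->car, NULL)));
}

/* one environment holds the primitive, the macro and a variable */
Val *make_env(void)
{
    Val *env = NULL;
    env = bind("+", fnval(T_PRIM, prim_add), env);
    env = bind("double", fnval(T_MACRO, macro_double), env);
    env = bind("x", num(21), env);
    return env;
}
```

evaluating the list (double x) in the environment from make_env() expands to (+ x x) and yields 42. since the operator position is itself evaluated, a macro can be passed around or rebound like any other value, which is what makes it first-class in this sketch.
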
more so, a lot of Scheme I will probably leave out, and there is a lot I
will probably add (such as all the compiler-specific machinery, ...).
so, SXIL will be, technically, a different language than Scheme, but they
will at least be "similar".