Re: Is This a Dumb Idea? paralellizing byte codes

anton@mips.complang.tuwien.ac.at (Anton Ertl)
Fri, 28 Oct 2022 17:06:55 GMT

          From comp.compilers


From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: Fri, 28 Oct 2022 17:06:55 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 22-10-046 22-10-048 22-10-056 22-10-059
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="24437"; mail-complaints-to="abuse@iecc.com"
Keywords: interpreter, optimize
Posted-Date: 30 Oct 2022 00:50:20 EDT

Alain Ketterlin <alain@universite-de-strasbourg.fr> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>
>> Alain Ketterlin <alain@universite-de-strasbourg.fr> writes:
>>>I've heard/read several times that byte-code micro-optimizations are not
>>>worth the trouble.
...
>This is not directly related to the paper I mention later. I was talking
>about optimizing bytecode vs. compiler optimizations. I know of no
>interpreter doing elaborate static byte-code optimization.


If I understand you correctly, you mean optimizations that the
compiler that generates "byte code" performs, e.g., stuff like partial
redundancy elimination.


I expect that these optimizations are as effective for virtual machine
code as for native (i.e., real-machine) code, but if you want to go to
these lengths, you use a native-code compiler. And for systems that
use a JIT compiler (i.e., a two-stage process: source -> VM (aka byte
code) -> native code), the preferred place for putting these
optimizations is in the second stage (probably because it enables
optimization decisions with consideration of the target machine).
There have been some efforts to perform analysis at the source-code
level (or anyway, before JIT compilation), and embed the results as an
optional component in the .class file to speed up JIT compilation, but
has this made it into production systems?


Otherwise: I dimly remember optimizations by Prolog compilers that
generate WAM (Warren abstract machine) code.


>>>https://ieeexplore.ieee.org/document/7054191


https://hal.inria.fr/hal-01100647/document


>I'm glad it works for you.


What's "it"? Anyway, you miss the point: The paper suggests that one
should just write a switch-based interpreter and that more advanced
techniques are no longer needed. My results disprove this, on the
same hardware that they base their claims on. Branch mispredictions
may play a smaller role now than they used to, but apparently there
are other reasons that make the more advanced techniques still very
profitable.
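
For readers unfamiliar with the two dispatch styles being compared,
here is a minimal sketch on a hypothetical toy stack VM. The opcodes,
function names and demo program are invented for illustration and are
not taken from the paper or from any measured interpreter. The second
function uses the labels-as-values extension of GNU C (available in
GCC and Clang), so it is not standard C. It dispatches through a table
of label addresses; direct threaded code would go one step further and
store those addresses in the VM code itself.

```c
/* Hypothetical toy VM; opcodes and encoding are assumptions of this sketch. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Demo program: (2 + 3) * 4 */
const long demo_prog[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Switch-based dispatch: every VM instruction returns to the one
   shared switch, which typically compiles to a single indirect branch. */
long run_switch(const long *code)
{
    long stack[16];
    long *sp = stack;            /* points to the next free slot */
    const long *ip = code;

    for (;;) {
        switch (*ip++) {
        case OP_PUSH: *sp++ = *ip++;            break;
        case OP_ADD:  sp--; sp[-1] += sp[0];    break;
        case OP_MUL:  sp--; sp[-1] *= sp[0];    break;
        case OP_HALT: return sp[-1];
        default:      return -1;                /* unknown opcode */
        }
    }
}

/* Threaded-style dispatch: every VM instruction ends with its own
   indirect jump (NEXT), so each instruction gets a separate branch. */
long run_threaded(const long *code)
{
    /* Table order must match the enum above. */
    static void *const labels[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[16];
    long *sp = stack;
    const long *ip = code;

#define NEXT goto *labels[*ip++]
    NEXT;
do_push: *sp++ = *ip++;          NEXT;
do_add:  sp--; sp[-1] += sp[0];  NEXT;
do_mul:  sp--; sp[-1] *= sp[0];  NEXT;
do_halt: return sp[-1];
#undef NEXT
}
```

Both functions should compute the same value for demo_prog. The sketch
only shows the structural difference in dispatch; it says nothing
about how large the speed difference is on any particular hardware.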


This was somewhat surprising to me, too. We also did some work with
simulations of more advanced branch predictors in this context
[ertl&gregg03jilp], so I expected the performance benefits of our
advanced techniques to diminish significantly once the hardware
acquired such predictors, but I never really saw that happen. And
that's even on hardware that has very good indirect branch prediction
(as Rohou et al. showed).


@Article{ertl&gregg03jilp,
    author = {M. Anton Ertl and David Gregg},
    title = {The Structure and Performance of \emph{Efficient}
                                    Interpreters},
    journal = {The Journal of Instruction-Level Parallelism},
    year = {2003},
    volume = {5},
    month = nov,
    url = {http://www.complang.tuwien.ac.at/papers/ertl%26gregg03jilp.ps.gz},
    url2 = {http://www.jilp.org/vol5/v5paper12.pdf},
    note = {http://www.jilp.org/vol5/},
    abstract = {Interpreters designed for high general-purpose
                                    performance typically perform a large number of
                                    indirect branches (3.2\%--13\% of all executed
                                    instructions in our benchmarks). These branches
                                    consume more than half of the run-time in a number
                                    of configurations we simulated. We evaluate how
                                    accurate various existing and proposed branch
                                    prediction schemes are on a number of interpreters,
                                    how the mispredictions affect the performance of the
                                    interpreters and how two different interpreter
                                    implementation techniques perform with various
                                    branch predictors. We also suggest various ways in
                                    which hardware designers, C compiler writers, and
                                    interpreter writers can improve the performance of
                                    interpreters.}
}


- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

