Re: General byte-codes reference

anton@mips.complang.tuwien.ac.at (Anton Ertl)
31 Dec 2000 03:03:02 -0500

          From comp.compilers

From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: 31 Dec 2000 03:03:02 -0500
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 00-12-030 00-12-073
Keywords: interpreter
Posted-Date: 31 Dec 2000 03:03:01 EST

  Norman Culver <Norman_member@newsguy.com> writes:
>It is possible to fit an entire interpreter into the L1
>cache (64 KB) of a 1 Ghz AMD but it won't fit into the 16 KB cache of
>a Pentium III.


That depends on the interpreter. E.g., the Gforth engine for the 386
architecture currently uses 16238 bytes. It contains more than 300
primitives; only a few of these are used frequently, see
http://www.complang.tuwien.ac.at/forth/peep/:


cumulative dynamic
executions           primitives

   90%                   39
   99%                   77
   99.9%                104
  100%                  152


I.e., 77 primitives account for 99% of the dynamic executions of
primitives, and in the benchmarks that this data is based on, only 152
of the primitives were executed at all. So the frequently-executed
part of the interpreter easily fits into 16KB.
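
As an illustration of how such a cumulative table can be derived from
per-primitive execution counts, here is a minimal Python sketch. The
function name and the counts are hypothetical; they are not Gforth
measurements and this is not the script used to produce the numbers
above.

```python
from itertools import accumulate

def primitives_for_coverage(counts, thresholds):
    """For each threshold (a fraction in (0, 1]), return the smallest number
    of primitives whose execution counts together reach that share of all
    dynamic executions."""
    ordered = sorted(counts.values(), reverse=True)  # most-executed first
    total = sum(ordered)
    running = list(accumulate(ordered))              # cumulative executions
    result = {}
    for t in thresholds:
        # first position at which the cumulative count reaches t * total
        result[t] = next(i + 1 for i, c in enumerate(running) if c >= t * total)
    return result

# Hypothetical counts for illustration only; "f+" is never executed.
example_counts = {"lit": 500, "@": 300, "+": 150, "!": 40,
                  "branch": 9, "getrusage": 1, "f+": 0}
coverage = primitives_for_coverage(example_counts, [0.90, 0.99, 1.0])
```

With these made-up counts, primitives that are never executed do not
contribute to the 100% row, just as only 152 of the more than 300
Gforth primitives appear in the table.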


Note that this data is cumulated over three benchmarks; the working
set for a stretch of time in one benchmark run will be biased towards
even fewer primitives.
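
The effect can be sketched in Python by counting the distinct
primitives in windows of an execution trace. The trace below is a
made-up two-phase example, and the helper name is hypothetical.

```python
def distinct_per_window(trace, window):
    """Number of distinct primitives in each consecutive,
    non-overlapping window of the trace."""
    return [len(set(trace[i:i + window]))
            for i in range(0, len(trace), window)]

# Hypothetical trace with two phases that use different primitives.
trace = ["lit", "+", "@"] * 4 + ["dup", "!", "branch"] * 4

whole = len(set(trace))                      # distinct primitives overall
per_window = distinct_per_window(trace, 12)  # distinct primitives per phase
```

Here the whole trace touches six primitives, but each window of 12
executions touches only three of them.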


Now, some of the primitives call library routines, and you might want
to include them in the interpreter size; however, there are only two
primitives in the top 99% that do calls (to long division routines and
to fwrite), and these account for <0.6% of the executed primitives.


You may wonder what the other primitives are for: Many are for stuff
that you can do only as primitives in Gforth, even though they may be
rarely used, like performing the getrusage system call; many others
are for stuff that is not used in these benchmarks, e.g., FP.


Currently I cannot offer performance counter results for Gforth on the
Pentium-III, but the timings I have done indicate that the Pentium-III
is about as fast on Gforth as an Athlon of similar clock frequency,
for both small and large benchmarks. So the larger L1 cache does not
seem to give an advantage to the Athlon.


Taking a look at other people's work, [romer+96] show I-cache miss
cycles on a 21064 (8KB I-cache) for several interpreters; for the
MIPSI interpreter the I-cache miss cycles are less than 5% of the
total cycles for all benchmarks.


>The choice of byte codes is highly dependent upon the CPU architecture


Why do you think so?


- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

