Re: Help needed: Code generation for CASE/SWITCH statements

Henry Spencer <henry@zoo.toronto.edu>
15 Dec 1997 22:00:48 -0500

          From comp.compilers

Related articles
[9 earlier articles]
Re: Help needed: Code generation for CASE/SWITCH statements wi534@victoria.tc.ca (William A. Barath) (1997-12-07)
Re: Help needed: Code generation for CASE/SWITCH statements sethml@ugcs.caltech.edu (1997-12-07)
Re: Help needed: Code generation for CASE/SWITCH statements henry@zoo.toronto.edu (Henry Spencer) (1997-12-10)
Re: Help needed: Code generation for CASE/SWITCH statements a_s_t_o_r@guardian.no (Alexander Kjeldaas) (1997-12-10)
Re: Help needed: Code generation for CASE/SWITCH statements vonbrand@inf.utfsm.cl (Horst von Brand) (1997-12-12)
Re: Help needed: Code generation for CASE/SWITCH statements wi534@victoria.tc.canada (William A. Barath) (1997-12-12)
Re: Help needed: Code generation for CASE/SWITCH statements henry@zoo.toronto.edu (Henry Spencer) (1997-12-15)
Re: Help needed: Code generation for CASE/SWITCH statements dietz@interaccess.com (Paul Dietz) (1997-12-16)
Re: Help needed: Code generation for CASE/SWITCH statements kanze@gabi-soft.fr (1997-12-16)
Re: Help needed: Code generation for CASE/SWITCH statements hayes@epigram.com (Raymond Hayes) (1997-12-17)
Re: Help needed: Code generation for CASE/SWITCH statements wi534@victoria.tc.canada (William A. Barath) (1997-12-23)
| List of all articles for this month |
From: Henry Spencer <henry@zoo.toronto.edu>
Newsgroups: comp.compilers,comp.lang.c.moderated
Date: 15 Dec 1997 22:00:48 -0500
Organization: SP Systems, Toronto
References: <clcm-19971204-0012@plethora.net> 97-12-051 97-12-065 97-12-100
Keywords: optimize, architecture

William A. Barath <wi534@victoria.tc.canada> wrote:
>|In particular, modern high-speed machines really hate conditional
>|branches...
>
>How so? If the code is called repeatedly (iterative model, one which
>is _worth_ optimising) then the CPU's instruction cache is very likely
>to keep all the relevant data cached, and branch instructions can be
>executed in 1 clock or less...


On ten-year-old first-generation RISC machines, perhaps. Not on
*modern* *high-speed* machines. Modern high-speed machines do heavy
pipelining and prefetching, so the impact of a conditional branch is
either roughly zero when the machine guesses right, or quite
substantial -- a pipeline flush and refill -- when it guesses wrong.
Various tricks are employed to reduce the impact, but a decision tree
is just about the worst possible case: a stream of data-dependent
decisions made in fast succession, with little or no intervening
computation to let the pipelines recover. Unless the program is
constantly hitting the same choice, and the machine is smart enough to
exploit this, some pipeline breaks are almost guaranteed.


The two-level table I mentioned does require memory references, but if
it is heavily used -- the situation we care about -- and the range of
selector values is not too enormous, the first-level table is likely
to be in the cache, and possibly even some hot spots in the
second-level tables will be. True, there is a performance price for
even cached memory references, but there is also a performance price
for pipeline breaks, which this approach avoids. The absence of
conditional branches also makes it easier for later optimizations to
spread the table-lookup code out in hopes of hiding the memory-access
delays.


Which is better? Depends on the machine. But the tradeoff needs to
be analyzed, not just dismissed. The two-level table approach can be
quite competitive.


>Using a table absolutely requires more
>than one clock, and given sparse indexes means a large performance hit
>both in code bloat (size of executable) as well as cache load hits.


The point of using a two-level table (essentially a simple trie -- a
first-level table, indexed by a subfield of the case selector,
containing pointers to the second-level ones, which are indexed by the
rest) is that a sparse two-level table does not have to be large, and
hence difficult to cache, because the second-level tables can be
shared. In particular, many of the first-level-table entries
typically will point to the same second-level table, the one that
consists entirely of "take the default choice" entries.
--
| Henry Spencer
| henry@zoo.toronto.edu
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.