Re: Selective Computation...

Andreas Gieriet <andreas.gieriet@externsoft.ch>
21 Jan 2003 00:03:02 -0500

          From comp.compilers

Related articles
Selective Computation... saru@jeyan.eclipse.co.uk (Saru) (2002-12-26)
Re: Selective Computation... jeyan@jeyan.eclipse.co.uk (Saru) (2002-12-30)
Re: Selective Computation... saru@jeyan.eclipse.co.uk (Saru) (2002-12-30)
Re: Selective Computation... joachim_d@gmx.de (Joachim Durchholz) (2002-12-31)
Re: Selective Computation... thp@cs.ucr.edu (2003-01-17)
Re: Selective Computation... Patrick.Volteau@st.com (Patrick Volteau) (2003-01-20)
Re: Selective Computation... strohm@airmail.net (John R. Strohm) (2003-01-21)
Re: Selective Computation... andreas.gieriet@externsoft.ch (Andreas Gieriet) (2003-01-21)
Re: Selective Computation... liekweg@freenet.de (Florian Liekweg) (2003-01-21)
Re: Selective Computation... marcov@toad.stack.nl (Marco van de Voort) (2003-02-06)
| List of all articles for this month |
From: Andreas Gieriet <andreas.gieriet@externsoft.ch>
Newsgroups: comp.compilers
Date: 21 Jan 2003 00:03:02 -0500
Organization: eXternSoft GmbH
References: 02-12-116 03-01-077
Keywords: architecture, performance
Posted-Date: 21 Jan 2003 00:03:01 EST

thp@cs.ucr.edu schrieb:
>
> Saru <saru@jeyan.eclipse.co.uk> wrote:
> + I have a computation which evaluates X as
> +
> + X=k1*(p+q)+k2*(p+q')+k3*(p'+q);
> +
> + where k1, k2, k3, p, q , p', and q' are variables (not constants).
> + Though "p", "q", "p'" and "q'" can take any arbitrary value,
> + k1, k2 and k3 can take either 0 or 1. Again, within k1, k2 and k3 only
> + one of them can be 1 at any given time. I am wondering there exists any
> + method to speedup/optimize this computation (in a way that could be done
> + on all architectures/processors). This computation appears inside a
> + loop and clearly I don't want to use any if-then-else.
>
> What's wrong with if-then-else, perhaps, in the form:
>
> X = ( k1 ? p+q : k2 ? p+q' : k3 ? p'+q : assert(0) );
>
> Tom Payne
> [On modern architectures, conditional branches can be slow. -John]


I think this would work (c-code):
    /*
      * assumption: k1,k2,k3 = one-hot (always exactly one is 1, the rest is 0)
      */
    register int m2 = -k2;
    register int m3 = -k3;
    register int x = ((p & ~m3) | (pt & m3))
                                  + ((q & ~m2) | (qt & m2));


One could even get rid of -k2 and -k3 if m2 and m3 is provided directly.


How to get there (c-code, assuming int):
    x = (-k1) & (p + q) + (-k2) & (p + qt) + (-k3) & (pt + q);


Factored out:
    x = ((-k1) & p) + ((-k1) & q)
        + ((-k2) & p) + ((-k2) & qt)
        + ((-k3) & pt) + ((-k3) & q);


Re-arranged:
  x = (p & (-k1)) + (p & (-k2))
      + (q & (-k1)) + (q & (-k3))
      + (pt & (-k3))
      + (qt & (-k2));


Combine:
  x = (p & ((-k1)|(-k2)) /* '+'->'|' follows from one-hot condition */
      + (q & ((-k1)|(-k3)) /* '+'->'|' follows from one-hot condition */
      + (pt & (-k3))
      + (qt & (-k2));


Replacing complexer terms:
  x = (p & (-!k3)) /* 'k1 or k2' is equal to 'not k3' */
      + (q & (-!k2)) /* 'k1 or k3' is equal to 'not k2' */
      + (pt & (-k3))
      + (qt & (-k2));


Simplyfy:
  x = (p & ~(-k3))
      + (q & ~(-k2))
      + (pt & (-k3))
      + (qt & (-k2));


Prevent multiple re-calculations:
  register int m2 = -k2;
  register int m3 = -k3;
  x = (p & ~m3)
      + (q & ~m2)
      + (pt & m3)
      + (qt & m2);


Further simplify:
  register int m2 = -k2;
  register int m3 = -k3;
  register int x = ((p & ~m3) | (pt & m3)) /* '+'->'|' follows from one-hot*/
                                + ((q & ~m2) | (qt & m2));/* '+'->'|' follows from one-hot*/




Which leads to the suggested solution.


Cheers,


Andi


--
Andreas Gieriet mailto:andreas.gieriet@externsoft.ch
eXternSoft GmbH http://www.externsoft.com/
Zurlindenstrasse 49 phone:++41 1 454 3077
CH-8003 Zurich/SWITZERLAND fax: ++41 1 454 3078


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.