Generating optimal code from a DAG

Michael Weitzel <weitzel@simtec.mb.uni-siegen.de>
11 Apr 2005 00:19:00 -0400

          From comp.compilers

Related articles
Generating optimal code from a DAG weitzel@simtec.mb.uni-siegen.de (Michael Weitzel) (2005-04-11)
Re: Generating optimal code from a DAG touati@prism.uvsq.fr (TOUATI Sid) (2005-04-16)
Re: Generating optimal code from a DAG tmk@netvision.net.il (2005-04-16)
| List of all articles for this month |

From: Michael Weitzel <weitzel@simtec.mb.uni-siegen.de>
Newsgroups: comp.compilers
Date: 11 Apr 2005 00:19:00 -0400
Organization: Lehrstuhl =?ISO-8859-1?Q?f=FCr?= Simulationstechnik, =?ISO-8859-1?Q?Universit=E4t?= Siegen, FB11/FB12
Keywords: code, optimize
Posted-Date: 11 Apr 2005 00:19:00 EDT

Hi,


I am currently working on a code generator for the Intel x86 FPU
(x>=3) which generates code for single expressions which can be very
large and highly redundant (in terms of common subexpressions). My
code generator creates a DAG from the expression tree to eliminate
common subexpressions. The DAG is split into trees and code is
generated for common subtrees (using the algorithm from Bruno and
Lasagne to obtain optimal stack-machine-code). After that, some
peephole optimization replaces (only) pairs of opcodes by single
opcodes using the Intel's opcodes.


I understand that the generated code is not optimal:
  - peephole optimization frees up to one level of the stack that could
      be used to generate slightly better code for the sub-trees
  - results of sub-tree computations could be stored in the FPU's stack
      (the Intel FPU can access every level of the stack at any time) -
      instead I always save them to local variables (in the function's
      stack frame).
  - the FPU-stack is not working to capacity when switching from one
      sub-tree to the next and the computation of a sub-tree always starts
      on an empty stack.
  - visiting the DAG nodes in topological order leaves some option to
      delay or prefer code generation for nodes on the same topological
      "level"


In the texts I read, these problems (and probably others I forgot) are
usually summarized (or "slayed"?) in a sentence like "Code generation
from DAG is NP-complete.". ;-)


I haven't seen any proof or exact description of the problem yet. What
makes code generation from DAG NP-complete? I have some ideas -- but how
does an optimal algorithm work?


Are there better algorithms than splitting a DAG into trees? Perhaps I
can effort using some NP-complete procedure on parts of a DAG
(heuristic)?!


Thanks!
--
Michael Weitzel


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.