Re: optimization data structures, was alternatives to Java byte-codes


From: David Chase <chase@world.std.com>
Newsgroups: comp.compilers
Date: 10 Feb 1999 16:04:45 -0500
Organization: NaturalBridge LLC
References: 99-02-023 99-02-029
Keywords: analysis

Thomas Kistler wrote:


> I strongly have to object. AST are much more convenient for
> optimizations since all the high-level information about loops, array
> accesses, type information, etc. is available in AST's. However, in
> byte-codes all this information gets lost in the translation process
> to byte-codes.


The difficulty with this line of reasoning is that it only
holds for a certain amount of optimization of certain languages.
Adding irreducible control flow (either introduced via "goto" or
by tail-call elimination) means that you have control flow
information that is not directly reflected in the AST structure.
Your compiler could punt, of course, or it could attempt an
analysis to recover the information from the lower-level control
flow. However, that recover-the-information step can be equally
well applied to bytecodes, and it isn't that difficult.


> Therefore it has to be reconstructed from the low-level
> structure which can be extremely hard and time-consuming in many
> cases.


No. First, in the case of reducible control flow (what is
expressed in the AST) the analysis is quite easy and near-linear
(I'm being vague here because I haven't paid careful attention
to the latest-and-greatest algorithms). Second, I don't think
it's that hard even in the irreducible case. Here's a general
sketch of how you might do it (and I am surely reinventing some
wheel, probably Sharir's wheel, now that I think about it):


    1. build a dominator tree D from the control-flow graph.


    1a. annotate each node with its in-and-out numbers in a
            left-to-right walk.


            int visit(node n, int i) {
                n.in := i;
                i := i + 1;
                for children c of n do {
                    i := 1 + visit(c, i);
                }
                n.out := i;
                return i;
            }


            You'll use this later, to determine the ancestor relationship
            between pairs of nodes:


            boolean is_ancestor_of(n1,n2) {
                return n1.in <= n2.in && n2.out <= n1.out;
            }
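
            The numbering walk and ancestor test above can be sketched as
            runnable code. This is my own illustrative Python (the `children`
            map and node names are hypothetical), using an explicit stack in
            place of the recursion:

```python
def number_tree(children, root):
    """Assign in/out numbers in one left-to-right walk of a tree.

    children: dict mapping each node to a list of its child nodes.
    Returns (tin, tout) dicts; a is an (inclusive) ancestor of b
    exactly when tin[a] <= tin[b] and tout[b] <= tout[a].
    """
    tin, tout = {}, {}
    counter = 0
    tin[root] = counter
    counter += 1
    # Explicit stack of (node, iterator-over-children) pairs.
    stack = [(root, iter(children.get(root, ())))]
    while stack:
        node, it = stack[-1]
        child = next(it, None)
        if child is None:
            # All children visited: record the out-number and retreat.
            stack.pop()
            tout[node] = counter
            counter += 1
        else:
            # First visit to this child: record its in-number and descend.
            tin[child] = counter
            counter += 1
            stack.append((child, iter(children.get(child, ()))))
    return tin, tout

def is_ancestor_of(tin, tout, n1, n2):
    """n1 is an ancestor of n2 iff n2's interval nests inside n1's."""
    return tin[n1] <= tin[n2] and tout[n2] <= tout[n1]
```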


    2. consider the back-edges in the original graph and
          how they are arranged among the nodes of the dominator
          tree. For each back-edge (source, sink), compute the
          near-common-ancestor A in D of source and sink. This
          is the loop header for that backedge. Note given a
          recursion-converted-to-iteration walk of the tree,
          the spine of the tree from root to source is always
          available, and it can be binary-searched for the
          NCA of some other node:


          (this binary search NOT checked for fencepost errors)
          int nca_index(node[] spine, node x) {
                int lo := 0; // known to be an ancestor
                int hi := spine.length; // known not to be an ancestor
                while (hi - lo > 1) {
                      int mid := (hi+lo)/2;
                      node mid_node := spine[mid];
                      if is_ancestor_of(mid_node, x) lo := mid;
                      else hi := mid;
                }
                return lo;
          }
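
          In the same illustrative Python (spine, tin, and tout as produced
          by a step-1 numbering; all names hypothetical), the binary search
          looks like this. It works because the spine nodes that are
          ancestors of x always form a prefix of the spine:

```python
def nca_on_spine(spine, tin, tout, x):
    """Binary-search the root-to-source spine for the nearest common
    ancestor of x and the spine's last node.

    spine: list of nodes from the root down to the back-edge source.
    tin/tout: DFS in/out numbers for the dominator tree.
    """
    def is_ancestor(a, b):
        return tin[a] <= tin[b] and tout[b] <= tout[a]

    lo, hi = 0, len(spine)   # spine[0] (the root) is always an ancestor
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_ancestor(spine[mid], x):
            lo = mid         # mid is still an ancestor of x: search deeper
        else:
            hi = mid         # mid is too deep: search shallower
    return spine[lo]
```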


3. At this point, given loop headers and corresponding sets
      of back-edges, you can make an initial sorting as to which
      loops nest within other loops. If loop A is nested within
      loop B, then the header for B must dominate A; this is a
      necessary, but not sufficient, condition. To identify
      the nodes within a given "loop" it is sufficient simply
      to walk backwards (up control-flow predecessors) from
      back-edges until the corresponding header is reached.
      To avoid re-scanning the same nodes over and over again,
      start scanning from the deepest loop headers in the
      dominator tree; this will include all the innermost loops.
      When scanning for the nodes of a less-inner loop C, if an
      already-marked node (from a different loop D) is encountered,
      then D is nested within C, and it is sufficient to skip
      to the header node of D and resume the search from its
      predecessors (which may, in turn, also be already marked).
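
      The backward walk for a single header can be sketched as follows
      (illustrative Python; `preds` maps each node to its control-flow
      predecessors, and the inner-loop skipping described above is
      deliberately omitted for brevity):

```python
def natural_loop(preds, header, back_edge_sources):
    """Collect the body of the natural loop for `header`: walk
    control-flow predecessors backwards from each back-edge source,
    stopping whenever the header (or an already-seen node) is reached.
    """
    body = {header}
    work = [s for s in back_edge_sources if s not in body]
    body.update(work)
    while work:
        n = work.pop()
        for p in preds.get(n, ()):
            if p not in body:
                body.add(p)
                work.append(p)
    return body
```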


      This rescanning and skipping is the only potentially expensive
      part of this algorithm, and
      (a) there's probably a better way.


      (b) I think there is a forwarding algorithm that might
              be employed from the inner headers that will let you
              skip out quickly.


      (c) The sort of control-flow that makes this go badly wrong
              is rare (it requires extreme loop nesting).


If you look, I think you'll find that, ignoring the node rescanning
used to determine loop nesting, the rest is order


      min( #edges, #nodes * log (#nodes))


(at least, I think I have that right) and that there are no
horrible constant factors lurking in there.


> Needless to say that high level information is very important
> for aggressive optimizations.


Among the aggressive optimizations that come to mind, the
only one where source-level information is useful is anti-aliasing
of array bounds in Fortran. There, it is possible to assert that


      A(I,J) notalias A(M,N) if either I != M or J != N


When reduced to a low level, this is an arithmetic mess that is
much harder to decipher. But, that is the only example that comes
to mind.
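
For concreteness, here is my own Python sketch (hypothetical
column-major layout, 4-byte elements) of why the low-level form is
messier: the source-level fact that distinct in-range subscripts name
distinct elements turns into a disequality over linearized addresses.

```python
# A hypothetical Fortran array A(N1, N2): column-major storage,
# 4-byte elements, 1-based subscripts.  Illustrative only.
def addr(base, i, j, n1, elem=4):
    """Linearized address of A(i, j) when A's leading dimension is n1."""
    return base + ((i - 1) + (j - 1) * n1) * elem

# At source level, A(I,J) and A(M,N) are distinct whenever I != M or
# J != N (assuming in-range subscripts).  At the address level the
# compiler sees only the linear forms above, and must instead prove
#     (I - M) + (J - N) * n1 != 0
# -- a symbolic-arithmetic question, which is harder to decipher.
```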


And, of course, if you are doing aggressive optimization, you can
afford to recover the structural information, and will probably do
so anyway since the code may contain things like redundant induction
variables.


> >AST's on the other hand can be easier to generate and interpret.
>
> I have to object again. AST's are not very convenient for interpretation
> because AST's usually consume a lot more memory than simple byte-codes.
> In many cases in which interpretation is chosen over execution of native
> code, memory plays a very important role.


I'm not convinced of this, but I don't have a ready example of
why ASTs would be an acceptable form for interpretation; I just
think it might work not-so-badly, given careful design.


> Here again, streaming is an orthogonal issue to your choice of
> intermediate representation. If you transfer a byte-code program in
> the order of declaration and the first method invokes the last method
> then you have to download the whole class file before you can execute
> the program. AST's can very well be designed for streaming. You can
> commence execution of an AST if the important sub-trees are transfered
> first.


Assuming, of course, that you have a language semantics that
allows partial execution of perhaps incomplete, badly-formatted,
or verification-failing methods. Java's a little loose on binding
times, but beginning execution of a method before the entire method
has been seen is very late binding indeed (though in principle, not
that different from beginning execution of a program before all
the methods have been loaded). The point is that streaming may not
be much of a choice.


I also think it is entirely possible to stream bytecodes; some
optimizing compilers, to some extent, attempt to do this for
instruction caches by placing the expected-to-execute path in
a straight line and all the rarely executed stuff (throwing of
exceptions, for instance) out in the boondocks.


David Chase
NaturalBridge LLC

