Re: Why Virtual Machines? (was: C++ -> Java VM compiler)

haahr@netcom.com (Paul Haahr)
7 Feb 1997 23:31:08 -0500

          From comp.compilers


From: haahr@netcom.com (Paul Haahr)
Newsgroups: comp.compilers
Date: 7 Feb 1997 23:31:08 -0500
Organization: NETCOM On-line services
References: <01bbfca0$a284a6f0$041b6682@tecel> 97-01-120 97-01-139 97-01-225 97-02-016
Keywords: architecture, Java

Norman Ramsey <nr@adder.cs.virginia.edu> wrote:
> Within ten days we've heard that Java bytecodes are so much like
> modern machines that it's easy to generate machine code on the fly,
> and so much like the source code that source-level analyses are easy.


My categorization is that Java bytecodes are close enough to real
hardware that it's easy to generate poor-quality machine code quickly,
and close enough to source that a high-quality bytecode-to-native
compiler has to do approximately as much work as a native compiler
for a source language like Java does after parsing.


The question I would raise is ``What information is lost, what is
preserved, and what is added when compiling from Java source text to
JVM classfiles?''


Off the top of my head, what is lost is:


    - comments
    - line numbers and local variable names (though compilation for
        debugging leaves these in)
    - structured control flow


What is added is:


    - assignment of stack and local variable indices to local variables,
        which may merge variables (similar to register assignment for
        conventional compilers)
    - depth of stack and number of local variable slots are made concrete
    - operators (+) are made specific (iadd)
    - overloaded function calls are resolved to specific signatures
    - class names are resolved to fully qualified names


Almost everything else is preserved from the source. (Future
compilers might change the information more by, say, doing more
aggressive common subexpression elimination or loop unrolling than
found in Sun's javac.)


Significantly, type information is completely preserved. I'd cite
this as the fundamental reason why Java decompilers are as plentiful
as they are, while C decompilers are relatively uncommon beasts.
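

One way to see how much type information survives compilation is to ask
a running JVM what it knows about a class, since the reflection API
answers from the data in the class file. The sketch below assumes a
modern JDK; the names TypeInfoDemo, Point, fieldType and returnType are
made up for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class TypeInfoDemo {
    // Hypothetical example class; a static nested class keeps it free of
    // any compiler-generated outer-instance field.
    static class Point {
        int x;
        double y;
        String label;

        double scaledNorm(int scale) {
            return scale * Math.sqrt(x * x + y * y);
        }
    }

    // Returns the declared type of a field, as recorded in the class file.
    static String fieldType(Class<?> c, String name) {
        try {
            Field f = c.getDeclaredField(name);
            return f.getType().getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the declared return type of a method with the given parameters.
    static String returnType(Class<?> c, String name, Class<?>... params) {
        try {
            Method m = c.getDeclaredMethod(name, params);
            return m.getReturnType().getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fieldType(Point.class, "x"));
        System.out.println(fieldType(Point.class, "label"));
        System.out.println(returnType(Point.class, "scaledNorm", int.class));
    }
}
```

Running main prints int, java.lang.String and double. Field names,
field types and full method signatures are all recoverable without the
source, which is the raw material a decompiler needs.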


Note that the replacement of Java's high-level control structures (if,
while, try, etc) with branches and exception tables probably makes
good compilation of bytecode a little more difficult than compilation
of source. That is, many optimizations are easier when dealing with
structured control flow. For example, Brandis & Mössenböck's
method for generating the static single assignment form of structured
programs is much more pleasant than the classic Cytron, Ferrante, et
al. approach or Sreedhar & Gao's DJ-graph code.
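

To make the loss of structure concrete, here is a rough sketch of the
same loop in two forms. Java has no goto, so the second method emulates
branch-based code with an explicit program counter and a switch; that
emulation is my own stand-in for illustration, not what javac emits, and
the names (ControlFlowDemo, sumStructured, sumFlattened) are invented.

```java
class ControlFlowDemo {
    // Structured form: the loop header, body and exit are evident from
    // the syntax, so the points where values of sum and i merge are known
    // without any analysis.
    static int sumStructured(int n) {
        int sum = 0;
        int i = 1;
        while (i <= n) {
            sum += i;
            i++;
        }
        return sum;
    }

    // Branch form: each case is a basic block and each assignment to pc
    // is a jump. That block 0 is a loop header can only be discovered by
    // noticing the backward jump from block 1.
    static int sumFlattened(int n) {
        int sum = 0;
        int i = 1;
        int pc = 0;
        while (true) {
            switch (pc) {
                case 0: // loop test: conditional branch
                    pc = (i <= n) ? 1 : 2;
                    break;
                case 1: // loop body, then jump back to the test
                    sum += i;
                    i++;
                    pc = 0;
                    break;
                default: // exit block
                    return sum;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sumStructured(10));
        System.out.println(sumFlattened(10));
    }
}
```

Both calls print 55. A compiler handed the second form has to rebuild
the control-flow graph and find the loop before it can apply the
optimizations that the first form offers almost for free.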


So, despite bytecodes being ``closer'' to real machine code, source
may be easier for a high-quality compiler to work from. On the other
hand, techniques for structuring goto-based code are at least two
decades old, and still found in the scientific programming world.
Since there's a presumption that JVM code comes from a structured
language, trying those techniques would probably be profitable.


> My cursory look at the JVM spec reminded me an awful lot of Smalltalk,


The bytecodes may be Smalltalk-like, but the amount of type
information present in the class file is distinctly un-Smalltalkish.


> so I'm not ready to swallow either claim, but I would love to be
> convinced.


Since this is mostly an issue of definition and perspective, trying to
convince anyone is mostly a matter of rhetoric and not all that
interesting technically.
--

