Facts about the Java class file format

Markus Pilz <pilz@ifi.unizh.ch>
17 Oct 1998 01:59:35 -0400

          From comp.compilers


From: Markus Pilz <pilz@ifi.unizh.ch>
Newsgroups: comp.compilers
Date: 17 Oct 1998 01:59:35 -0400
Organization: Department of Computer Science, University of Zurich
Keywords: Java, comment

The wide acceptance of Java as a network programming language has made
the Java class file one of the most popular portable intermediate
program representations. Such a representation must be as small as
possible and still support interpretation and code generation well.
We have analyzed 4016 different class files for size and bytecode
usage, and we have found a number of interesting facts:

  o Java class files are small on average. We found that 50% take
      less than 2'000 bytes, 80% less than 6'000 and 95% less than 13'000
      bytes. However, we found that complete programs also contain a few
      files as large as 365'470 bytes. The size of the Java class file is
      important because the time to read the file is the dominant factor
      in the start-up time.

  o The biggest part of a Java class file is the constant pool (61%
      of the file), not the method pool, which accounts for only 33% of
      the file size. The other parts of the class file share the remainder.

  o About 32% of the file size consists of nonessential or debugging
      information, such as the name of the source file or the tables
      that map bytecode offsets to line numbers in the Java source
      code. The file can safely be reduced to about 70% of its size and
      remain fully functional.

  o On average, the bytecodes take only 12% of the class file, and
      only 18% of the class file stripped of its superfluous content.

  o The average size of an instruction is slightly less than 2 bytes.

  o A class typically uses 25 different instructions, and at most 113,
      out of the 212 that are defined.

  o The frequencies of the instructions used vary considerably. Five
      instructions each account for more than 5% of the occurrences in
      at least two programs.

  o The theoretical minimum average number of bits needed to encode the
      opcode is 4 bits instead of the 8 or 16 used today.
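
A figure like the 4 bits in the last item is presumably an entropy-style
bound; the post does not say how it was derived. As a minimal sketch of
that kind of calculation, the following computes the Shannon entropy of
an opcode histogram. The function name opcode_entropy_bits and both
sample distributions are made up for illustration and are not data from
the report.

```python
import math
from collections import Counter

def opcode_entropy_bits(opcodes):
    """Shannon entropy, in bits per opcode, of an observed opcode sequence."""
    counts = Counter(opcodes)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical sample: 16 distinct opcodes, all equally frequent.
# A uniform choice among 16 symbols needs exactly 4 bits each.
uniform = list(range(16)) * 10
print(opcode_entropy_bits(uniform))   # 4.0

# Hypothetical skewed sample: a few opcodes dominate, so the average
# number of bits per opcode drops well below log2(number of opcodes).
skewed = (["aload_0"] * 8 + ["getfield"] * 4
          + ["invokevirtual"] * 2 + ["return"] * 2)
print(opcode_entropy_bits(skewed))    # 1.75
```

The entropy is a lower bound for any code that encodes each opcode
independently, which is why a skewed usage profile leaves room below
the fixed 8 bits per opcode.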

This all indicates that the Java class file is far from an ideal
format. For a detailed description check out our technical report:

  Denis Antonioli, Markus Pilz. Analysis of the Java Class File Format.
  Technical report ifi-98.05, ifi, University of Zurich, April 1998
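
To get a rough feel for the constant pool share on one's own class
files, the entries can be walked directly, since every constant pool
entry starts with a one-byte tag that determines its length. This is
only a sketch under assumptions: the names constant_pool_bytes and
constant_pool_share are hypothetical, the count covers the entries only
(not the two-byte count field), and the report may well have drawn the
boundaries differently.

```python
import struct

# Payload size in bytes (after the 1-byte tag) of fixed-size entries.
FIXED_SIZES = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
               12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}

def constant_pool_bytes(data):
    """Return the number of bytes occupied by the constant pool entries."""
    magic, = struct.unpack_from(">I", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # Bytes 4..7 hold the minor and major version; the count follows.
    count, = struct.unpack_from(">H", data, 8)
    pos = 10
    index = 1                      # constant pool indices start at 1
    while index < count:
        tag = data[pos]
        if tag == 1:               # CONSTANT_Utf8: u2 length, then bytes
            length, = struct.unpack_from(">H", data, pos + 1)
            pos += 3 + length
        elif tag in FIXED_SIZES:
            pos += 1 + FIXED_SIZES[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        # Long and double constants occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    return pos - 10

def constant_pool_share(path):
    """Fraction of the file at `path` taken by constant pool entries."""
    with open(path, "rb") as f:
        data = f.read()
    return constant_pool_bytes(data) / len(data)
```

Calling constant_pool_share on a compiled class file (for example a
hypothetical Example.class) returns a fraction that can be compared
with the 61% quoted above.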



  email: pilz@ifi.unizh.ch            Markus Pilz, University of Zurich
  voice: +41-1-635 67 12              Department of Computer Science
  fax:   +41-1-635 68 09              Winterthurerstr. 190, CH-8057 Zurich
  www:   http://www.ifi.unizh.ch/~pilz
[Personally, I'm quite happy to have that debug stuff in there to help
figure out why my program doesn't work. We already have too many
undebuggable object formats. -John]
