Javad: A Java Class File Disassembler

This web page publishes javad, a Java class file disassembler. This program is published in source form. The javad program is written in Java (as Sun would say, 100% pure Java, compile once, run anywhere - pass the Kool-Aid).

Why would you be interested in javad?

The javad program is a tool for understanding the Java class file format.

Java class files contain most of the original symbol information that existed in the Java source. When compiling a Java class that references a class outside of the current file, the Java compiler uses the symbol information in the class file to resolve local references and to do semantic checking. For example, if the Java compiler is compiling the file MyClass.java, which imports the class FooClass, the Java compiler will look for symbol information in the class file FooClass.class.

Java source level debuggers, Java source browsers and, of course, Java virtual machines, must all read class files and build internal data structures on the basis of the class file information.
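
As a rough illustration of what reading a class file involves, here is a minimal sketch that reads only the fixed-size header at the start of a class file: the magic number, the minor and major version numbers, and the constant pool count. It is not taken from the javad source; the class name ClassHeaderSketch and its field names are hypothetical and were chosen just for this example.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of javad): reads the first ten bytes
// of a class file, as laid out in Chapter 4 of the JVM Specification.
final class ClassHeaderSketch {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassHeaderSketch(int minor, int major, int cpCount) {
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = cpCount;
    }

    static ClassHeaderSketch read(InputStream in) throws IOException {
        // Class files are big-endian, which is what DataInputStream reads.
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();                 // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        int minor = data.readUnsignedShort();       // u2 minor_version
        int major = data.readUnsignedShort();       // u2 major_version
        int cpCount = data.readUnsignedShort();     // u2 constant_pool_count
        return new ClassHeaderSketch(minor, major, cpCount);
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            try (InputStream in = new FileInputStream(name)) {
                ClassHeaderSketch h = read(in);
                System.out.println(name + ": version " + h.majorVersion + "."
                        + h.minorVersion + ", constant_pool_count "
                        + h.constantPoolCount);
            }
        }
    }
}
```

Everything after these ten bytes (the constant pool entries, fields, methods and attributes) is variable length, which is where most of the work in a class file reader goes.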

The javad program operates much like Sun's javap class file disassembler. That is, javad reads a class file and outputs a pseudo-Java declaration for the class that was the source for the class file. Unlike Sun's javap, the javad program is published in source form to serve as a reference for reading class files.

Software Download

The javad source code currently consists of about 4K lines of Java source and comments. It can be downloaded as a gzipped tar file by clicking here.

If you are using a Windows NT system and you don't have gzip and tar, you can download them here (note: I have not tested gzip/tar on Windows 95/98). This code is courtesy of Cygnus and is free software.

This software is not open source. It is covered by the following copyright. I discuss my reasons for using this copyright at length here.

Unpacking the software

  1. The source for javad will probably be placed in the destination directory as javad_tar.gz. Execute the command gzip -d javad_tar.gz. This will uncompress the file into javad_tar.

  2. To unpack the "tar" file execute the command tar xvf javad_tar. This will create the directory javad which contains the source tree for javad.

Building javad

One nice thing about Java is that it does not necessarily need a "Makefile" to build a source tree. To build javad, compile the file javad.java with your local Java compiler (e.g., Sun's javac or Microsoft's jvc). For example:

  javac javad.java
  jvc javad.java

The packages that are used by javad reference each other, so you may have to set your class path to refer to both the local directory and to the next directory up. Here is how I've set my class path:

CLASSPATH=.;..;E:\jdk1.2.2\jre\lib\rt.jar

Running javad

The javad program takes a list of one or more class files. To run javad enter:

   java javad.main MyClass.class

or

   jview javad.main MyClass.class

javad documentation

Software source code encapsulates information in the same way that a book does. In the case of software this information is designed to be read by both compilers and people. The challenge for the author of anything more than a trivial piece of software is to make the software design and implementation understandable to humans. No piece of software that is larger than a few hundred lines is "self documenting". This is a canard put forward by software engineers who don't want to spend the time to document their software source.

I hope that from the paragraph above it is clear that I believe strongly in documenting code. Over time I have come to believe that the implementation documentation should be included with the code itself and maintained with the code as comments. As the code changes, the documentation should change as well. Documentation is rarely complete (this is certainly true of the javad documentation) and documentation should be added to over time to explain pieces of code that seemed clear when they were written but were revealed as obscure when they were read weeks or months later.

In the case of javad, much of the documentation is contained in Chapter 4 of The Java Virtual Machine Specification, Second Edition, by Tim Lindholm and Frank Yellin, Addison-Wesley, 1999. This chapter specifies the Java Virtual Machine (JVM) class file format. The comments in the javad source discuss how the source code relates to this file format and discuss areas where I found the JVM Specification either incorrect or obscure.

One piece that is largely missing in the programmer's tool box is a software tool that will read documented source code and turn it into a document that can be used to explain the algorithms or software structure. Some programs, like Knuth's CWEB, create beautifully typeset documents, but clutter the source code with typesetting commands to the extent that it harms readability. Sun's javadoc reads Java source and does a nice job of creating API documentation. It is less useful for documenting programs. But I have not found a better tool which is free (i.e., available without paying a license fee).

Javadoc generated documentation for javad is available here.

Keith Johnston's ANTLR based javasrc program generates an HTML version of a Java source tree that includes cross reference HTML links. This is published here.

Other Java Documentation Software

I did not find Sun's javadoc very easy to use. It's great for documenting class libraries, but it is hard to get it to develop a documentation hierarchy for a program like javad. I looked at a number of documentation tools, both in the public domain and for a license fee. Right now I can't bring myself to pay a license fee for a documentation tool. Here are some notes on other documentation tools.

Future Directions

The javad program reads the Java class file code attribute associated with class methods. It builds an object that contains the Java byte codes, but currently does not do anything with the byte code.
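
To make the layout of the code attribute concrete, here is a minimal sketch of reading the start of a Code attribute body as the JVM Specification describes it: max_stack, max_locals, code_length and then the byte codes themselves. This is an illustration only, not the javad implementation. The class name CodeAttributeSketch is hypothetical, and the sketch assumes the stream is already positioned just past the attribute_name_index and attribute_length fields.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical holder for the leading fields of a Code attribute.
// The exception table and nested attributes that follow are not read.
final class CodeAttributeSketch {
    final int maxStack;
    final int maxLocals;
    final byte[] code;

    private CodeAttributeSketch(int maxStack, int maxLocals, byte[] code) {
        this.maxStack = maxStack;
        this.maxLocals = maxLocals;
        this.code = code;
    }

    // Assumes "in" is positioned at the max_stack field.
    static CodeAttributeSketch read(DataInputStream in) throws IOException {
        int maxStack = in.readUnsignedShort();   // u2 max_stack
        int maxLocals = in.readUnsignedShort();  // u2 max_locals
        int codeLength = in.readInt();           // u4 code_length
        // The specification requires 0 < code_length < 65536.
        if (codeLength <= 0 || codeLength >= 65536) {
            throw new IOException("invalid code_length: " + codeLength);
        }
        byte[] code = new byte[codeLength];
        in.readFully(code);                      // u1 code[code_length]
        return new CodeAttributeSketch(maxStack, maxLocals, code);
    }
}
```

An object like this holds the raw byte codes without interpreting them, which is the stage described above.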

A later release of the javad program will read the Java byte codes in the method code attributes and print the equivalent abstract syntax trees. This will serve as a prototype for a compiler input phase.

Java byte code can be converted into abstract syntax trees (ASTs) by creating software that acts like the Java virtual machine. Instead of executing the Java code, it generates trees. For example, consider the following pseudo-byte codes:

push A
push B
iAdd
store C

If these pseudo-byte codes were executed on a stack machine like the JVM, they would push A onto the stack, push B onto the stack and then add the top of stack (tos) and tos-1 values. The result would be left on the top of stack. The top of stack would then be stored into C by the store C instruction.

To generate ASTs from the byte code stream, a stack would be used that would hold AST nodes (e.g., AST "leaves" like identifiers and constants and AST operators like "+"). The execution of the above byte codes would push an AST node for symbol A and an AST node for symbol B onto the stack. The execution of the iAdd would "reduce" the stack and create the tree

         +
        / \
       A   B

The tree rooted in "+" would be put back on the stack. The "store C" operation would result in the creation of the tree

       =
      / \
     C   +
        / \
       A   B

Since this is a statement in Java it would be added to the statement list.
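
The scheme above can be sketched in a few lines of Java. This is only an illustration of the idea, not code from javad: the class StackToAstSketch, its Node class and the textual form of the pseudo-byte codes are all assumptions made for this example, and only the three pseudo-instructions used above (push, iAdd, store) are handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class StackToAstSketch {
    // An AST node: leaves have no children, operators have two.
    static final class Node {
        final String label;
        final Node left;
        final Node right;

        Node(String label) { this(label, null, null); }

        Node(String label, Node left, Node right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return left == null ? label
                    : "(" + left + " " + label + " " + right + ")";
        }
    }

    // "Executes" the pseudo-byte codes, building trees instead of values.
    static List<Node> buildStatements(String[] code) {
        Deque<Node> stack = new ArrayDeque<>();
        List<Node> statements = new ArrayList<>();
        for (String insn : code) {
            String[] parts = insn.trim().split("\\s+");
            switch (parts[0]) {
                case "push":   // push a leaf for the named symbol
                    stack.push(new Node(parts[1]));
                    break;
                case "iAdd": { // reduce tos-1 and tos into a "+" tree
                    Node right = stack.pop();
                    Node left = stack.pop();
                    stack.push(new Node("+", left, right));
                    break;
                }
                case "store":  // the assignment is a complete statement
                    statements.add(new Node("=", new Node(parts[1]), stack.pop()));
                    break;
                default:
                    throw new IllegalArgumentException("unknown pseudo-byte code: " + insn);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        String[] code = { "push A", "push B", "iAdd", "store C" };
        for (Node statement : buildStatements(code)) {
            System.out.println(statement); // prints (C = (A + B))
        }
    }
}
```

Note that the operand popped first becomes the right child, since the right operand is the one on the top of stack.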

Blocks of statements without branches are referred to as basic blocks by compiler designers. By following the scheme above, basic blocks can be generated. Java branch operations result in the construction of a control flow graph of basic blocks.
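
One common way to find basic block boundaries is to mark "leaders": the first instruction, every branch target, and every instruction that follows a branch. The sketch below applies that rule to pseudo-byte codes. It rests on several assumptions made only for this example: the class name BasicBlockSketch, the pseudo-branches "goto N" and "ifeq N", and branch targets given as instruction indices (real JVM branches use byte offsets).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class BasicBlockSketch {
    // Returns the target index of a pseudo-branch, or -1 for non-branches.
    static int branchTarget(String insn) {
        String[] parts = insn.trim().split("\\s+");
        if (parts[0].equals("goto") || parts[0].equals("ifeq")) {
            return Integer.parseInt(parts[1]);
        }
        return -1;
    }

    // Splits the code into basic blocks, each given as {first, last} indices.
    static List<int[]> split(String[] code) {
        TreeSet<Integer> leaders = new TreeSet<>();
        if (code.length > 0) {
            leaders.add(0);                      // first instruction
        }
        for (int i = 0; i < code.length; i++) {
            int target = branchTarget(code[i]);
            if (target < 0) {
                continue;
            }
            if (target >= code.length) {
                throw new IllegalArgumentException("branch target out of range: " + code[i]);
            }
            leaders.add(target);                 // branch target
            if (i + 1 < code.length) {
                leaders.add(i + 1);              // instruction after a branch
            }
        }
        List<int[]> blocks = new ArrayList<>();
        int start = -1;
        for (int leader : leaders) {             // TreeSet iterates in ascending order
            if (start >= 0) {
                blocks.add(new int[] { start, leader - 1 });
            }
            start = leader;
        }
        if (start >= 0) {
            blocks.add(new int[] { start, code.length - 1 });
        }
        return blocks;
    }

    public static void main(String[] args) {
        String[] code = { "push A", "ifeq 4", "push B", "store C", "push D", "store E" };
        for (int[] block : split(code)) {
            System.out.println("block " + block[0] + ".." + block[1]);
        }
    }
}
```

The edges of the control flow graph would then connect each block to the block containing its branch target and, for conditional branches, to the block that follows it.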

Once the Java byte code stream has been "decompiled" into a control flow graph of basic blocks, it is possible to further decompile the Java byte codes back into Java source.

If the Java byte code stream is "decompiled" into a control flow graph that is the same as the control flow graph generated by a Java front end, then a Java compiler actually has two input options: Java source code or Java class files. After a class file is read into a control flow graph it can be optimized and native code can be generated. This allows a Java compiler to process both Java source and class files for which source is not available.

A Java to native compiler must be able to read Java class files to get symbol information on classes and interfaces that are imported into the file being compiled. It must also be able to generate a flow graph from the byte code stream so that it can compile class files produced by Java byte code compilers. Since parsing and semantic analysis are also a big task, many compilers simply use class files as their input. The Sun Java compiler is available at no cost, so this is not seen as a burden on the user. Java compilers are discussed at greater length on my Web page on compiling Java.

Related links

Bill Venners, author of Inside the Java 2 Virtual Machine, has a great web site with lots of information on the JVM, Java and Jini (the coolest Java technology yet released). The site is named after Venners' consulting company, Artima Software. You can click on the icon below to go to the Artima site.

Get Java and Jini resources at artima.com

Ian Kaplan, January 24, 2000
Revised: April 26, 2004

