All NamedObjects contain 4 basic fields:
| name | A NameNode that gives the qualified name of this entity. |
| container | Refers to the principal container that the entity belongs to, the container named by NAME's qualifier. |
| cunit | A system variable used for access to global variables such as the root package, and the primitive types. |
| source | The object which produced this name, which may be either from the source or bytecode representations. |
R.lookupClassfile (java.lang.String)
will return the object representing the bytecode found in String.class. The fourth is a (perhaps misleading) abbreviation that attempts to find the public reference type declaration for a named class. Rather than return the ClassFile representing java.lang.String, for example, lookupRefdecl returns an object of type ReferenceDecl (either ClassDecl or InterfaceDecl, or ErrorDecl in the case of an error). In other words, instead of returning the class file, it returns the class itself. There is little distinction in the case of bytecode files, which contain only one class, but the difference is pronounced in source files, where multiple classes may be present. lookupRefdecl is equivalent to:
ClassFile cf = R.lookupClassfile (java.lang.String);
ReferenceDecl rd = cf.lookupMain ();
Note that it chooses to search the bytecode, not source. This is arbitrary (and distasteful). The lookupMain routine, in this case, searches for a reference decl named "String" in the container java.lang.String.
Packages contain other packages, classfiles, and compunits. The next level of hierarchy begins inside individual classfiles and compunits. These contain one or more reference decls corresponding to the reference types (class or interface types) declared within. In addition, compunits also contain unqualified type names that were imported into the scope. For example, the file Foo.java containing:
package P;
import java.util.Hashtable;
class Foo { }
interface Bar { }
produces a CompUnit that contains the bindings:
| NamedObjectLookup lookupConstructor() | Returns all constructors. |
| NamedObjectLookup lookupField(String name) | Returns all fields matching name. |
| NamedObjectLookup lookupMember(String name) | Returns all members matching name. |
| ConstructorDecl lookupConsdecl(String name, String sig) | Search for a constructor decl: a null sig means search for a unique constructor decl. |
| MethodDecl lookupMethdecl(String name, String sig) | Search for a method decl: a null sig means search for a unique method decl. |
A lookup operation may search all the enclosing scopes of a container, or it may search only the first scope: just the container itself and not its parent containers. When deciding which enclosing scope to use next, the following rules are applied:
A Validator is a predicate that restricts the result of a lookup to
a subset of the class hierarchy. For example, the validator AstObject.vldtr()
is unrestrictive, while Package.vldtr() restricts the result to Package
objects.
| NamedObjectLookup lookup(String name) | Lookup named object in enclosing scopes. |
| NamedObjectLookup lookupValid(String, Validator) | Lookup named, valid object in enclosing scopes. |
| NamedObjectLookup lookupFirst(String, Validator) | Lookup named, valid object in first scope. |
| NamedObjectLookup lookupAll(Validator) | Lookup all valid objects in first scope. |
| NamedObjectLookup qlookup(NameNode) | Qualified lookup in a named container object. |
| NamedObjectLookup qlookupValid(NameNode, Validator) | Qualified, validated lookup in a named container object. |
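The difference between searching only the first scope and searching all enclosing scopes can be sketched with a small stand-in. Everything below is hypothetical: ScopeSketch is not one of the library's classes, java.util.function.Predicate plays the role of a Validator, and the sketch assumes that a name which fails validation in an inner scope lets the search continue outward, which the real implementation may or may not do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a container scope; not the library's real API.
final class ScopeSketch {
    private final ScopeSketch parent;   // enclosing scope, or null at the root
    private final Map<String, Object> bindings = new LinkedHashMap<>();

    ScopeSketch(ScopeSketch parent) { this.parent = parent; }

    void bind(String name, Object value) { bindings.put(name, value); }

    // Modeled on lookupFirst: consult only this scope, never its parents.
    Object lookupFirst(String name, Predicate<Object> validator) {
        Object found = bindings.get(name);
        return (found != null && validator.test(found)) ? found : null;
    }

    // Modeled on lookupValid: walk outward until a valid match appears.
    // Assumption: an invalid match in an inner scope does not stop the search.
    Object lookupValid(String name, Predicate<Object> validator) {
        for (ScopeSketch s = this; s != null; s = s.parent) {
            Object found = s.lookupFirst(name, validator);
            if (found != null) return found;
        }
        return null;
    }

    // Modeled on lookupAll: every valid object bound directly in this scope.
    List<Object> lookupAll(Predicate<Object> validator) {
        List<Object> out = new ArrayList<>();
        for (Object o : bindings.values()) {
            if (validator.test(o)) out.add(o);
        }
        return out;
    }
}
```

An unrestrictive validator such as `o -> true` corresponds to AstObject.vldtr(), while `o -> o instanceof Integer` restricts results to one kind of object in the way Package.vldtr() does.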
There are four more specialized lookup routines that are extensions
of the above. Already mentioned, lookupMain() searches for the public class
or interface declaration in a class or source file. lookupSuper() returns
a class or interface declaration corresponding to the superclass or super-interface.
The two lookup routines dealing with types produce type objects instead
of objects in the class hierarchy (class and source files)--this will become
clear when types are discussed next. lookupAllTypes() returns a type object
for each distinct entry in a package, and lookupType() returns a specific
type by name.
| ReferenceDecl lookupMain() | Returns the main class or interface. |
| ReferenceDecl lookupSuper() | Search for the first enclosing superclass. |
| NamedObjectLookup lookupAllTypes() | Returns all Types. |
| Type lookupType(NameNode) | Returns a Type. |
Reference types include arrays, classes, interfaces, and the special null type. Class and interface types are represented using lazy bindings: holding one of these objects does not actually load the corresponding class or source file. For this reason, equality on types must be tested using the equals() method, not the equality operator. Since the name of a type alone does not indicate whether it is a class or an interface, both are initially represented as the superclass ClassOrInterfaceType. ClassOrInterfaceType has three subclasses: ErrorType, LazyType, and ResolvedType. These represent, respectively, a type that is unknown due to an earlier error in processing, a type whose declaration has not yet been loaded, and a type that has already been resolved to some declaration. A ClassOrInterfaceType object can be queried for its corresponding declaration using the asDecl() method, which forces the underlying object to be loaded and returns a ReferenceDecl: a ClassDecl, an InterfaceDecl, or, in case of an error, an ErrorDecl. Now it is clear why the lookupAllTypes() and lookupType() methods are special: they translate class and source files into LazyTypes before returning their result.
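To illustrate why lazily bound types must be compared with equals(), here is a minimal sketch. LazyTypeSketch and its loader function are hypothetical and only imitate the behavior described above: two distinct objects may name the same type, and nothing is loaded until asDecl() is called.

```java
import java.util.function.Function;

// Hypothetical model of a lazily bound class-or-interface type.
final class LazyTypeSketch {
    private final String qualifiedName;
    private final Function<String, Object> loader; // assumed: loads a declaration by name
    private Object decl;                           // stays null until asDecl() forces loading

    LazyTypeSketch(String qualifiedName, Function<String, Object> loader) {
        this.qualifiedName = qualifiedName;
        this.loader = loader;
    }

    // Forces the declaration to be loaded, at most once, and returns it.
    Object asDecl() {
        if (decl == null) decl = loader.apply(qualifiedName);
        return decl;
    }

    boolean isResolved() { return decl != null; }

    // Two type objects are equal when they name the same type,
    // even though they are different objects.
    @Override public boolean equals(Object o) {
        return o instanceof LazyTypeSketch
            && ((LazyTypeSketch) o).qualifiedName.equals(qualifiedName);
    }

    @Override public int hashCode() { return qualifiedName.hashCode(); }
}
```

In this sketch, comparing two LazyTypeSketch objects for java.lang.String with `==` yields false while equals() yields true, and neither comparison triggers the loader.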
The method type is used for organizing the parameter and return types of a method. It does not represent an actual Java language type; however, it can be treated as such for most purposes, including signature generation.
The primitive types, as well as the null and void types, are accessible through the ever-present cunit() object. For example, cunit().system().booleanType() refers to the boolean type.
There are various routines for learning information about types, including:
| asCpReference() | Class file specific: returns a constant pool entry referring to this object. |
| asReferenceType() | Return a reference type. For primitive types it returns the corresponding type from java.lang, such as java.lang.Integer. |
| equals(AstObject) | True if two type objects are equal. You cannot use the equality operator because type objects are not unique. |
| isAssignableFrom(Type) | True IFF this type may be assigned from the argument. |
| isCallableWith(Type) | True IFF the argument type may be converted to this type by method invocation conversion. |
| isCastableFrom(Type) | True IFF the argument may be cast to this type. |
| isIntegral() | True IFF this type is integral (byte, char, short, int, long). |
| isNumeric() | True IFF this type is numeric (integral, float, or double). |
| primType() | Return the primitive's type, an enumeration (PrimType) found in the C header hc.h. |
| promoteTo(Type) | Promotes this type to the least upper bound of it and the argument. It does not handle reference types, so it does not work correctly for them. |
| signatureToType(String, ClassHierarchy) | (Static method) Convert a type signature to a Type. The first argument is the signature, and the second is used for looking up types. |
| toString() | Converts to string representation, for printing. |
| typeSignature() | Returns the type signature of this type, the encoding used in bytecode files. |
| width() | 2 if long or double, otherwise 1. This is used in bytecode verification and classfile generation. |
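The relationship between typeSignature(), width(), isIntegral(), and isNumeric() can be sketched directly on signature strings. The single-letter codes (B, C, S, I, J, F, D, Z) and the `L...;` and `[` forms are the standard class file encoding; the class TypeSignatureSketch and its method names are hypothetical and work on plain strings rather than on the library's Type objects.

```java
// Hypothetical helpers operating on type signature strings.
final class TypeSignatureSketch {
    // 2 for long (J) or double (D), otherwise 1, as described for width().
    static int width(String sig) {
        char c = sig.charAt(0);
        return (c == 'J' || c == 'D') ? 2 : 1;
    }

    // byte, char, short, int, long
    static boolean isIntegral(String sig) {
        return sig.length() == 1 && "BCSIJ".indexOf(sig.charAt(0)) >= 0;
    }

    // integral, float, or double
    static boolean isNumeric(String sig) {
        return isIntegral(sig) || sig.equals("F") || sig.equals("D");
    }

    // Total width of the parameters in a method signature such as "(IJ[D)V".
    static int argumentWidth(String methodSig) {
        int slots = 0;
        int i = methodSig.indexOf('(') + 1;
        while (methodSig.charAt(i) != ')') {
            char c = methodSig.charAt(i);
            int w = (c == 'J' || c == 'D') ? 2 : 1;    // an array is a reference: width 1
            while (methodSig.charAt(i) == '[') i++;    // skip array dimensions
            if (methodSig.charAt(i) == 'L') {
                i = methodSig.indexOf(';', i);         // skip the class name
            }
            i++;
            slots += w;
        }
        return slots;
    }
}
```

For instance, the parameters of `(IJLjava/lang/String;[D)V` occupy 1 + 2 + 1 + 1 = 5 units, since the array of doubles is itself a single reference.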
For example, the type ConstantPoolClass has one field, cpClassname, that returns a slashified class name, e.g. "java/lang/Object". The special method cpAsType() returns a ResolvedType object for the class. The ConstantPoolString object represents a Java-language string constant. Numeric constants are represented using ConstantPoolDouble, ConstantPoolFloat, ConstantPoolLong, and ConstantPoolInteger. The ConstantPoolFieldRef, ConstantPoolMethodRef, and ConstantPoolInterfaceMethodRef types each have 3 associated strings: the class name being referenced, the field or method name being referenced, and the type signature of that object.
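A slashified name differs from a source-level qualified name only in its separator, and wrapping it in `L` and `;` gives the type signature of the class. The helper below is a hypothetical sketch of those conversions on plain strings; ClassNameSketch and its methods are not part of the library.

```java
// Hypothetical conversions between dotted and slashified class names.
final class ClassNameSketch {
    // "java.lang.Object" -> "java/lang/Object"
    static String slashify(String dotted) {
        return dotted.replace('.', '/');
    }

    // "java/lang/Object" -> "java.lang.Object"
    static String unslashify(String slashed) {
        return slashed.replace('/', '.');
    }

    // "java/lang/Object" -> "Ljava/lang/Object;"
    static String toTypeSignature(String slashed) {
        return "L" + slashed + ";";
    }
}
```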
Fields and methods are accessible through the ClassFile object's cfFields()
and cfMethods() methods. Alternatively, the source() field of a FieldDecl
or MethodDecl (see the 4 basic NamedObject fields) will return the corresponding
CfFieldInfo or CfMethodInfo object. Both of these objects inherit from
CfMemberInfo, which has the following two string fields:
| cfName | The name of this member, a string |
| cfDescriptor | Misnamed: this is the type signature |
| cfInit | A constant pool entry |
| cfCode | The actual bytecodes for this method, of type Code |
| cfExceptionThrows | The exception types that can be thrown |
Instructions are represented by a doubly linked list. Markers and positions are represented using special, non-existent instruction types. The Code object has methods for retrieving the first and last instruction, as well as for obtaining markers at the beginning, end, or middle of the sequence.
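A doubly linked instruction list with marker nodes can be sketched as follows. InsnListSketch, its Node class, and the method names are hypothetical; the sketch only shows the idea that a marker occupies a position in the list, so code can be inserted relative to it, without producing any bytecode itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical doubly linked instruction list with marker nodes.
final class InsnListSketch {
    static final class Node {
        final String text;      // a mnemonic, or the name of a marker
        final boolean marker;   // markers hold a position but emit nothing
        Node prev, next;
        Node(String text, boolean marker) { this.text = text; this.marker = marker; }
    }

    private Node first, last;

    Node first() { return first; }
    Node last() { return last; }

    // Appends an instruction or a marker at the end of the sequence.
    Node append(String text, boolean marker) {
        Node n = new Node(text, marker);
        if (last == null) {
            first = last = n;
        } else {
            n.prev = last;
            last.next = n;
            last = n;
        }
        return n;
    }

    // Inserts a real instruction immediately after the given position.
    Node insertAfter(Node pos, String text) {
        Node n = new Node(text, false);
        n.prev = pos;
        n.next = pos.next;
        if (pos.next != null) pos.next.prev = n; else last = n;
        pos.next = n;
        return n;
    }

    // Lists the real instructions in order, skipping markers.
    List<String> emit() {
        List<String> out = new ArrayList<>();
        for (Node n = first; n != null; n = n.next) {
            if (!n.marker) out.add(n.text);
        }
        return out;
    }
}
```

Because links run in both directions, inserting at a marker in the middle of the sequence needs no traversal from the start.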
There are 13 different instruction formats; each is represented by its
own type.
| BipushInstruction | Only the bipush (push byte) instruction uses this format |
| SipushInstruction | Only the sipush (push short) instruction uses this format |
| IincInstruction | Only the iinc (increment) instruction uses this format |
| InvokeIInstruction | Only the invokeinterface instruction uses this format |
| MultiNewInstruction | Only the multianewarray instruction uses this format |
| NewArrayInstruction | Only the newarray instruction uses this format |
| JumpInstruction | All the branch instructions use this format |
| LoadCPInstruction | Instructions that load from the constant pool |
| LoadLVInstruction | Instructions that load from a local variable slot |
| NoargInstruction | Instructions with no operands |
| SwitchInstruction | A tableswitch or lookupswitch instruction |
| LabelInstruction | Internal use |
| MarkerInstruction | Internal use |
There is an example of how to use this interface to re-write a Java class file contained in the Java classes:
The interface does allow for the abstract syntax tree to be traversed and examined. To do this requires an understanding of the class hierarchy, which is quite large. There is one class per node type in the abstract syntax tree, approximately 130 node types in all. The root of all nodes in the AST is a TreeNode. The easiest of these nodes to understand are the statement and expression nodes--they map directly onto the language constructs that you are already familiar with. The remainder of the nodes represent various structural and syntactic components of the compilation unit, such as import statements, class and interface declarations, formal parameters, field and method declarations, names, and types.
The Javadoc-produced class hierarchy
is especially useful in understanding these nodes. Each TreeNode
has at least two fields:
| position | location of the construct within the source file |
| cunit | the containing compilation unit |
| method | an expression (ExprNode) yielding the method to call |
| args | a list of expressions (ExprNodes) provided as arguments |
Lists are represented by a special tree node type called a TreeListNode.
It has variable arity, and its fields are not named; they are only numbered.
There are generic methods for accessing the children of lists and any other
tree node by their index. These are:
| arity() | the number of children of the node |
| child(N) | the Nth child |
| children() | an array of children of length arity() |
Some tree nodes may be absent. When the grammar allows for an optional
construct, nodes are left absent to indicate that fact. Two methods can
be used to test for absence:
| absent() | true if the node is absent |
| present() | true if the node is not absent |
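The generic accessors are enough to walk any node without knowing its concrete class. The sketch below uses a hypothetical TreeSketch class rather than the real TreeNode; it assumes children are indexed from 0 and that absence is modeled by a single shared placeholder node, neither of which is confirmed by the text above.

```java
// Hypothetical tree node with numbered children and an "absent" placeholder.
final class TreeSketch {
    static final TreeSketch ABSENT = new TreeSketch("<absent>");

    final String label;
    private final TreeSketch[] kids;

    TreeSketch(String label, TreeSketch... kids) {
        this.label = label;
        this.kids = kids;
    }

    int arity() { return kids.length; }
    TreeSketch child(int n) { return kids[n]; }      // assumption: 0-based index
    TreeSketch[] children() { return kids.clone(); }
    boolean absent() { return this == ABSENT; }
    boolean present() { return !absent(); }

    // Renders the tree one node per line, indented by depth.
    static String dump(TreeSketch root) {
        StringBuilder out = new StringBuilder();
        dump(root, 0, out);
        return out.toString();
    }

    private static void dump(TreeSketch node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(node.absent() ? "(absent)" : node.label).append('\n');
        for (int i = 0; i < node.arity(); i++) {
            dump(node.child(i), depth + 1, out);
        }
    }
}
```

An if statement without an else branch, for example, could be modeled as a node of arity 3 whose third child is the absent placeholder.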
An easy way to get a feel for the node structure is to print it. The Java class weld.PrintTree is a simple program that prints the node hierarchy; see the documentation there. More elaborate implementations could surely be written.
It now uses the CLASSPATH environment variable and can read zip files. The only catch is that you still have to put classes.zip into your CLASSPATH; the JDK automatically provides this variable in its startup scripts.
To install the software on Windows, first you must have JDK 1.1 installed. Download the distribution (zip, tar.gz). There are two DLLs. The first is AST.DLL, which contains all of my code, and the second is ZLIB.DLL, which contains the un-compression code for reading zip files.
These two DLLs must be installed in a directory listed in your PATH variable. Suppose I unpack the distribution in the location G:\DIR, creating the file G:\DIR\JAVATIME. Set PATH like so: