Java .class File Format
陳正佳

Java Virtual Machine
• The cornerstone of Sun's Java programming language.
• A component of the Java technology responsible for
  A. Java's cross-platform delivery,
  B. the small size of its compiled code,
  C. Java's ability to protect users from malicious programs.
• The JVM knows nothing of the Java programming language, only of a particular file format, the class file format.

Class file format
A class file contains
» 1. Java Virtual Machine instructions (bytecodes)
» 2. a symbol table (constant pool)
» 3. other ancillary information.
javap options
» 1. -c: bytecode
» 2. -s: internal type signatures
» 3. -verbose: stack size and number of local variables and args.

Class file format (continued)
Each class file contains one Java type, either a class or an interface.
A class file consists of a stream of 8-bit bytes.
All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively.

Class file format (continued)
Multibyte data items are always stored in big-endian order, where the high bytes come first.
u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively.
These types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.
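The three readers named above can be tried on a small in-memory byte array. The following is a minimal sketch, not part of the original slides: the class name UnsignedReads, the helper decode, and the sample bytes are made up for illustration. Note that readInt returns a signed int, so the sketch masks the result to recover the unsigned u4 value.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: reads one u1, one u2, and one u4 in that order.
final class UnsignedReads {

    // Returns {u1, u2, u4} decoded from the first 7 bytes of the input.
    static long[] decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long u1 = in.readUnsignedByte();        // one byte, 0..255
            long u2 = in.readUnsignedShort();       // two bytes, high byte first
            long u4 = in.readInt() & 0xFFFFFFFFL;   // four bytes; mask removes the sign
            return new long[] {u1, u2, u4};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {
            (byte) 0xFF,                                        // u1 = 255
            (byte) 0x01, (byte) 0x02,                           // u2 = 0x0102 = 258
            (byte) 0x80, (byte) 0x00, (byte) 0x00, (byte) 0x01  // u4 = 0x80000001
        };
        long[] r = decode(sample);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "255 258 2147483649"
    }
}
```

Without the mask, the u4 above would print as a negative number, because Java has no unsigned 32-bit int type.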
class SumI {
    public static void main(String[] args) {
        int count = 10;
        int sum = 0;
        for (int index = 1; index < count; index++)
            sum = sum + index;
        System.out.println("Sum=" + sum);
    } // method main
}

Class File Structure
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

Magic
magic: u4
The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.

Version
minor_version: u2, major_version: u2
The values of the minor_version and major_version items are the minor and major version numbers of the class file format, as written by the compiler that produced this class file.

Constant Pool
constant_pool_count: u2
» must be greater than zero.
» indicates the number of entries in the constant_pool table of the class file,
» where the constant_pool entry at index zero is included in the count but is not present in the constant_pool table of the class file.
» i.e., if count = 13 => pool[1] … pool[12]

Constant Pool (continued)
constant_pool[]
» a table of variable-length structures
» representing various
  – numeric literals,
  – string constants,
  – class/interface/type names,
  – field references,
  – method references, and
  – other constants
  that are referred to within the ClassFile structure and its substructures.

Constant Pool (continued)
constant_pool[0]
» reserved for internal use by a JVM implementation. That entry is not present in the class file.
» The first entry in the class file is constant_pool[1].
» Each constant_pool[i] is a variable-length structure whose format is indicated by its first "tag" byte.

Access Flags
access_flags: u2
» a mask of modifiers used with class and interface declarations.
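A mask of this kind is decoded by testing one bit per modifier. The sketch below is only an illustration: the class name AccessFlags and the method decode are hypothetical, while the flag names and values are the ones defined for class-level access_flags (ACC_PUBLIC = 0x0001, ACC_FINAL = 0x0010, and so on).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns a class-level access_flags mask into flag names.
final class AccessFlags {
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };
    private static final int[] VALUES = {
        0x0001, 0x0010, 0x0020, 0x0200,
        0x0400, 0x1000, 0x2000, 0x4000
    };

    // Returns the names of all flags whose bit is set in the mask.
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((accessFlags & VALUES[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // 0x0021 = ACC_PUBLIC | ACC_SUPER
        System.out.println(decode(0x0021)); // prints "[ACC_PUBLIC, ACC_SUPER]"
        // 0x0600 = ACC_INTERFACE | ACC_ABSTRACT
        System.out.println(decode(0x0600)); // prints "[ACC_INTERFACE, ACC_ABSTRACT]"
    }
}
```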
Access Flags (continued)
The access_flags modifiers

Flag Name       Value   Meaning                                             Used By
ACC_PUBLIC      0x0001  Is public                                           Class, interface
ACC_FINAL       0x0010  Is final                                            Class
ACC_SUPER       0x0020  Treat superclass methods specially in               Class, interface
                        invokespecial; = 1 for new JVMs
ACC_INTERFACE   0x0200  Is an interface                                     Interface
ACC_ABSTRACT    0x0400  Is abstract                                         Class, interface

Access Flags (continued)
The access_flags modifiers

Flag Name       Value   Meaning
ACC_SYNTHETIC   0x1000  Synthetic; not present in the source code
ACC_ANNOTATION  0x2000  Annotation type
ACC_ENUM        0x4000  Enum type

this_class
this_class: u2
» a valid index into the constant_pool table.
» The entry at that index must be a CONSTANT_Class_info structure representing the class or interface defined by this class file.

super_class
super_class
» For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table.
» If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info structure representing the superclass of the class defined by this class file.

super_class (continued)
super_class
» Neither the superclass nor any of its superclasses may be a final class.
» If the value of super_class is zero, then this class file must represent the class java.lang.Object,
» the only class or interface without a superclass.

interfaces
interfaces_count
» the number of direct superinterfaces of this class or interface type.
interfaces[]
» Each value in the interfaces array must be a valid index into the constant_pool table.
» The constant_pool entry at each value of interfaces[i], where 0 <= i < interfaces_count, must be a CONSTANT_Class_info structure.

Fields
fields_count
» gives the number of field_info structures in the fields table.
fields[]
» Each entry is a variable-length field_info structure giving a complete description of a field in the class or interface type.
» includes only those fields that are declared by this class or interface.
» does not include fields inherited from superclasses or superinterfaces.

Methods
methods_count
» gives the number of method_info structures in the methods table.
methods[]
» Each entry is a variable-length method_info structure giving a complete description of, and the Java Virtual Machine code for, a method in the class or interface.
» The method_info structures represent all methods declared by this class or interface type: instance methods, class (static) methods, and constructor methods.

Methods (continued)
methods[]
» includes only those methods explicitly declared by this class.
» Besides the methods it declares, an interface has only the single method <clinit>, the interface initialization method.
» Constructor methods have the common name <init>.
» does not include items representing methods that are inherited from superclasses or superinterfaces.

Class Attributes
attributes_count
» the number of attributes in the attributes table of this class.
attributes[]
» Each value of the attributes table must be a variable-length attribute structure.
» A ClassFile structure can have any number of attributes associated with it.

Internal form of fully qualified names
Replace all dots with /.
Ex: a.b.C ==> a/b/C

Descriptor
A descriptor is a string representing the type of a field or method.
Descriptors are represented in the class file format using UTF-8 strings.
Grammar:
    FieldType ::= BaseType | ObjectType | ArrayType

Field Descriptors
    BaseType ::= B | C | D | F | I | J | S | Z
» B for byte; Z for boolean
» C, D, F, I, J, S for char, double, float, int, long, and short, respectively.
    ObjectType ::= Lclassname;
    ArrayType ::= [ComponentType
Ex:
» int[][] ==> [[I
» Object[] ==> [Ljava/lang/Object;

Field and method descriptors
    FieldDescriptor ::= FieldType
    ComponentType ::= FieldType
    MethodDescriptor ::= ( ParameterDescriptor* ) ReturnDescriptor
    ParameterDescriptor ::= FieldDescriptor
    ReturnDescriptor ::= FieldDescriptor | V
» V for a void method
Ex:
1. Object mymethod(int i, double d, Thread t) ==> (IDLjava/lang/Thread;)Ljava/lang/Object;
2. void com.Clazz.m(int i) ==> (I)V

The Constant Pool
JVM instructions do not rely on the runtime layout of classes, interfaces, class instances, or arrays.
Instead, instructions refer to symbolic information in the constant_pool table.
All constant_pool table entries have the following general format:
    cp_info {
        u1 tag;
        u1 info[];
    }

Constant pool Tags

Constant Type                 Value
CONSTANT_Utf8                 1
CONSTANT_Integer              3
CONSTANT_Float                4
CONSTANT_Long                 5
CONSTANT_Double               6
CONSTANT_Class                7
CONSTANT_String               8
CONSTANT_Fieldref             9
CONSTANT_Methodref            10
CONSTANT_InterfaceMethodref   11
CONSTANT_NameAndType          12

The CONSTANT_Utf8_info Structure
Used to represent constant string values.
UTF-8 encoding:
1. \u0001 to \u007f (xxx xxxx) ==> 1 byte: 0xxx xxxx
2. \u0000 and \u0080 to \u07ff (xxx xxxx xxxx) ==> 2 bytes: 110x xxxx 10xx xxxx
   » \u0000 ==> 1100 0000 1000 0000
3. \u0800 to \uffff (xxxx xxxx xxxx xxxx) ==> 3 bytes: 1110 xxxx 10xx xxxx 10xx xxxx

Format
    CONSTANT_Utf8_info {
        u1 tag;            // = 1
        u2 length;
        u1 bytes[length];
    }
Notes:
» tag = 1.
» length: the number of bytes in the encoded UTF-8 string.
» bytes[]: contains the bytes of the string. No byte may have the value (byte)0 or lie in the range (byte)0xf0-(byte)0xff.
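The encoding rules above, including the two-byte form of \u0000, can be observed with java.io.DataOutputStream.writeUTF, which writes a two-byte length followed by bytes in this same modified UTF-8 form (the layout of CONSTANT_Utf8_info without its tag byte). This is a small sketch for illustration; the class name ModifiedUtf8Demo and the helpers encode and hex are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo: shows the u2 length and modified UTF-8 bytes of a string.
final class ModifiedUtf8Demo {

    // Returns the u2 length followed by the encoded bytes, as writeUTF emits them.
    static byte[] encode(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'A' (1 byte), \u0000 (2 bytes), \u00e9 (2 bytes), \u4e2d (3 bytes)
        String sample = "A\0\u00e9\u4e2d";
        System.out.println(hex(encode(sample)));
        // prints "00 08 41 C0 80 C3 A9 E4 B8 AD"
    }
}
```

The first two bytes (00 08) are the length field, and C0 80 is the two-byte encoding of \u0000, so no byte in the string body is zero.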
The CONSTANT_String_info Structure
    CONSTANT_String_info {
        u1 tag;            // = 8
        u2 string_index;
    }
string_index:
» a valid index into the constant_pool table; the entry there must be a CONSTANT_Utf8_info structure.

CONSTANT_Integer_info and CONSTANT_Float_info
    CONSTANT_Integer_info {
        u1 tag;            // = 3
        u4 bytes;
    }
    CONSTANT_Float_info {
        u1 tag;            // = 4
        u4 bytes;
    }

CONSTANT_Long_info and CONSTANT_Double_info
    CONSTANT_Long_info {
        u1 tag;            // = 5
        u8 bytes;
    }
    CONSTANT_Double_info {
        u1 tag;            // = 6
        u8 bytes;
    }

CONSTANT_Class_info
Used to represent a class or an interface:
    CONSTANT_Class_info {
        u1 tag;            // = 7
        u2 name_index;
    }
name_index is an index into a UTF-8 entry encoding a fully qualified class name.
Array types are also included: [[I, [Lcom/Cl;, etc.

CONSTANT_NameAndType_info
Used to represent a field or method, without indicating which class or interface type it belongs to:
    CONSTANT_NameAndType_info {
        u1 tag;               // = 12
        u2 name_index;        // index into utf8 string of a simple name
        u2 descriptor_index;  // index into utf8 string of a field/method descriptor
    }

CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info
    CONSTANT_Fieldref_info {
        u1 tag;                  // = 9
        u2 class_index;          // index into a class_info entry
        u2 name_and_type_index;
    }
    CONSTANT_Methodref_info {
        u1 tag;                  // = 10
        u2 class_index;
        u2 name_and_type_index;
    }
    CONSTANT_InterfaceMethodref_info {
        u1 tag;                  // = 11
        u2 class_index;
        u2 name_and_type_index;
    }

Fields
Each field is described by a field_info structure.
No two fields in one class file may have the same name and descriptor.
Format:
    field_info {
        u2 access_flags;
        u2 name_index;        // index to utf8 entry for name
        u2 descriptor_index;  // index to utf8 entry for field type
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

Methods
Each method, including each instance initialization method (<init>) and the class or interface initialization method (<clinit>), is described by a method_info structure.
Format:
    method_info {
        u2 access_flags;
        u2 name_index;        // index into utf8 entry for name
        u2 descriptor_index;  // index into utf8 entry
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

Attributes
Attributes are used in the ClassFile, field_info, method_info, and Code_attribute structures of the class file format.
General Format:
    attribute_info {
        u2 attribute_name_index;  // into utf8
        u4 attribute_length;      // excluding the initial 6 bytes
        u1 info[attribute_length];
    }

Types of attributes
class attributes:
» Synthetic, Deprecated, SourceFile
» InnerClasses
field attributes:
» ConstantValue, Deprecated
method attributes:
» Code, Deprecated,
» Exceptions
Code attributes:
» LineNumberTable, LocalVariableTable

ConstantValue Attribute
    ConstantValue_attribute {
        u2 attribute_name_index;  // index to utf8: "ConstantValue"
        u4 attribute_length;      // = 2
        u2 constantvalue_index;   // into a primitive entry (Integer_entry, …)
                                  // or a string_entry
    }

Code Attribute
A variable-length attribute used in the attributes table of method_info structures.
A Code attribute contains
» the Java Virtual Machine instructions and
» auxiliary information
for a single method, instance initialization method, or class or interface initialization method.
Every JVM implementation must recognize Code attributes.
» A native or abstract method has no Code attribute in its method_info structure.
» Otherwise, the method_info structure has exactly one Code attribute.

    Code_attribute {
        u2 attribute_name_index;  // into utf8: "Code"
        u4 attribute_length;      // excludes the initial 6 bytes
        u2 max_stack;             // max # of stack slots needed
        u2 max_locals;            // max # of local variables needed
                                  // Note: double and long count as 2.
        u4 code_length;
        u1 code[code_length];
        u2 exception_table_length;
        {   u2 start_pc;
            u2 end_pc;      // exclusive
            u2 handler_pc;
            u2 catch_type;  // index to a cp entry of type CONSTANT_Class_info;
                            // zero means the handler is called for all exceptions
                            // (used to implement the finally clause)
        } exception_table[exception_table_length];
        u2 attributes_count;
        attribute_info attributes[attributes_count];
    }

The Exceptions Attribute (uncaught checked exceptions)
» appears in the attributes table of a method_info structure.
» indicates which checked exceptions a method may throw.
» At most one Exceptions attribute may appear in each method_info structure.
Format:
    Exceptions_attribute {
        u2 attribute_name_index;  // into utf8: "Exceptions"
        u4 attribute_length;
        u2 number_of_exceptions;
        u2 exception_index_table[number_of_exceptions];
                                  // each entry is an index to a cp class_info entry
    }

The InnerClasses Attribute
A variable-length attribute in the attributes table of the ClassFile structure.
Records every inner class/interface referred to (or declared) in this class/interface.
» If the constant pool of a class or interface refers to any class or interface that is not a member of a package, its ClassFile structure must have exactly one InnerClasses attribute in its attributes table.
» If a class has members that are classes or interfaces, its constant_pool table (and hence its InnerClasses attribute) must refer to each such member, even if that member is not otherwise mentioned by the class.

Format:
    InnerClasses_attribute {
        u2 attribute_name_index;  // to utf8: "InnerClasses"
        u4 attribute_length;
        u2 number_of_classes;
        {   u2 inner_class_info_index;    // into class_info,
                                          // for each inner class C in the cp
            u2 outer_class_info_index;    // into class_info,
                                          // the containing class of C
            u2 inner_name_index;          // into utf8 for the simple name
                                          // (no / and no $); anonymous => zero
            u2 inner_class_access_flags;
        } classes[number_of_classes];
    }

The Synthetic and Deprecated Attributes
A class member that does not appear in the source code must be marked using a Synthetic attribute.
    Synthetic_attribute {
        u2 attribute_name_index;  // utf8("Synthetic")
        u4 attribute_length;      // = 0
    }
The Deprecated attribute has the following format:
    Deprecated_attribute {
        u2 attribute_name_index;  // utf8("Deprecated")
        u4 attribute_length;      // = 0
    }

The SourceFile Attribute
    SourceFile_attribute {
        u2 attribute_name_index;  // into utf8("SourceFile")
        u4 attribute_length;      // = 2
        u2 sourcefile_index;      // into utf8 entry
    }

LineNumberTable Attribute
    LineNumberTable_attribute {
        u2 attribute_name_index;  // utf8("LineNumberTable")
        u4 attribute_length;
        u2 line_number_table_length;
        {   u2 start_pc;
            u2 line_number;
        } line_number_table[line_number_table_length];
    }

The LocalVariableTable Attribute
    LocalVariableTable_attribute {
        u2 attribute_name_index;  // utf8("LocalVariableTable")
        u4 attribute_length;
        u2 local_variable_table_length;
        {   u2 start_pc;
            u2 length;            // the variable must have a value
                                  // in [start_pc, start_pc + length]
            u2 name_index;        // utf8 for var name
            u2 descriptor_index;  // utf8 for type
            u2 index;             // local var index
        } local_variable_table[local_variable_table_length];
    }
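Every attribute format in this section starts with the same six bytes, a u2 attribute_name_index followed by a u4 attribute_length, so a reader can step over attributes it does not understand. The following is a minimal sketch under stated assumptions: the class AttributeWalker, the method walk, and the sample bytes (constant pool indices 7, 8, and 9) are invented for illustration, and attribute_length is assumed to fit in an int.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: walks an attributes table using only the 6-byte headers.
final class AttributeWalker {

    // Reads attributesCount attributes and reports each header.
    static List<String> walk(byte[] data, int attributesCount) {
        List<String> headers = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < attributesCount; i++) {
                int nameIndex = in.readUnsignedShort();   // u2 attribute_name_index
                long length = in.readInt() & 0xFFFFFFFFL; // u4 attribute_length
                byte[] info = new byte[(int) length];     // assumption: fits in an int
                in.readFully(info);                       // consume info[attribute_length]
                headers.add("name_index=" + nameIndex + " length=" + length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return headers;
    }

    public static void main(String[] args) {
        byte[] table = {
            0, 7,  0, 0, 0, 2,  0, 8,  // shaped like a SourceFile attribute: length 2
            0, 9,  0, 0, 0, 0          // shaped like a Deprecated attribute: length 0
        };
        System.out.println(walk(table, 2));
        // prints "[name_index=7 length=2, name_index=9 length=0]"
    }
}
```

A real parser would look up attribute_name_index in the constant pool to decide how to interpret info; the sketch only shows that the length field alone is enough to find the next attribute.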