Java .class File Format
陳正佳
Java Virtual Machine
• the cornerstone of Sun's Java programming language.
• a component of the Java technology responsible for
A. Java's cross-platform delivery,
B. the small size of its compiled code,
C. Java's ability to protect users from malicious
programs.
• JVM knows nothing of the Java programming language,
only of a particular file format, the class file format.
Class file format

A class file contains
» 1. Java Virtual Machine instructions (bytecodes)
» 2. a symbol table (Constant pool)
» 3. other ancillary information.

javap options
» 1. -c : bytecode
» 2. -s : internal type signatures
» 3. -verbose : stack size and number of local variables and args.
Class file format
• Each class file contains one Java type, either a class or an interface.
• A class file consists of a stream of 8-bit bytes.
• All 16-bit, 32-bit, and 64-bit quantities are constructed by reading in two, four, and eight consecutive 8-bit bytes, respectively.
Class File Format



• Multibyte data items are always stored in big-endian order, where the high bytes come first.
• u1, u2, and u4 represent an unsigned one-, two-, or four-byte quantity, respectively.
• These types may be read by methods such as readUnsignedByte, readUnsignedShort, and readInt of the interface java.io.DataInput.
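As a minimal sketch of the reads named above, the following uses java.io.DataInputStream (an implementation of DataInput) to pull a u4, a u2, and a u1 out of a byte array in big-endian order. The class name BigEndianRead and the sample bytes are illustrative assumptions, not part of the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class BigEndianRead {
    // Reads a u4, then a u2, then a u1 from the given bytes (big-endian).
    static long[] readU4U2U1(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long u4 = in.readInt() & 0xFFFFFFFFL; // readInt is signed; mask to get the unsigned value
            int u2 = in.readUnsignedShort();
            int u1 = in.readUnsignedByte();
            return new long[] { u4, u2, u1 };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input: 4 + 2 + 1 bytes.
        byte[] data = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x34, (byte) 0xFF };
        long[] v = readU4U2U1(data);
        System.out.printf("u4=0x%08X u2=%d u1=%d%n", v[0], v[1], v[2]);
    }
}
```

Note that DataInput has no readUnsignedInt, so the mask on readInt is what recovers the unsigned u4 value.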
class SumI {
    public static void main(String[] args) {
        int count = 10;
        int sum = 0;
        for (int index = 1; index < count; index++)
            sum = sum + index;
        System.out.println("Sum=" + sum);
    } // method main
}
Class File Structure
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
Magic
magic :u4
The magic item supplies the magic
number identifying the class file format;
it has the value 0xCAFEBABE.
Version
minor_version :u2, major_version:u2
The values of the minor_version and
major_version items are the minor
and major version numbers of the
class file format, as targeted by the
compiler that produced this class file.
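A short sketch of reading the first three ClassFile items (magic, minor_version, major_version) from raw bytes. The class name ClassFileHeader and the helper javaRelease are hypothetical; the mapping "release = major_version - 44" is the well-known convention for Java 5 and later (for example 52 for Java 8) and is not stated in the slides.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ClassFileHeader {
    // Returns {minor_version, major_version}; rejects input that lacks the magic number.
    static int[] readVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            return new int[] { minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed convention for Java 5 and later; returns -1 for older major versions.
    static int javaRelease(int major) {
        return major >= 49 ? major - 44 : -1;
    }

    public static void main(String[] args) {
        // Hypothetical header bytes: magic, minor 0, major 52.
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52 };
        int[] v = readVersion(header);
        System.out.println("minor=" + v[0] + " major=" + v[1] + " release=" + javaRelease(v[1]));
    }
}
```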
Constant Pool
constant_pool_count :u2
» must be greater than 0.
» indicates the number of entries in the
constant_pool table of the class file,
» where the constant_pool entry at index zero
is included in the count but is not present in
the constant_pool table of the class file.
» i.e., if count = 13 => pool[1] … pool[12]
Constant Pool
constant_pool[]
» a table of variable-length structures
» representing various
– numeric literals
– string constants,
– class/interface/type names,
– field reference,
– method reference and
– other constants that are referred to within the
ClassFile structure and its substructures.
Constant Pool

constant_pool[0],
» reserved for internal use by a JVM implementation. That
entry is not present in the class file.


The first entry in the class file is constant_pool[1].
Each constant_pool[i] is a variable-length structure
whose format is indicated by its first "tag" byte.
Access Flags
access_flags :u2
» a mask of modifiers used with class and
interface declarations.
Access Flags

The access_flags modifiers
Flag Name        Value     Meaning                                Used By
ACC_PUBLIC       0x0001    Is public.                             Class, interface
ACC_FINAL        0x0010    Is final.                              Class
ACC_SUPER        0x0020    Treat superclass methods specially     Class, interface
                           (set to 1 for newer JVMs).
ACC_INTERFACE    0x0200    Is an interface.                       Interface
ACC_ABSTRACT     0x0400    Is abstract.                           Class, interface
Access Flags (continued)

The access_flags modifiers
Flag Name         Value     Meaning
ACC_SYNTHETIC     0x1000    Synthetic; not present in the source code.
ACC_ANNOTATION    0x2000    Annotation type.
ACC_ENUM          0x4000    Enum type.
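Since access_flags is a bit mask, each flag can be tested with a bitwise AND. The following is a small sketch that decodes a class-level access_flags value into the flag names listed in the tables above; the class name ClassAccessFlags and the sample values are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class ClassAccessFlags {
    // Flag names and values as listed for class and interface declarations.
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };
    private static final int[] VALUES = {
        0x0001, 0x0010, 0x0020, 0x0200,
        0x0400, 0x1000, 0x2000, 0x4000
    };

    // Returns the names of all flags whose bit is set in the mask, in table order.
    static List<String> decode(int accessFlags) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < VALUES.length; i++) {
            if ((accessFlags & VALUES[i]) != 0) {
                result.add(NAMES[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // a public class compiled with ACC_SUPER set
        System.out.println(decode(0x0601)); // a public interface
    }
}
```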
this_class
this_class :u2
» a valid index into the constant_pool table.
» The entry at that index must be a
CONSTANT_Class_info structure
representing the class or interface defined by
this class file.
super_class
super_class
» For a class, the value of the super_class
item either must be zero or must be a valid
index into the constant_pool table.
» If the value of the super_class item is
nonzero, the constant_pool entry at that
index must be a CONSTANT_Class_info
structure representing the superclass of the
class defined by this class file.
super_class
super_class
» Neither the superclass nor any of its
superclasses may be a final class.
» If the value of super_class is zero, then this
class file must represent the class
java.lang.Object,
» the only class or interface without a
superclass.
interfaces
interfaces_count
» the number of direct superinterfaces of this class or
interface type.
interfaces[]
» Each value in the interfaces array must be a valid
index into the constant_pool table.
» The constant_pool entry at each value of
interfaces[i], where i < interfaces_count, must be
a CONSTANT_Class_info structure
Fields
fields_count
» gives the number of field_info structures in the fields
table.
fields[]
» Each entry is a variable-length field_info
structure giving a complete description of a
field in the class or interface type.
» includes only those fields that are declared
by this class or interface.
» does not include fields inherited from
superclasses or superinterfaces.
Methods
methods_count
» gives the number of method_info structures in the
methods table.
methods[]
» Each entry is a variable-length method_info
structure giving a complete description of, and the Java
Virtual Machine code for, a method in the class
or interface.
» The method_info structures represent all
methods declared by this class or interface type:
instance methods, class (static) methods, and
constructor methods.
Methods
methods[]
» includes only those methods explicitly declared
by this class.
» For an interface, the table lists its declared (abstract)
methods; before Java 8, the only interface method that
carries code is <clinit>, the interface initialization method.
» Constructor methods have the common name
<init>.
» does not include items representing methods
that are inherited from superclasses or
superinterfaces.
Class Attributes
attributes_count
» the number of attributes in the attributes
table of this class
attributes[]
» Each value of the attributes table must be a
variable-length attribute structure.
» A ClassFile structure can have any number
of attributes associated with it.
Internal form of fully qualified names
• Replace all dots with /.
• EX: a.b.C ==> a/b/C
Descriptor
• A descriptor is a string representing the type of a field or method.
• Descriptors are represented in the class file format using UTF-8 strings.
• Grammar:
FieldType ::= BaseType | ObjectType | ArrayType
Field Descriptors
• BaseType ::= B | C | D | F | I | J | S | Z
» B for byte; Z for boolean
» C, D, F, I, J, S for char, double, float, int, long,
and short, respectively.
• ObjectType ::= Lclassname;
• ArrayType ::= [ComponentType
• Ex:
» int[][] ==> [[I
» Object[] ==> [Ljava/lang/Object;
Field and method descriptors
• FieldDescriptor ::= FieldType
• ComponentType ::= FieldType
• MethodDescriptor ::= ( ParameterDescriptor* ) ReturnDescriptor
• ParameterDescriptor ::= FieldDescriptor
• ReturnDescriptor ::= FieldDescriptor | V
» V for a void method
• Ex:
1. Object mymethod(int i, double d, Thread t)
==> (IDLjava/lang/Thread;)Ljava/lang/Object;
2. void com.Clazz.m(int i) ==> (I)V.
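The descriptor grammar can be sketched as a small recursive function over java.lang.Class objects. The class name Descriptors and its helper methods are hypothetical; the main method cross-checks one result against java.lang.invoke.MethodType.toMethodDescriptorString(), which is a standard library call.

```java
import java.lang.invoke.MethodType;

class Descriptors {
    // Internal form of a fully qualified name: dots become slashes.
    static String internalName(String qualifiedName) {
        return qualifiedName.replace('.', '/');
    }

    // FieldType ::= BaseType | ObjectType | ArrayType (V is accepted only for return types).
    static String fieldDescriptor(Class<?> type) {
        if (type.isArray()) return "[" + fieldDescriptor(type.getComponentType());
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == double.class) return "D";
        if (type == float.class) return "F";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == short.class) return "S";
        if (type == boolean.class) return "Z";
        if (type == void.class) return "V";
        return "L" + internalName(type.getName()) + ";";
    }

    // MethodDescriptor ::= ( ParameterDescriptor* ) ReturnDescriptor
    static String methodDescriptor(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) sb.append(fieldDescriptor(p));
        return sb.append(')').append(fieldDescriptor(returnType)).toString();
    }

    public static void main(String[] args) {
        String mine = methodDescriptor(Object.class, int.class, double.class, Thread.class);
        String jdk = MethodType.methodType(Object.class, int.class, double.class, Thread.class)
                               .toMethodDescriptorString();
        System.out.println(mine + " matches JDK: " + mine.equals(jdk));
        System.out.println(fieldDescriptor(int[][].class));
        System.out.println(fieldDescriptor(Object[].class));
    }
}
```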
The Constant Pool
• JVM instructions do not rely on the runtime layout of classes, interfaces, class instances, or arrays.
• Instead, instructions refer to symbolic information in the constant_pool table.
• All constant_pool table entries have the following general format:
cp_info {
    u1 tag;
    u1 info[];
}
Constant Pool Tags
Constant Type                  Value
CONSTANT_Utf8                  1
CONSTANT_Integer               3
CONSTANT_Float                 4
CONSTANT_Long                  5
CONSTANT_Double                6
CONSTANT_Class                 7
CONSTANT_String                8
CONSTANT_Fieldref              9
CONSTANT_Methodref             10
CONSTANT_InterfaceMethodref    11
CONSTANT_NameAndType           12
The CONSTANT_Utf8_info Structure
• Used to represent constant string values.
• UTF-8 encoding (the class file variant, in which the null character takes two bytes):
1. 1 ~ 127 (xxx xxxx)
==> 1 byte: 0xxx xxxx
2. 0 and \u0080 ~ \u07ff (xxx xxxx xxxx)
==> 2 bytes: 110x xxxx 10xx xxxx
0 ==> 1100 0000 1000 0000
3. \u0800 ~ \uffff (xxxx xxxx xxxx xxxx)
==> 3 bytes: 1110 xxxx 10xx xxxx 10xx xxxx
Format
CONSTANT_Utf8_info {
    u1 tag;              // = 1
    u2 length;
    u1 bytes[length];
}
Notes:
• tag = 1.
• length: number of bytes in the encoded UTF-8 string.
• bytes[]: contains the bytes of the string.
• No byte may have the value (byte)0 or lie in the range (byte)0xf0-(byte)0xff.
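One way to see this encoding in practice: java.io.DataOutputStream.writeUTF writes a two-byte length followed by the string in the JDK's "modified UTF-8", the same layout as the length and bytes[] items above. The sketch below dumps those bytes as hex; the class name ModifiedUtf8 and its helpers are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ModifiedUtf8 {
    // Returns the u2 length followed by the encoded bytes, as written by writeUTF.
    static byte[] lengthAndBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(lengthAndBytes("A")));      // 1-byte form
        System.out.println(hex(lengthAndBytes("\0")));     // null character: 2-byte form C0 80
        System.out.println(hex(lengthAndBytes("\u00E9"))); // 2-byte form
        System.out.println(hex(lengthAndBytes("\u4E2D"))); // 3-byte form
    }
}
```

The null character comes out as C0 80, which is why no byte in bytes[] is ever zero.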
The CONSTANT_String_info Structure
CONSTANT_String_info {
    u1 tag;              // = 8
    u2 string_index;
}
• string_index:
» a valid index into the constant_pool table; the entry there must be a CONSTANT_Utf8_info
CONSTANT_Integer_info and CONSTANT_Float_info
CONSTANT_Integer_info {
    u1 tag;              // = 3
    u4 bytes;
}
CONSTANT_Float_info {
    u1 tag;              // = 4
    u4 bytes;
}
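CONSTANT_Long_info and CONSTANT_Double_info
CONSTANT_Long_info {
    u1 tag;              // = 5
    u8 bytes;            // 8 bytes, stored as two u4 halves, high bytes first
}
CONSTANT_Double_info {
    u1 tag;              // = 6
    u8 bytes;            // 8 bytes, stored as two u4 halves, high bytes first
}
• Each of these entries takes up two consecutive constant_pool indices.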
CONSTANT_Class_info

• Used to represent a class or an interface:
CONSTANT_Class_info {
    u1 tag;              // = 7
    u2 name_index;
}
• name_index is an index into a UTF-8 entry encoding a fully qualified class name (in internal form).
• Array types are also included: [[I, [Lcom/Cl;, etc.
CONSTANT_NameAndType_info

• Used to represent a field or method, without indicating which class or interface type it belongs to:
CONSTANT_NameAndType_info {
    u1 tag;                  // = 12
    u2 name_index;           // index into utf8 string of a simple name
    u2 descriptor_index;     // index into utf8 string of a field/method descriptor
}
CONSTANT_Fieldref, CONSTANT_Methodref, and CONSTANT_InterfaceMethodref info
CONSTANT_Fieldref_info {
    u1 tag;                      // = 9
    u2 class_index;              // index into a class_info entry
    u2 name_and_type_index;
}
CONSTANT_Methodref_info {
    u1 tag;                      // = 10
    u2 class_index;
    u2 name_and_type_index;
}
CONSTANT_InterfaceMethodref_info {
    u1 tag;                      // = 11
    u2 class_index;
    u2 name_and_type_index;
}
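Because each entry's size is determined by its tag, a reader must walk the constant pool sequentially. The sketch below does that for the eleven tags listed in these slides and records the tag found at each index. It assumes the input starts at constant_pool_count, rejects any other tag (newer class files define more), and relies on the specification's rule that Long and Double entries occupy two indices. The class name ConstantPoolWalker is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class ConstantPoolWalker {
    // Returns tags[i] = tag of constant_pool[i]; index 0 and the index after a Long/Double stay 0.
    static int[] readTags(byte[] bytesFromPoolCount) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytesFromPoolCount))) {
            int count = in.readUnsignedShort();
            int[] tags = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                tags[i] = tag;
                switch (tag) {
                    case 1:                       // Utf8: u2 length, then that many bytes
                        in.readFully(new byte[in.readUnsignedShort()]);
                        break;
                    case 3: case 4:               // Integer, Float: u4
                        in.readInt();
                        break;
                    case 5: case 6:               // Long, Double: 8 bytes, two pool indices
                        in.readLong();
                        i++;
                        break;
                    case 7: case 8:               // Class, String: one u2 index
                        in.readUnsignedShort();
                        break;
                    case 9: case 10: case 11: case 12: // refs and NameAndType: two u2 indices
                        in.readUnsignedShort();
                        in.readUnsignedShort();
                        break;
                    default:
                        throw new IllegalArgumentException("tag not covered by this sketch: " + tag);
                }
            }
            return tags;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical pool with count = 5: Utf8 "Hi", Class -> #1, Long 42 (uses indices 3 and 4).
        byte[] pool = { 0, 5,  1, 0, 2, 0x48, 0x69,  7, 0, 1,  5, 0, 0, 0, 0, 0, 0, 0, 42 };
        System.out.println(java.util.Arrays.toString(readTags(pool)));
    }
}
```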
Fields




• Each field is described by a field_info structure.
• No two fields in one class file may have the same name and descriptor.
• Format:
field_info {
    u2 access_flags;
    u2 name_index;           // index to utf8 entry for name
    u2 descriptor_index;     // index to utf8 entry for field type
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
Methods
• Each method, including each instance initialization method (<init>) and the class or interface initialization method (<clinit>), is described by a method_info structure.
• Format:
method_info {
    u2 access_flags;
    u2 name_index;           // index into utf8 entry for name
    u2 descriptor_index;     // index into utf8 entry for the method descriptor
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
Attributes
• Attributes are used in the ClassFile, field_info, method_info, and Code_attribute structures of the class file format.
• General Format:
attribute_info {
    u2 attribute_name_index;     // into utf8
    u4 attribute_length;         // excluding the initial 6 bytes
    u1 info[attribute_length];
}
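Since every attribute starts with the same six bytes (name index and length), a reader can load or skip an attribute without understanding its contents. The following is a rough sketch of that idea; RawAttributeReader and RawAttribute are hypothetical names, and the sketch refuses lengths that do not fit in a Java array.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawAttributeReader {
    // An attribute kept in undecoded form: its name index plus the raw info bytes.
    static final class RawAttribute {
        final int nameIndex;
        final byte[] info;

        RawAttribute(int nameIndex, byte[] info) {
            this.nameIndex = nameIndex;
            this.info = info;
        }
    }

    // Reads one attribute_info from the start of the given bytes.
    static RawAttribute read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int nameIndex = in.readUnsignedShort();
            int length = in.readInt();            // u4; negative means larger than this sketch handles
            if (length < 0) {
                throw new IllegalArgumentException("attribute too large for this sketch");
            }
            byte[] info = new byte[length];
            in.readFully(info);                   // attribute_length excludes the 6 header bytes
            return new RawAttribute(nameIndex, info);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical attribute: name index 7, length 2, info = 00 09 (the shape of a SourceFile attribute).
        RawAttribute a = read(new byte[] { 0, 7, 0, 0, 0, 2, 0, 9 });
        System.out.println("name_index=" + a.nameIndex + " length=" + a.info.length);
    }
}
```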

Types of attributes

• class attributes:
» Synthetic, Deprecated, SourceFile
» InnerClasses
• field attributes:
» ConstantValue, Deprecated
• method attributes:
» Code, Deprecated,
» Exceptions
• Code attributes (nested inside a Code attribute):
» LineNumberTable, LocalVariableTable
ConstantValue Attribute

ConstantValue_attribute {
    u2 attribute_name_index;     // index to utf8: "ConstantValue"
    u4 attribute_length;         // = 2
    u2 constantvalue_index;      // into a primitive entry (Integer_info, …)
                                 // or a String_info entry
}
Code Attribute


• A variable-length attribute used in the attributes table of method_info structures.
• A Code attribute contains
» the Java virtual machine instructions and
» auxiliary information
for a single method, instance initialization method, or class or interface initialization method.
• Every JVM implementation must recognize Code attributes.
• If a method is native or abstract, its method_info structure has no Code attribute.
» Otherwise, it has exactly one Code attribute.
Code_attribute {
    u2 attribute_name_index;     // into utf8: "Code"
    u4 attribute_length;         // excludes the initial 6 bytes
    u2 max_stack;                // max # of stack slots needed
    u2 max_locals;               // max # of local variables needed
                                 // Note: double and long count as 2.
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;               // exclusive
        u2 handler_pc;
        u2 catch_type;           // index to a cp entry of type CONSTANT_Class_info;
                                 // zero means the handler is called for all exceptions
                                 // (used to implement the finally clause)
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
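To illustrate how the exception_table fields fit together, here is a simplified handler lookup: it scans the table in order and returns the handler_pc of the first entry whose range [start_pc, end_pc) covers the faulting pc and whose catch_type is zero or equal to the thrown class's constant pool index. This is only a sketch under that assumption; a real JVM resolves catch_type and also accepts subclasses of it. All names in the code are hypothetical.

```java
class ExceptionTableLookup {
    // One exception_table entry, with fields named after the structure above.
    static final class Entry {
        final int startPc, endPc, handlerPc, catchType;

        Entry(int startPc, int endPc, int handlerPc, int catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns handler_pc of the first matching entry, or -1 if the exception is not handled here.
    static int findHandler(Entry[] table, int pc, int thrownClassIndex) {
        for (Entry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;   // end_pc is exclusive
            boolean typeMatches = e.catchType == 0               // 0 catches everything (finally)
                    || e.catchType == thrownClassIndex;          // simplification: exact match only
            if (inRange && typeMatches) {
                return e.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Entry[] table = {
            new Entry(0, 10, 20, 5),   // catch (class at cp index 5) for pcs 0..9
            new Entry(0, 30, 40, 0)    // catch-all for pcs 0..29
        };
        System.out.println(findHandler(table, 4, 5));
        System.out.println(findHandler(table, 4, 9));
        System.out.println(findHandler(table, 30, 5));
    }
}
```

Entry order matters: the first matching entry wins, which is how nested try blocks take priority over enclosing ones.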
The (uncaught checked) Exceptions Attribute
• Appears in the attributes table of a method_info structure.
• Indicates which checked exceptions a method may throw.
• At most one Exceptions attribute in each method_info structure.
• Format:
Exceptions_attribute {
    u2 attribute_name_index;     // into utf8: "Exceptions"
    u4 attribute_length;
    u2 number_of_exceptions;
    u2 exception_index_table[number_of_exceptions];
                                 // each entry an index to a cp entry of type class_info
}
The InnerClasses Attribute


• A variable-length attribute in the attributes table of the ClassFile structure.
• Records any inner class/interface referred to (or declared) in this class/interface.
» If the constant pool of a class or interface refers to any
class or interface that is not a member of a package,
its ClassFile structure must have exactly one
InnerClasses attribute in its attributes table.
» If a class has members that are classes or interfaces,
its constant_pool table (and hence its InnerClasses
attribute) must refer to each such member, even if that
member is not otherwise mentioned by the class.
Format:
InnerClasses_attribute {
    u2 attribute_name_index;         // to utf8: "InnerClasses"
    u4 attribute_length;
    u2 number_of_classes;
    {   u2 inner_class_info_index;   // into class_info, for each inner class C in cp
        u2 outer_class_info_index;   // into class_info, for the containing class of C
        u2 inner_name_index;         // into utf8 for the simple name (no / and no $);
                                     // anonymous => zero
        u2 inner_class_access_flags;
    } classes[number_of_classes];
}
The Synthetic and Deprecated Attributes
• A class member that does not appear in the source code must be marked using a Synthetic attribute.
Synthetic_attribute {
    u2 attribute_name_index;     // utf8("Synthetic")
    u4 attribute_length;         // = 0
}
• The Deprecated attribute has the following format:
Deprecated_attribute {
    u2 attribute_name_index;     // utf8("Deprecated")
    u4 attribute_length;         // = 0
}
The SourceFile Attribute
SourceFile_attribute {
    u2 attribute_name_index;     // into utf8("SourceFile")
    u4 attribute_length;         // = 2
    u2 sourcefile_index;         // into utf8 entry
}
LineNumberTable Attribute

LineNumberTable_attribute {
    u2 attribute_name_index;     // utf8("LineNumberTable")
    u4 attribute_length;
    u2 line_number_table_length;
    {   u2 start_pc;
        u2 line_number;
    } line_number_table[line_number_table_length];
}
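A debugger or stack-trace printer uses this table to map a bytecode offset back to a source line. A minimal sketch, assuming each entry means "the code for line_number begins at start_pc": pick the entry with the largest start_pc that does not exceed the given pc. The class name LineNumbers and the sample table are illustrative, not taken from a real class file.

```java
class LineNumbers {
    // table rows are {start_pc, line_number}; returns the line for pc, or -1 if no entry applies.
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] row : table) {
            int startPc = row[0];
            // Entries need not be sorted, so keep the closest start_pc at or before pc.
            if (startPc <= pc && startPc > bestStart) {
                bestStart = startPc;
                line = row[1];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        // Hypothetical table: pcs 0-1 belong to line 3, pcs 2-4 to line 4, and so on.
        int[][] table = { { 0, 3 }, { 2, 4 }, { 5, 5 }, { 22, 7 } };
        System.out.println(lineFor(table, 4));
        System.out.println(lineFor(table, 30));
    }
}
```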
The LocalVariableTable Attribute

LocalVariableTable_attribute {
    u2 attribute_name_index;     // utf8("LocalVariableTable")
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;
        u2 length;               // scope: the variable must have a value
                                 // in [start_pc, start_pc + length]
        u2 name_index;           // utf8 for var name
        u2 descriptor_index;     // utf8 for type
        u2 index;                // local var index
    } local_variable_table[local_variable_table_length];
}