CS 432: Compiler Construction
Lecture 15
Department of Computer Science
Salisbury University
Fall 2016
Instructor: Dr. Sophie Wang
http://faculty.salisbury.edu/~xswang
5/1/2017
About the Java Technology

The Java Programming Language

The Java Platform


JVM
API
Java Programming Language




All source code is written in text files with .java extension.
Source files are compiled into .class files by javac
compiler.
A .class file contains bytecodes — the machine language
of the Java VM.
java launcher runs application on Java VM.
Compile once, run everywhere

JVM is available on different
operating systems

The same .class files are
capable of running on
Microsoft Windows, the
Solaris TM Operating System
(Solaris OS), Linux, or Mac
OS.

Some virtual machines (e.g., the Java HotSpot VM) do more at run time:
• find performance bottlenecks
• recompile frequently used sections of code to native code
The Java Platform

A platform is the hardware or software
environment in which a program runs.
• Examples: Microsoft Windows, Linux, Solaris OS, and Mac OS.
Most platforms can be described as a
combination of the operating system and
underlying hardware.
The Java platform is a software-only platform that
runs on top of other hardware-based platforms.
The Java platform consists of
• The Java Virtual Machine
• The Java Application Programming Interface (API)
Abstract Machines
An abstract machine is intended specifically as a runtime
system for a particular (kind of) programming language.
• JVM is a virtual machine for Java programs.
• It directly supports object-oriented concepts such as
classes, objects, methods, method invocation etc
• Advantage: portability
Class Files and Class File Format
[Figure: the JVM loads .class files (external representation, platform independent)
and builds an internal representation (implementation dependent) of classes,
primitive types, arrays, objects, strings, and methods.]
• The JVM specification does not give implementation details (these can depend
on the target OS/platform, performance requirements, etc.)
• The JVM specification defines a machine-independent “class file format” that
all JVM implementations must support.
Example: out.j
.class public out
.super java/lang/Object
.method public <init>()V
aload_0
invokespecial java/lang/Object/<init>()V
return
.end method
.method public static main([Ljava/lang/String;)V
.limit stack 2
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "Hello World"
invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
return
.end method
The result: out.class
Class File
• Table of constants.
• Tables describing the class
    • name, superclass, interfaces
    • attributes, constructor
• Tables describing fields and methods
    • name, type/signature
    • attributes (private, public, etc)
• The code for methods.
Basic Block diagram of JVM
[Figure: inside the Java Virtual Machine, the Class Loader takes your program’s
class files and the Java API’s class files and passes bytecodes to the Execution
Engine, which reaches the Host Operating System through native method invocation.]
Architecture of Java Virtual Machine
[Figure: class files enter the Class Loader Subsystem, which fills the Run-time
Data Areas (Method Area, Heap, Java Stacks, PC Registers, Native Method Stacks).
The Execution Engine works on these areas and uses the Native Method Interface
to reach the native method library.]
The Class Loader Subsystem

The class loader performs three main functions of JVM,
namely: loading, linking and initialization
[Figure: Loading → Linking → Initialization]
Loading, Linking, Initialization

Loading

Finding the binary form of class or interface type by reading from
disk or over network

Parsing it to get its information, and

Storing the information in the method area.

Linking : “binary form → runtime state form”

Verification : ensuring the correctness of the imported type

Preparation : allocating memory for class variables

Resolution : symbolic reference → direct reference
Initialization

Invoking Java code that initializes class variables

Class Loading Process

Loading means reading the class file for a type, parsing it to get its
information, and storing the information in the method area.

For each type it loads, the JVM must store the following information in the
method area:

The fully qualified name of the type

The fully qualified name of the type's direct superclass or, if the type is an
interface, a list of its direct superinterfaces.

Whether the type is a class or an interface

The type's modifiers ( public, abstract, final, etc)

Constant pool for the type: constants and symbolic references.

Field info : name, type and modifiers of variables (not constants)

Method info: name, return type, number & types of parameters, modifiers,
bytecodes, size of stack frame and exception table.

The end of the loading process is the creation of an instance of java.lang.Class for
the loaded type.

The purpose is to give access to some of the information captured in the method
area for the type, to the programmer.

Some of the methods of the class java.lang.Class are:
public String getName()
public Class getSuperclass()
public boolean isInterface()
public Class[] getInterfaces()
public Method[] getMethods()
public Field[] getFields()
public Constructor[] getConstructors()

Note that for any loaded type T, only one instance of java.lang.Class is created
even if T is used several times in an application.

To use the above methods, we need to first call the getClass() method on any
instance of T to get the reference to the Class instance for T.
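
As an illustration of the methods listed above, here is a minimal sketch. The class name ClassInfoDemo and the helper superclassNameOf are hypothetical and not part of the lecture; only the java.lang.Class methods themselves come from the Java API.

```java
// Hypothetical demo class; the name is an assumption made for this sketch.
class ClassInfoDemo {
    // Returns the name of the direct superclass of the object's runtime type.
    // Assumes obj is not a plain java.lang.Object (whose superclass is null).
    static String superclassNameOf(Object obj) {
        Class<?> info = obj.getClass();   // reference to the Class instance for the type
        return info.getSuperclass().getName();
    }

    public static void main(String[] args) {
        String name = "Ahmed";
        Class<?> info = name.getClass();
        System.out.println("Name:       " + info.getName());
        System.out.println("Superclass: " + superclassNameOf(name));
        System.out.println("Interface?  " + info.isInterface());
        // Two different String objects share one Class instance.
        System.out.println("Same Class? " + (info == "other".getClass()));
    }
}
```

Comparing the two Class references with == works here because only one Class instance exists per loaded type.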
The Class Loader Subsystem

The linking process consists of three sub-tasks, namely,
verification, preparation, and resolution
[Figure: Loading → Linking (Verification → Preparation → Resolution) → Initialization]
Verification During Linking Process

Verification ensures that binary representation of a class is
structurally correct

The JVM has to make sure that a file it is asked to load was
generated by a valid compiler and it is well formed

Why?

No guarantee that the class file was generated by a Java
compiler

Class B may be a valid sub-class of A at the time A and B
were compiled, but class A may have been changed and recompiled

Enhance runtime performance

Example of some of the things that are checked at verification are:

Every method is provided with a structurally correct signature

Every instruction obeys the type discipline of the Java
language

Every branch instruction branches to the start not middle of
another instruction

There are no operand stack overflows or underflows.
All local variable uses and stores are valid.
The arguments to all the Java Virtual Machine instructions are
of valid types


Verification Process


Pass 1 – when the class file is loaded

The file is properly formatted, and all its data is recognized by the JVM
Pass 2 – when the class file is linked

All checks that do not involve instructions

final classes are not subclassed, final methods are not overridden.

Every class (except Object) has a superclass.

All field references and method references in the constant pool
have valid names, valid classes, and a valid type descriptor.

Pass 3 – still during linking

Data-flow analysis on each method . Ensure that at any given point
in the program, no matter what code path is taken to reach that
point:

The operand stack is always the same size and contains the
same types of objects.

No local variable is accessed unless it is known to contain a
value of an appropriate type.

Methods are invoked with the appropriate arguments.

Fields are assigned only using values of appropriate types.

All opcodes have appropriate type arguments on the operand
stack and in the local variables

Pass 4 - the first time a method is actually invoked

a virtual pass whose checking is done by JVM instructions

The referenced method or field exists in the given class.

The currently executing method has access to the referenced
method or field.
Preparation

In this phase, the JVM allocates memory for the class (i.e., static) variables
and sets them to default initial values.

Note that class variables are not initialized to their proper initial values until
the initialization phase - no java code is executed until initialization.

The default values for the various types are shown below:
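
The defaults can be checked from Java itself. The sketch below (DefaultValuesDemo is a hypothetical name) declares one static field per type without an initializer and prints what each one holds: zero for the numeric types and char, false for boolean, and null for references.

```java
// Hypothetical demo class: static fields with no initializer keep the
// default values assigned to them during preparation.
class DefaultValuesDemo {
    static byte b;
    static short s;
    static int i;
    static long l;
    static float f;
    static double d;
    static char c;
    static boolean z;
    static Object ref;

    public static void main(String[] args) {
        System.out.println("byte          " + b);
        System.out.println("short         " + s);
        System.out.println("int           " + i);
        System.out.println("long          " + l);
        System.out.println("float         " + f);
        System.out.println("double        " + d);
        System.out.println("char (as int) " + (int) c);
        System.out.println("boolean       " + z);
        System.out.println("reference     " + ref);
    }
}
```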
Resolution



Resolution is the process of replacing symbolic names for types, fields and methods
used by a loaded type with their actual references.
Symbolic references are resolved into direct references by searching through the
method area to locate the referenced entity.
For the class below, at the loading phase, the class loader would have loaded the
classes: TestClassClass, String, System and Object.
public class TestClassClass {
    public static void main(String[] args) {
        String name = new String("Ahmed");
        Class nameClassInfo = name.getClass();
        System.out.println("Parent is: " + nameClassInfo.getSuperclass());
    }
}


The names of these classes would have been stored in the constant pool for
TestClassClass.
In this phase, the names are replaced with their actual references.
Class Initialization

This is the process of setting class variables to their proper initial values - initial values
desired by the programmer.
class Example1 {
    static double rate = 3.5;
    static int size = 3*(int)(Math.random()*5);
    ...
}
Initialization of a class consists of two steps:

Initializing its direct superclass (if any and if not already initialized)

Executing its own initialization statements

The above implies that the first class to be initialized is Object.

Note that static final variables initialized with compile-time constant expressions
are treated as constants rather than class variables, and are assigned their values
at compilation.

class Example2 {
    static final int angle = 35;
    static final int length = angle * 2;
    ...
}
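
The two initialization steps can be observed with a small sketch. All names here (InitOrderDemo, Base, Derived, record) are hypothetical. Each class logs its name from a static initializer, and the first use of a static field of Derived triggers initialization of Base before Derived.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: records the order in which classes are initialized.
class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static int rate = record("Base");        // runs when Base is initialized
    }

    static class Derived extends Base {
        static int size = record("Derived");     // runs when Derived is initialized
    }

    // Appends the class name to the log and returns the log size so far.
    static int record(String who) {
        LOG.add(who);
        return LOG.size();
    }

    // Reading Derived.size is the first active use of Derived.
    static List<String> run() {
        int unused = Derived.size;
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints [Base, Derived]
    }
}
```

Calling run() a second time leaves the log unchanged, since a class is initialized only once.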

After a class is loaded, linked, and initialized, it is ready for use. Its static
fields and static methods can be used and it can be instantiated.

When a new class instance is created, memory is allocated for all its
instance variables in the heap.

Memory is also allocated recursively for all the instance variables
declared in its superclass and all classes up its inheritance hierarchy.

All instance variables in the new object and those of its superclasses are
then initialized to their default values.

The constructor invoked in the instantiation is then processed according
to the rules shown on the next page.

Finally, the reference to the newly created object is returned as the result.
Rules for processing a constructor:
1. Assign the arguments for the constructor to its parameter variables.
2. If this constructor begins with an explicit invocation of another constructor in
   the same class (using this), then evaluate the arguments and process that
   constructor invocation recursively.
3. If this constructor is for a class other than Object, then it will begin with an
   explicit or implicit invocation of a superclass constructor (using super).
   Evaluate the arguments and process that superclass constructor
   invocation recursively.
4. Initialize the instance variables for this class with their proper values.
5. Execute the rest of the body of this constructor.
Class Instantiation Example
class GrandFather {
    int grandy = 70;
    public GrandFather(int grandy) {
        this.grandy = grandy;
        System.out.println("Grandy: "+grandy);
    }
}
class Father extends GrandFather {
    int father = 40;
    public Father(int grandy, int father) {
        super(grandy);
        this.father = father;
        System.out.println("Grandy: "+grandy+" Father: "+father);
    }
}
class Son extends Father {
    int son = 10;
    public Son(int grandy, int father, int son) {
        super(grandy, father);
        this.son = son;
        System.out.println("Grandy: "+grandy+" Father: "+father+" Son: "+son);
    }
}
public class Instantiation {
    public static void main(String[] args) {
        Son s = new Son(65, 35, 5);
    }
}
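
Following the constructor rules, the three println calls in the program above run from the top of the hierarchy down: "Grandy: 65", then "Grandy: 65 Father: 35", then "Grandy: 65 Father: 35 Son: 5". The sketch below makes the ordering of rules 3 to 5 visible for a two-level hierarchy. All names (ConstructionOrderDemo, Parent, Child, note) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: traces field initializers and constructor bodies.
class ConstructionOrderDemo {
    static final List<String> TRACE = new ArrayList<>();

    static class Parent {
        int p = note("Parent field initializer");     // rule 4 for Parent
        Parent() {
            note("Parent constructor body");          // rule 5 for Parent
        }
    }

    static class Child extends Parent {
        int c = note("Child field initializer");      // rule 4 for Child
        Child() {
            super();                                  // rule 3: superclass first
            note("Child constructor body");           // rule 5 for Child
        }
    }

    // Records one step and returns a dummy field value.
    static int note(String step) {
        TRACE.add(step);
        return 0;
    }

    // Creates one Child and returns a copy of the recorded steps.
    static List<String> run() {
        TRACE.clear();
        new Child();
        return new ArrayList<>(TRACE);
    }

    public static void main(String[] args) {
        for (String step : run()) {
            System.out.println(step);
        }
    }
}
```

The trace lists the Parent steps before any Child step, and within each class the field initializer before the constructor body.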
JVM Memory Model

Stacks (no single stack, since we have threads)
• Usually organized as a linked list
• Elements: method frames, which include
    • Local Variables Array (LVA)
    • Operand stack (OS)
• Accessible by push/pop instructions.
• Garbage collected
Heap
• All objects and all arrays
• No object is allocated in the stack
• Each object is associated with a class stored in the method area
• Garbage collected.

Method area:

The fully qualified name of the type

The fully qualified name of the type's direct superclass or, if the type is
an interface, a list of its direct superinterfaces.

Whether the type is a class or an interface

The type's modifiers ( public, abstract, final, etc)

Constant pool for the type: constants and symbolic references.

Field info : name, type and modifiers of variables (not constants)

Method info: name, return type, number & types of parameters,
modifiers, bytecodes, size of stack frame and exception table.

Garbage collected
JVM: Runtime Data Areas
Besides OO concepts, JVM also supports multi-threading.
Threads are directly supported by the JVM.
Method Area


is analogous to the storage area for compiled code, such as the ‘text’
segment of a UNIX process
Type information

The fully qualified name of the type

The fully qualified name of the type's direct superclass or, if the type
is an interface, a list of its direct superinterfaces.

Whether the type is a class or an interface

The type's modifiers ( public, abstract, final, etc)

Constant pool for the type: constants and symbolic references.

Field info : name, type and modifiers of variables (not constants)

Method info: name, return type, number & types of parameters,
modifiers, bytecodes, size of stack frame and exception table.

Garbage collected
Heap




• A Java application runs inside its own exclusive JVM instance.
• There is only one heap inside a JVM instance.
• Whenever a new object (class instance) is created in a running Java
application, the memory for the new object is allocated from a single heap.
    • All threads in a Java application share the heap.
• The JVM itself is responsible for deciding whether and when to free memory
occupied by objects that are no longer referenced by the running application.
    • This is known as “garbage collection”.
PC (Program Counter) Register
• Each thread of a running program has its own PC register.
• The PC register is created when the thread is started.
• One word in size.
• As a thread executes a Java method, the PC register contains the address of
the instruction currently being executed by the thread.
• If a thread is executing a native method, the value of the PC register is
undefined.
Java Stacks
• JVM is a stack based machine.
• JVM instructions
• implicitly take arguments from the stack top
• put their result on the top of the stack
• The stack is used to
• pass arguments to methods
• return a result from a method
• store intermediate results while evaluating expressions
• store local variables
Stack Frames
• The Java stack consists of frames.
• The JVM specification does not say exactly how the stack
and frames should be implemented.
• A new call frame is created by executing some JVM
instruction for invoking a method
Stack frame
A stack frame has three parts:
• Local variables section: the method’s parameters and local variables
    • Types int, float, reference, and returnAddress occupy one entry
    • Types long and double occupy two consecutive entries
    • Types byte, short, and char are converted to int before being stored
      in the local variables
• Operand stack: an array of words
    • The JVM uses the operand stack as a work space
• Frame data
    • Includes data to support constant pool resolution, normal method
      return, and exception dispatch
Stack Frames
• pointer to constant pool: used implicitly when executing JVM
instructions that contain entries into the constant pool.
• args + local vars: space where the arguments and local variables
of a method are stored. This includes a space for
the receiver (this) at position/offset 0.
• operand stack: stack for storing intermediate results during the
execution of the method.
    • Initially it is empty.
    • The maximum depth is known at compile time.
Method parameters on the local variables section
public static int runClassMethod(int i, long l, float f, double d, Object o, byte b) {
    return 0;
}
public int runInstanceMethod(char c, double d, short s, boolean b) {
    return 0;
}

runClassMethod()
index   type        parameter
0       int         int i
1       long        long l
3       float       float f
4       double      double d
6       reference   Object o
7       int         byte b

runInstanceMethod()
index   type        parameter
0       reference   hidden this
1       int         char c
2       double      double d
4       int         short s
5       int         boolean b
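
The index column above follows a simple rule: instance methods reserve slot 0 for this, and each long or double parameter takes two slots. The sketch below recomputes both tables. SlotLayoutDemo and layout are hypothetical names, and parameter types are passed as plain Java type names for simplicity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that assigns local variable slots to method parameters.
class SlotLayoutDemo {
    // Maps each parameter name to its index in the local variables section.
    // isStatic says whether the method lacks the hidden "this" in slot 0.
    static Map<String, Integer> layout(boolean isStatic, String[] names, String[] types) {
        Map<String, Integer> slots = new LinkedHashMap<>();
        int next = 0;
        if (!isStatic) {
            slots.put("this", next++);          // hidden receiver
        }
        for (int k = 0; k < names.length; k++) {
            slots.put(names[k], next);
            boolean wide = types[k].equals("long") || types[k].equals("double");
            next += wide ? 2 : 1;               // long/double occupy two entries
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(layout(true,
                new String[] {"i", "l", "f", "d", "o", "b"},
                new String[] {"int", "long", "float", "double", "Object", "byte"}));
        System.out.println(layout(false,
                new String[] {"c", "d", "s", "b"},
                new String[] {"char", "double", "short", "boolean"}));
    }
}
```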
Java Stack
Adding two local variables
iload_0     // push the int in local variable 0
iload_1     // push the int in local variable 1
iadd        // pop two ints, add them, push result
istore_2    // pop int, store into local variable 2

                    local variables     operand stack
                    0     1     2       (top at right)
before starting     100   98    -       (empty)
after iload_0       100   98    -       100
after iload_1       100   98    -       100 98
after iadd          100   98    -       198
after istore_2      100   98    198     (empty)
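
The four-instruction sequence can be replayed with a toy interpreter. This is only a sketch that understands these four mnemonics. TinyStackMachine and run are hypothetical names and have nothing to do with how a real JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter for the four instructions used above.
class TinyStackMachine {
    // Executes the program against the given local variables array
    // and returns that array after the last instruction.
    static int[] run(int[] locals, String... program) {
        Deque<Integer> stack = new ArrayDeque<>();   // operand stack
        for (String insn : program) {
            switch (insn) {
                case "iload_0":
                    stack.push(locals[0]);
                    break;
                case "iload_1":
                    stack.push(locals[1]);
                    break;
                case "iadd":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "istore_2":
                    locals[2] = stack.pop();
                    break;
                default:
                    throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run(new int[] {100, 98, 0},
                "iload_0", "iload_1", "iadd", "istore_2");
        System.out.println(locals[2]);   // prints 198
    }
}
```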
Instruction-set: typed instructions!
• JVM instructions are explicitly typed: different opCodes for
instructions for integers, floats, arrays and reference types.
• This is reflected by a naming convention in the first letter of
the opCode mnemonics:
    iload   integer load
    lload   long load
    fload   float load
    dload   double load
    aload   reference-type load
• This is the first successful attempt to bring type safety to a
lower level language
Type Checking Strategies





• None (e.g., PDP11 assembly)
• Compile time only (e.g., C++)
• Runtime only (e.g., Smalltalk)
• Compile time and runtime (e.g., C#)
• Load time: JVM
Rationale:
• No compilation process
• Any hacker can mess with the bytecodes.
Loading Constants onto the Operand Stack

Use the instructions ldc and ldc2_w
(load constant and load double-word constant)
to push constant values onto the operand stack.

Examples:
ldc 2
ldc "Hello, world"
ldc 1.0
ldc2_w 1234567890L
ldc2_w 2.7182818284D
aconst_null ; push null
Loading Constants, cont’d


Special shortcuts for loading certain small constants x:
iconst_m1   ; Push int -1
iconst_x    ; Push int x, x = 0, 1, 2, 3, 4, or 5
lconst_x    ; Push long x, x = 0 or 1
fconst_x    ; Push float x, x = 0, 1, or 2
dconst_x    ; Push double x, x = 0 or 1
bipush x    ; Push byte x, -128 <= x <= 127
sipush x    ; Push short x, -32,768 <= x <= 32,767
Shortcut instructions take up less memory
and can execute faster.
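
A code generator can pick the shortest instruction for an int constant by checking the ranges listed above. The following is a minimal sketch of that choice. PushIntDemo and pushInt are hypothetical names, and the result is just the Jasmin text of the instruction.

```java
// Hypothetical helper: chooses the most compact instruction for an int constant.
class PushIntDemo {
    static String pushInt(int x) {
        if (x == -1) {
            return "iconst_m1";
        }
        if (x >= 0 && x <= 5) {
            return "iconst_" + x;
        }
        if (x >= -128 && x <= 127) {
            return "bipush " + x;          // fits in a byte
        }
        if (x >= -32768 && x <= 32767) {
            return "sipush " + x;          // fits in a short
        }
        return "ldc " + x;                 // general case: constant pool entry
    }

    public static void main(String[] args) {
        int[] samples = {-1, 3, 14, 1000, 100000};
        for (int x : samples) {
            System.out.println(pushInt(x));
        }
    }
}
```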
Local Variables

• Local variables do not have names in Jasmin.
    • Fields of a class do have names, which we’ll see later.
• Refer to a local variable by its slot number
in the local variables array.
• Since each long and double value requires
two consecutive slots, refer to it using the lower slot number.
• Examples:
iload 5     ; Push the int value in local slot #5
lstore 3    ; Pop the long value
            ; from the top two stack elements
            ; and store it into local slots #3 and 4
Local Variables, cont’d

• Do not confuse constant values with slot numbers!
    • It depends on the instruction.
• Examples:
bipush 14 ; push the constant value 14
iload 14 ; push the value in local slot #14
• Local variables starting with slot #0 are automatically initialized
to any method arguments.
public static double meth(int k, long m,
                          float x, String[][] s)
k → local slot #0
m → local slot #1
What happened to slot #2?
x → local slot #3
s → local slot #4
• Jasmin method signature:
.method public static meth(IJF[[Ljava/lang/String;)D
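
The descriptor string in the Jasmin signature can be cross-checked from Java with the standard java.lang.invoke.MethodType class. The sketch below rebuilds the descriptor for meth from its return and parameter types. DescriptorDemo and descriptorOfMeth are hypothetical names.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo: derives the JVM method descriptor for
// double meth(int, long, float, String[][]).
class DescriptorDemo {
    static String descriptorOfMeth() {
        MethodType type = MethodType.methodType(
                double.class,                       // return type  -> D
                int.class,                          // k            -> I
                long.class,                         // m            -> J
                float.class,                        // x            -> F
                String[][].class);                  // s            -> [[Ljava/lang/String;
        return type.toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptorOfMeth());     // prints (IJF[[Ljava/lang/String;)D
    }
}
```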
Load and Store Instructions

In general:
iload n     ; push the int value in local slot #n
lload n     ; push the long value in local slot #n
fload n     ; push the float value in local slot #n
dload n     ; push the double value in local slot #n
aload n     ; push the reference in local slot #n
Shortcut examples (for certain small values of n):
iload_0     ; push the int value in local slot #0
lload_2     ; push the long value in local slot #2
fload_1     ; push the float value in local slot #1
dload_3     ; push the double value in local slot #3
aload_2     ; push the reference in local slot #2
Store instructions are similar.
Instruction-set:
arguments and locals area inside a stack frame
[Figure: slots 0:, 1:, 2:, 3:, ... of the args/locals area]
args: indexes 0 .. #args-1
locals: indexes #args .. #args+#locals-1
Instruction examples:
iload_1
istore_1
iload_3
astore_1
aload 5
fstore_3
aload_0
• A load instruction loads something
from the args/locals area to the top
of the operand stack.
• A store instruction takes something
from the top of the operand stack
and stores it in the args/locals
area.
Instruction-set: non-local memory access
In the JVM, the contents of different “kinds” of memory can be
accessed by different kinds of instructions.
accessing locals and arguments: load and store
accessing fields in objects: getfield, putfield
accessing static fields: getstatic, putstatic
Note: static fields are a lot like global variables. They are
allocated in the “method area” where also code for methods
and representations for classes are stored.
Q: what memory area are getfield and putfield accessing?
Instruction-set: operations on numbers
Arithmetic
add: iadd, ladd, fadd, dadd
subtract: isub, lsub, fsub, dsub
multiply: imul, lmul, fmul, dmul
…
Conversion
i2l, i2f, i2d
l2f, l2d, i2s
f2i, d2i, …
Instruction-set …
Operand stack manipulation
pop, dup, swap, …
Control transfer
goto, ifeq, iflt, ifgt, if_icmpeq, if_acmpeq, ifnull, …
Instruction-set …
Method invocation:
invokevirtual
    usual instruction for calling a method on an object.
invokeinterface
    like invokevirtual, but used when the called method is declared in an interface.
    (requires a different kind of method lookup)
invokespecial
    for calling things such as constructors. These are not dynamically
    dispatched (also known as invokenonvirtual)
invokestatic
    for calling methods that have the “static” modifier (these methods
    “belong” to a class, rather than an object)
Returning from methods:
return, ireturn, lreturn, areturn, freturn, …
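
As a rough guide to where each invocation instruction shows up, the sketch below annotates ordinary Java calls with the instruction that javac typically emits for them. The names (InvokeKindsDemo, Shape, Square, twice, demo) are hypothetical, and the comments are annotations, not compiler output. Running javap -c on the compiled class is the way to confirm them.

```java
// Hypothetical demo: each call is annotated with the invoke instruction
// a Java compiler typically emits for it.
class InvokeKindsDemo {
    interface Shape {
        double area();
    }

    static class Square implements Shape {
        final double side;

        Square(double side) {       // constructor body starts with
            this.side = side;       // invokespecial java/lang/Object/<init>()V
        }

        public double area() {
            return side * side;
        }
    }

    static double twice(double x) {
        return 2 * x;
    }

    static double demo() {
        Square sq = new Square(3.0);   // new, then invokespecial Square.<init>
        Shape sh = sq;
        double a = sq.area();          // invokevirtual: static type is a class
        double b = sh.area();          // invokeinterface: static type is an interface
        return twice(a + b);           // invokestatic
    }

    public static void main(String[] args) {
        System.out.println(demo());    // prints 36.0
    }
}
```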
Instruction-set: Heap Memory Allocation
Create new class instance (object):
new
Create new array:
newarray
    for creating arrays of primitive types.
anewarray
    for arrays of reference types
multianewarray
    for multidimensional arrays
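
The three array instructions correspond to three kinds of Java array creation expressions. In the sketch below (ArrayAllocDemo and lengths are hypothetical names), the comments name the instruction a Java compiler typically emits for each expression.

```java
// Hypothetical demo: array creation expressions and the allocation
// instruction typically emitted for each.
class ArrayAllocDemo {
    static int[] lengths() {
        int[] prim = new int[3];          // newarray int
        String[] refs = new String[2];    // anewarray java/lang/String
        int[][] grid = new int[2][4];     // multianewarray [[I 2
        return new int[] {prim.length, refs.length, grid.length, grid[1].length};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(lengths()));   // prints [3, 2, 2, 4]
    }
}
```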
Compiling for the Java Virtual Machine
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-3.html