Slide 1
Decompiling Java Using Staged Encapsulation
Welcome!
Authors:
1. Jerome Miecznikowski (decompiler writer)
2. Laurie Hendren (decompiler writer’s supervisor)
What is Dava?
Dava is our Java decompiler.
The focus is to reclaim Java control flow structure from
Java bytecode.
Slide 2
Overview of Presentation
1. General Goals.
2. Why restructuring bytecode to Java is challenging.
3. Overview of Restructuring algorithm.
4. Walk through Example.
5. Advanced issues.
6. Testing and Results.
7. Conclusions and future work.
Slide 3
General Goals.
• Handle any and all sources of bytecode.
• Output should look natural.
• “At all costs” restructuring.
• If a restructuring is possible, there are many correct restructurings.
• Use a low number of control-flow grammar productions.
• Good runtime performance. (Results are near linear.)
Slide 4
Why restructuring to Java is interesting.
Java’s interesting properties:
1. No gotos, so there’s no easy fall-back solution.
2. Exceptions.
3. Multi-level breaks and continues.
4. Labeled control flow statements, and labeled blocks.

    label_0:
    while (a()) {
        while (b()) {
            if (c())
                break label_0;
        }
        d(); // missed by break
    }
Slide 5
Why restructuring to Java is interesting.
Java’s interesting properties:
• No gotos, so there’s no easy fall-back solution.
• Exceptions.
• Multi-level breaks and continues.
• Labeled control flow statements, and labeled blocks.

    label_0:
    while (a()) {
        while (b()) {
            if (c())
                break label_0;
        }
        d(); // missed by break
    }

    label_0:
    {
        while (b()) {
            if (c())
                break label_0;
        }
        d(); // missed again
    }
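The two labeled forms above can be tried out with a small runnable sketch. The slide's a(), b(), c() and d() are placeholders, so this version swaps in simple counters and records a trace string; the class name, method names and the limit parameter are assumptions made for illustration only.

```java
// Illustrative sketch: counters stand in for the slide's a(), b(), c(), d().
class LabeledBreakDemo {

    // Labeled loop: "break label_0" leaves BOTH loops, so the statement
    // after the inner loop is skipped on that path.
    static String labeledLoop(int limit) {
        StringBuilder trace = new StringBuilder();
        int outer = 0;
        label_0:
        while (outer < 3) {
            int inner = 0;
            while (inner < 3) {
                trace.append('b');
                if (inner == limit) {
                    break label_0;      // exits the outer loop as well
                }
                inner++;
            }
            trace.append('d');          // missed by the break
            outer++;
        }
        return trace.toString();
    }

    // Labeled block: the label names a plain block, not a loop.
    static String labeledBlock(int limit) {
        StringBuilder trace = new StringBuilder();
        label_0:
        {
            int inner = 0;
            while (inner < 3) {
                trace.append('b');
                if (inner == limit) {
                    break label_0;      // jumps past the end of the block
                }
                inner++;
            }
            trace.append('d');          // missed again
        }
        trace.append('.');              // first statement after the block
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(labeledLoop(1));   // break taken
        System.out.println(labeledLoop(5));   // break never taken
        System.out.println(labeledBlock(1));  // break taken
        System.out.println(labeledBlock(5));  // break never taken
    }
}
```

When the break fires, no 'd' appears in the trace: the labeled break is the only way to express this jump in Java source, which is why the decompiler has to synthesize labels.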
Slide 6
Overview of Restructuring algorithm.
Pipeline: CLASS -> GRIMP -> CFG -> SET -> JAVA
Original Java class file.

    Method int exampleMethod(int, int)
       0 goto 28
       3 iload_1
       4 iload_2
       5 idiv
       6 iconst_2
       7 if_icmpne 13
      10 goto 41
      13 iinc 1 1
      16 goto 28
      19 astore_3
      20 getstatic #3 <Field java.io.PrintStream out>
      23 ldc #4 <String "div by 0">
      25 invokevirtual #5 <Method void println(java.lang.String)>
      28 iload_1
      29 bipush 10
      31 if_icmplt 3
      34 iinc 2 -1
      37 iload_2
      38 ifgt 28
      41 iload_1
      42 ireturn
    Exception table:
       from   to   target   type
          3   16       19   <Class java.lang.ArithmeticException>
Slide 7
Overview of Restructuring algorithm.
Pipeline: CLASS -> GRIMP -> CFG -> SET -> JAVA
Resulting Grimp.

    public int exampleMethod(int, int)
    {
        exampleClass r0;
        int i0, i1;
        java.lang.ArithmeticException r1, $r2;

        r0 := @this;
        i0 := @parameter0;
        i1 := @parameter1;
        goto label4;
    label0:
        if i0 / i1 != 2 goto label1;
        goto label5;
    label1:
        i0 = i0 + 1;
    label2:
        goto label4;
    label3:
        $r2 := @caughtexception;
        r1 = $r2;
        java.lang.System.out.println("div by 0");
    label4:
        if i0 < 10 goto label0;
        i1 = i1 + -1;
        if i1 > 0 goto label4;
    label5:
        return i0;
        catch java.lang.ArithmeticException from label0 to label2 with label3;
    }
Slide 8
Overview of Restructuring algorithm.
Pipeline: CLASS -> GRIMP -> CFG -> SET -> JAVA
Resulting CFG.
[Figure: the Grimp statements of exampleMethod tagged with node labels a to l, beside the control flow graph built from nodes a to l]
Slide 9
Overview of Restructuring algorithm.
Pipeline: CLASS -> GRIMP -> CFG -> SET -> JAVA
Resulting Structure Encapsulation Tree
[Figure: control flow graph (nodes a to l) beside the SET. SET nodes shown: Top Node, Stmt Sequence, do-while Statement, while Statement, try Statement, if Stmt, and several Stmt Seq nodes]
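A minimal sketch of how a nesting tree like the SET could be represented, assuming each SET node encapsulates a set of CFG nodes and a child's set must lie inside its parent's. The class SetNode, its insert rule and the render format are hypothetical, not Dava's code; the node sets are the ones that appear in the extracted slide text and may be incomplete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical SET node: a construct kind plus the CFG nodes it encapsulates.
class SetNode {
    final String kind;
    final Set<String> body;
    final List<SetNode> children = new ArrayList<>();

    SetNode(String kind, String... cfgNodes) {
        this.kind = kind;
        this.body = new TreeSet<>(Arrays.asList(cfgNodes));
    }

    // Insert a construct somewhere below this node.
    // Returns false when the new body does not nest cleanly.
    boolean insert(SetNode fresh) {
        if (!body.containsAll(fresh.body)) {
            return false;                        // not encapsulated by this node
        }
        List<SetNode> enclosed = new ArrayList<>();
        for (SetNode child : children) {
            if (child.body.containsAll(fresh.body)) {
                return child.insert(fresh);      // belongs deeper in the tree
            }
            if (fresh.body.containsAll(child.body)) {
                enclosed.add(child);             // fresh wraps this child
            } else if (!Collections.disjoint(child.body, fresh.body)) {
                return false;                    // partial overlap: does not nest
            }
        }
        children.removeAll(enclosed);
        fresh.children.addAll(enclosed);
        children.add(fresh);
        return true;
    }

    // Compact textual form, e.g. top(do-while(while))
    String render() {
        StringBuilder sb = new StringBuilder(kind);
        if (!children.isEmpty()) {
            sb.append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(children.get(i).render());
            }
            sb.append(')');
        }
        return sb.toString();
    }

    // Node sets as read from the slides; constructs are added stage by stage.
    static SetNode example() {
        SetNode top = new SetNode("top",
                "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l");
        top.insert(new SetNode("do-while", "k", "i", "d", "e", "f", "g", "h", "j"));
        top.insert(new SetNode("while", "i", "d", "e", "f", "g", "h"));
        top.insert(new SetNode("if", "d", "e"));
        top.insert(new SetNode("try", "d", "e", "f"));
        return top;
    }

    public static void main(String[] args) {
        SetNode top = example();
        System.out.println(top.render());
        // A body that straddles the do-while boundary is rejected.
        System.out.println(top.insert(new SetNode("bad", "j", "l")));
    }
}
```

Note that the try node is inserted after the if node yet ends up as its parent: placement depends only on set containment, not on insertion order or position in the code.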
Slide 10
Overview of Restructuring algorithm.
Pipeline: CLASS -> GRIMP -> CFG -> SET -> JAVA
Resulting Java Source
[Figure: the SET (Top Node, Stmt Sequence, do-while Statement, while Statement, try Statement, if Stmt, Stmt Seq nodes) beside the generated source]

    public int exampleMethod(int i0, int i1)
    {
        label_0:
        do
        {
            while (i0 < 10)
            {
                try
                {
                    if (i0 / i1 != 2)
                    {
                        i0 = i0 + 1;
                    }
                    else
                    {
                        break label_0;
                    }
                }
                catch (ArithmeticException e)
                {
                    System.out.println("div by 0");
                }
            }
            i1 = i1 + -1;
        }
        while (i1 > 0);
        return i0;
    }
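As a quick recompile-and-run check, the decompiled method can be dropped into a class and exercised. This is only a sketch: the wrapper class name, the static modifier and the chosen inputs are assumptions added for the demonstration.

```java
// Wrapper class and "static" are additions for this demo; the method body is
// the decompiled source shown on the slide.
class ExampleClass {

    static int exampleMethod(int i0, int i1) {
        label_0:
        do {
            while (i0 < 10) {
                try {
                    if (i0 / i1 != 2) {
                        i0 = i0 + 1;
                    } else {
                        break label_0;   // leaves both loops at once
                    }
                } catch (ArithmeticException e) {
                    System.out.println("div by 0");
                }
            }
            i1 = i1 + -1;
        } while (i1 > 0);
        return i0;
    }

    public static void main(String[] args) {
        // Inputs chosen so that i1 is never 0 while i0 < 10:
        // with i1 == 0 and i0 < 10 the inner loop never terminates.
        System.out.println(exampleMethod(4, 2));   // quotient is 2 immediately
        System.out.println(exampleMethod(0, 2));   // counts up until i0 / 2 == 2
        System.out.println(exampleMethod(0, 5));   // quotient never reaches 2
    }
}
```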
Slide 11
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the initial SET, which holds only the Top Node]
Slide 12
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
Control Flow Graph - Informal Definitions:
1. Dominators. If every path from the start of the program to “B”
has to pass through “A”, then “A” dominates “B”.
(Note that dominance is transitive.)
2. Strongly Connected Component. A set of nodes in the
control flow graph such that every node is reachable from
every other node. Multiple hops to reach a node are allowed.
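Both definitions can be made concrete with a short sketch. It uses the textbook iterative dominator computation and a plain reachability check, which are not necessarily what Dava uses; the five-node graph, the class name and the method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Deque;
import java.util.List;

class CfgBasics {

    // dom(entry) = {entry}; dom(v) = {v} plus the intersection of dom(p)
    // over all predecessors p of v. Iterate until nothing changes.
    static List<BitSet> dominators(int[][] succ, int entry) {
        int n = succ.length;
        List<List<Integer>> pred = new ArrayList<>();
        for (int i = 0; i < n; i++) pred.add(new ArrayList<>());
        for (int u = 0; u < n; u++) {
            for (int v : succ[u]) pred.get(v).add(u);
        }
        List<BitSet> dom = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            BitSet b = new BitSet(n);
            if (i == entry) b.set(entry); else b.set(0, n);  // start from "all nodes"
            dom.add(b);
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int v = 0; v < n; v++) {
                if (v == entry) continue;
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : pred.get(v)) next.and(dom.get(p));
                next.set(v);
                if (!next.equals(dom.get(v))) {
                    dom.set(v, next);
                    changed = true;
                }
            }
        }
        return dom;
    }

    // Nodes reachable from "from" in one or more hops.
    static BitSet reachable(int[][] succ, int from) {
        BitSet seen = new BitSet(succ.length);
        Deque<Integer> work = new ArrayDeque<>();
        work.push(from);
        while (!work.isEmpty()) {
            int u = work.pop();
            for (int v : succ[u]) {
                if (!seen.get(v)) {
                    seen.set(v);
                    work.push(v);
                }
            }
        }
        return seen;
    }

    // True if every listed node can reach every other listed node.
    static boolean stronglyConnected(int[][] succ, int... nodes) {
        for (int u : nodes) {
            BitSet r = reachable(succ, u);
            for (int v : nodes) {
                if (v != u && !r.get(v)) return false;
            }
        }
        return true;
    }

    // Hypothetical CFG: 0 -> 1, 1 -> 2, 1 -> 4, 2 -> 3, 3 -> 1 (a loop 1-2-3).
    static final int[][] SAMPLE = { {1}, {2, 4}, {3}, {1}, {} };

    public static void main(String[] args) {
        System.out.println(dominators(SAMPLE, 0).get(4));     // dominators of node 4
        System.out.println(stronglyConnected(SAMPLE, 1, 2, 3));
        System.out.println(stronglyConnected(SAMPLE, 1, 2, 4));
    }
}
```

In the sample graph, node 1 dominates the whole loop and nodes 1, 2, 3 are mutually reachable: a cycle whose entry point dominates its body is the shape the cycle stage of the walk-through looks for.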
Slide 13
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the initial SET, which holds only the Top Node]
Slide 14
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node, plus the node group k, i, d, e, f, g, h, j]
Slide 15
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j)]
Slide 16
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j)]
Slide 17
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); plus the node group i, d, e, f, g, h]
Slide 18
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h)]
Slide 19
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h)]
Slide 20
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h)]
Slide 21
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); if Stmt (d, e)]
Slide 22
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); if Stmt (d, e)]
Slide 23
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); if Stmt (d, e)]
Slide 24
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); if Stmt (d, e)]
Slide 25
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); if Stmt (d, e)]
Slide 26
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); try Statement; if Stmt (d, e)]
Slide 27
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; do-while Statement (k, i, d, e, f, g, h, j); while Statement (i, d, e, f, g, h); try Statement; if Stmt (d, e)]
Slide 28
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; Stmt Sequence; do-while Statement; while Statement; try Statement; if Stmt; and several Stmt Seq nodes]
Slide 29
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; Stmt Sequence; do-while Statement; while Statement; try Statement; if Stmt; and several Stmt Seq nodes]
Slide 30
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks
[Figure: control flow graph (nodes a to l) beside the SET: Top Node; Stmt Sequence; do-while Statement; while Statement; try Statement; if Stmt; and several Stmt Seq nodes]
Slide 31
Walk through Example.
Stages: Cycles, if & switch, Exceptions, Stmt Seq., breaks

    public int exampleMethod(int i0, int i1)
    {
        label_0:
        do
        {
            while (i0 < 10)
            {
                try
                {
                    if (i0 / i1 != 2)
                    {
                        i0 = i0 + 1;
                    }
                    else
                    {
                        break label_0;
                    }
                }
                catch (ArithmeticException e)
                {
                    System.out.println("div by 0");
                }
            }
            i1 = i1 + -1;
        }
        while (i1 > 0);
        return i0;
    }

[Figure: the generated source beside the finished SET: Top Node; Stmt Sequence; do-while Statement; while Statement; try Statement; if Stmt; and several Stmt Seq nodes]
Slide 32
Advanced issues.
1. Multi-entry point loops
2. Exceptions
3. Synchronized blocks
4. Arbitrary labeled blocks
5. Others …
Slide 33
Advanced issues.
Multi-entry point loops
[Figure: two control flow graphs over nodes a, b, c, d]
Slide 34
Advanced issues.
Multi-entry point loops
[Figure: two control flow graphs over nodes a, b, c, d; one of them also contains a node g]
Slide 35
Advanced issues.
Multi-entry point loops
[Figure: a control flow graph over nodes a, b, c, d beside one that also contains nodes e, f, g]
Slide 36
Advanced issues.
Multi-entry point loops
[Figure: control flow graph over nodes a, b, c, d, e, f, g]
Slide 37
Advanced issues.
Multi-entry point loops
[Figure: two control flow graphs over nodes a, b, c, d, e, f, g]
Slide 38
Advanced issues.
Exceptions
1. The “try” block may be non-contiguous.
2. The exception’s SET node may not nest.
Solution: Divide the “try” block into as many parts as
needed, such that every part is contiguous and nests well
in the SET.
Caveat: Every “try” block must have its own unique catch
clause. Therefore, at every divide, the catch clause must
be cloned.
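A hand-written sketch of what the split looks like at the source level. It assumes a protected region with an unprotected statement in the middle and a handler that returns, so that both clones behave like the single shared handler; the class, method and variable names are made up for illustration.

```java
class SplitTryDemo {

    // One bytecode handler covered two separate ranges. In Java source each
    // range becomes its own try block, and the catch clause is cloned.
    static String run(int first, int second) {
        int x = 0;
        try {                                // part 1 of the protected region
            x = 10 / first;
        } catch (ArithmeticException e) {    // handler, copy 1
            return "div by 0";
        }

        x = x + 1;                           // NOT covered by the handler

        try {                                // part 2 of the protected region
            x = x / second;
        } catch (ArithmeticException e) {    // handler, copy 2 (the clone)
            return "div by 0";
        }
        return "ok " + x;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 3));   // no exception
        System.out.println(run(0, 3));   // caught by copy 1
        System.out.println(run(2, 0));   // caught by copy 2
    }
}
```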
Slide 39
Advanced issues.
Synchronized Blocks
Java provides object locking and monitors with
synchronized() blocks.
To detect, we need a contiguous sub-graph in the CFG that is:
1. Entered by monitorenter, and exited by monitorexit
instructions.
2. Covered by an exception handler that handles all
exceptions. Further, the exception handler must call
monitorexit and rethrow the original exception.
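The pattern described above is what a Java compiler emits for an ordinary synchronized block, and its effect can be observed from source. A small sketch, with class and method names chosen for illustration: the monitor is held inside the block and has been released again after an exception leaves it.

```java
class SyncDemo {

    // Returns { held inside the block, held after the exception escaped }.
    static boolean[] probe() {
        Object lock = new Object();
        boolean heldInside = false;
        try {
            synchronized (lock) {            // compiled to monitorenter
                heldInside = Thread.holdsLock(lock);
                // Leaving abruptly: the compiler-generated catch-all handler
                // runs monitorexit and rethrows the original exception.
                throw new IllegalStateException("leave the block abruptly");
            }
        } catch (IllegalStateException e) {
            // the rethrown exception arrives here, after the monitor was released
        }
        boolean heldAfter = Thread.holdsLock(lock);
        return new boolean[] { heldInside, heldAfter };
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("held inside: " + r[0] + ", held after: " + r[1]);
    }
}
```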
Slide 40
Advanced issues.
Synchronized Blocks
Sometimes synchronized() blocks cannot represent all
monitorenter and monitorexit instructions.
monitorenter a
monitorenter b
monitorexit a
monitorexit b
Slide 41
Advanced issues.
Synchronized Blocks
Sometimes synchronized() blocks cannot represent all
monitorenter and monitorexit instructions.

    monitorenter a
    monitorenter b
    monitorexit a
    monitorexit b

Solution:
• Create monitor library in Java.
• Replace monitor instructions with method calls to the monitor library.
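One possible shape for such a library, as a sketch only. The slides do not show Dava's monitor library, so the class MonitorLib and its enter, exit and holds methods are hypothetical. This version keeps one ReentrantLock per object identity; it does not interoperate with real JVM monitors (synchronized, wait, notify) and never discards unused locks.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical monitor library: explicit enter/exit calls replace the
// monitorenter/monitorexit instructions, so they need not be nested.
final class MonitorLib {
    private static final Map<Object, ReentrantLock> LOCKS = new IdentityHashMap<>();

    private MonitorLib() { }

    // One lock per object identity; the table itself is guarded by the class lock.
    private static synchronized ReentrantLock lockFor(Object o) {
        return LOCKS.computeIfAbsent(o, k -> new ReentrantLock());
    }

    static void enter(Object o) {
        lockFor(o).lock();
    }

    static void exit(Object o) {
        ReentrantLock lock = lockFor(o);
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalMonitorStateException("monitor not owned by this thread");
        }
        lock.unlock();
    }

    static boolean holds(Object o) {
        return lockFor(o).isHeldByCurrentThread();
    }

    // The interleaving from the slide: a is released while b is still held.
    static String demo() {
        Object a = new Object();
        Object b = new Object();
        StringBuilder sb = new StringBuilder();
        enter(a);
        enter(b);
        exit(a);
        sb.append(holds(a)).append(',').append(holds(b));
        exit(b);
        sb.append(',').append(holds(b));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Nested synchronized blocks would force b to be released before a; with method calls the decompiled source can keep the original instruction order.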
Slide 42
Advanced issues.
Arbitrary Labeled Blocks
[Figure: Original SET with nodes A, B, C, E, D]
Slide 43
Advanced issues.
Arbitrary Labeled Blocks
[Figure: Original SET (nodes A, B, C, E, D) beside the Modified SET, in which a new labeled block has been inserted]
Slide 44
Testing.
Types of Testing
1. Simple stress testing suite.
2. Decompilation and recompilation of small and midsized applications (up to 10,000 lines of code)
   • Java sourced applications
   • Other sources ...
      • Ada - JGNAT
      • Eiffel - GNU SmallEiffel
      • Haskell - D. Wakeling’s Haskell to JVM-code compiler
      • SML - MLJ
Slide 45
Results.

                  Decompiles   Recompiles and Runs   Lines / second*
Stress Testers    100%         100%                  -
Java              100%         100%                  235
SML               97%          100%                  93
Ada               100%         94%                   78
Eiffel            100%         100%                  145
Haskell           93%          93%                   272

Comments
• Stress Testers: Tests each of the phases individually
• Occasional problems with nested finally clauses
• Occasional problems with multi-entry point loops
• Interface static fields should be initialized
• References to phantom fields

* based on tests with a mobile Pentium III with 128M RAM
Slide 46
Conclusions and future work.
• The Structure Encapsulation Tree performs very well.
• It is useful to find control flow constructs in an order determined by their features rather than their locations.
• Restructuring speed is nearly linear in input program size.
• Results are readable and recompilable regardless of source language.
• Improvements may include recognizing opportunities to create commonly used programming idioms.
It’s available now! You can download it today from:
http://www.sable.mcgill.ca/~jerome/public
(Email: [jerome, hendren]@sable.mcgill.ca)