Java Race Finder
Checking Java Programs for
Sequential Consistency
Tuba Yavuz-Kahveci
Fall 2013
Outline
 The Problem: Getting Multithreaded Java Programs Right
 Java Memory Model
 Our Solution: Java Race Finder
 What is model checking anyway?
 Representing Happens-before
 Heuristic-based Search
 Code Modification Suggestions
What is Sequential Consistency?
 Program statements are executed according to program
order
 Each thread’s statements are executed according to the
program order in that thread’s code
 Write atomicity
 Each read operation on a variable sees the most recent write
operation on that variable
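To make the two conditions concrete, here is a minimal sketch that enumerates every sequentially consistent interleaving of a small two-thread test (Thread 1: x = 1; r1 = y; Thread 2: y = 1; r2 = x). The test program and the class name ScInterleavings are assumptions of this sketch, not part of the slides. Under sequential consistency the outcome r1=0,r2=0 never appears; a relaxed model may allow it.

```java
import java.util.Set;
import java.util.TreeSet;

/** Enumerates all sequentially consistent interleavings of a two-thread test. */
class ScInterleavings {
    // State layout: {x, y, r1, r2}. Each thread has exactly two steps.
    // Thread 1: x = 1; r1 = y;     Thread 2: y = 1; r2 = x;
    static void step(int thread, int pc, int[] s) {
        if (thread == 0) {
            if (pc == 0) s[0] = 1; else s[2] = s[1];
        } else {
            if (pc == 0) s[1] = 1; else s[3] = s[0];
        }
    }

    // Depth-first exploration: at each point either thread may take its next step.
    static void explore(int pc1, int pc2, int[] s, Set<String> outcomes) {
        if (pc1 == 2 && pc2 == 2) {
            outcomes.add("r1=" + s[2] + ",r2=" + s[3]);
            return;
        }
        if (pc1 < 2) {
            int[] next = s.clone();
            step(0, pc1, next);
            explore(pc1 + 1, pc2, next, outcomes);
        }
        if (pc2 < 2) {
            int[] next = s.clone();
            step(1, pc2, next);
            explore(pc1, pc2 + 1, next, outcomes);
        }
    }

    static Set<String> outcomes() {
        Set<String> result = new TreeSet<>();
        explore(0, 0, new int[4], result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(outcomes());
    }
}
```

Every reachable outcome has at least one of r1, r2 equal to 1, because some write always precedes the other thread's read in a single global order.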
What is a Memory Model?
 Constrains the behavior of memory operations
 What value can a read operation see?
 Example memory models
 Sequential Consistency
 Easy to understand
 Relaxed Consistency Models
 Relaxation of
 Program order
 Write atomicity
Who Should Care?
 Programmers
 Understanding how to achieve sequential consistency, if
possible
 Reasoning about correctness
 Compiler writers
 Optimizing code within the restrictions of the memory model
Problem: Getting Multi-threaded Java
Programs Right
 Important Questions Any Java Programmer Should Ask
 Is my multithreaded program correctly synchronized?
 Beware!!! Sequential consistency is not guaranteed for incorrectly
synchronized Java programs!
 If my multithreaded program is not correctly synchronized, how
can I fix it?
 If my multithreaded program is not correctly synchronized for a
good reason, should I still be worried?
 Automated tool support is needed to answer these nontrivial
questions
An Example: Peterson’s Mutual Exclusion
Algorithm - Version 1
Initialization: flag[0] = flag[1] = turn = shared = 0 /* all non-volatile */
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
s4: shared++; /* critical section */
s5: flag[0] = 0;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;
s8: while (flag[0] == 1 && turn == 0) { /*spin*/ }
s9: shared++; /* critical section */
s10: flag[1] = 0;
What is the Java Memory Model (JMM)?
 A relaxed memory model
 Sequential consistency is guaranteed only for correctly
synchronized programs
 For programs without data races
 Incorrectly synchronized programs can show extra behavior
that is not sequentially consistent
 Still subject to some safety rules
Synchronization Rules in Java
 Some synchronization actions and their relationship in Java:
 Unlocking a monitor lock synchronizes with locking that monitor lock.
 Writing a volatile variable synchronizes with reading of that variable.
 Starting a thread synchronizes with the first action of that thread.
 Final action in a thread synchronizes with any action of a thread that detects termination of that thread.
 Initialization of a field synchronizes with the first access to the field in every thread.
 In general a release action synchronizes with a matching acquire
action.
Happens-Before Relation
 An action a1 happens-before action a2, a1 ≤hb a2, due to one
of the following:
 a1 comes before a2 according to program order: a1 ≤po a2.
 a1 synchronizes with a2: a1 ≤sw a2.
 a1 happens-before a’ that happens-before a2: Exists a’. a1 ≤hb a’
and a’ ≤hb a2 (transitivity).
 Happens-before, ≤hb = ( ≤po U ≤sw )+ , is a partial-order on all
actions in an execution.
Happens-before Consistency
 A read operation r can see results of a write operation w
provided that:
 r does not happen-before w: not (r ≤hb w).
 There is no intervening write operation: not (exists w’. w ≤hb w’ ≤hb r).
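The two definitions above can be sketched in a few lines: build ≤hb as the transitive closure of the program-order and synchronizes-with edges, then list the writes a read may see. The action numbering, the class name HappensBefore, and the small done/result scenario are assumptions of this sketch; it is not how JRF represents the relation.

```java
import java.util.ArrayList;
import java.util.List;

class HappensBefore {
    /** Transitive closure (Warshall) of the given po and sw edges over n actions. */
    static boolean[][] closure(int n, int[][] edges) {
        boolean[][] hb = new boolean[n][n];
        for (int[] e : edges) hb[e[0]][e[1]] = true;
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (hb[i][k] && hb[k][j]) hb[i][j] = true;
        return hb;
    }

    /** Writes to one variable that read r may see under happens-before consistency. */
    static List<Integer> visibleWrites(boolean[][] hb, int r, int[] writes) {
        List<Integer> visible = new ArrayList<>();
        for (int w : writes) {
            boolean ok = !hb[r][w];                      // r must not happen-before w
            for (int w2 : writes)                        // no intervening write w2
                if (w2 != w && hb[w][w2] && hb[w2][r]) ok = false;
            if (ok) visible.add(w);
        }
        return visible;
    }

    // Hypothetical actions: 0 = initial write of result, 1 = T1: result = compute(),
    // 2 = T1: done = true, 3 = T2: read done (sees true), 4 = T2: read result.
    static List<Integer> demo(boolean doneIsVolatile) {
        List<int[]> edges = new ArrayList<>();
        edges.add(new int[] {0, 1});   // initialization precedes Thread 1
        edges.add(new int[] {0, 3});   // initialization precedes Thread 2
        edges.add(new int[] {1, 2});   // program order in Thread 1
        edges.add(new int[] {3, 4});   // program order in Thread 2
        if (doneIsVolatile) edges.add(new int[] {2, 3});   // volatile write sw volatile read
        boolean[][] hb = closure(5, edges.toArray(new int[0][]));
        return visibleWrites(hb, 4, new int[] {0, 1});
    }
}
```

With a non-volatile done, the read of result may see either the initial write or Thread 1's write. Once the volatile edge is added, Thread 1's write intervenes and the initial write is no longer visible.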
Anatomy of a Data Race
 Definition: If two actions a1 and a2 from different threads access
the same memory location loc, the actions are not ordered by
happens-before and if one of the actions is a write, then there is a
data race on loc.
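 Example:
Initialization: boolean done = false; /* non-volatile */
Thread 1
result = compute();
done = true;
Thread 2
if (done)
// use result
[Figure: ≤hb edges run from the initialization to each thread and between consecutive statements of the same thread; no ≤hb edge connects the two threads.] Race on done!!!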
A Simple Fix
 A write to a volatile variable synchronizes with a read of that
variable.
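 Example:
Initialization: volatile boolean done = false;
Thread 1
result = compute();
done = true;
Thread 2
if (done)
// use result
[Figure: in addition to the program-order edges, done = true ≤hb if (done) when the read sees the volatile write, so result = compute() ≤hb // use result by transitivity.] Not in a race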
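A runnable version of this fix might look as follows. It is a sketch under assumptions: the class name VolatileFlag, the stand-in compute() returning 42, and the spin loop in the reader (the slide uses a single if (done)) are chosen here so that the outcome is deterministic.

```java
class VolatileFlag {
    static int result;               // plain field, published through done
    static volatile boolean done;    // volatile write synchronizes with volatile read

    static int compute() { return 42; }   // stand-in for the real computation

    /** Runs writer and reader once and returns the value of result the reader observed. */
    static int runOnce() {
        result = 0;
        done = false;
        final int[] seen = new int[1];
        Thread writer = new Thread(() -> {
            result = compute();
            done = true;                       // release
        });
        Thread reader = new Thread(() -> {
            while (!done) { Thread.yield(); }  // acquire: spins until it sees true
            seen[0] = result;                  // ordered after result = compute()
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

If done were not volatile, the reader could spin forever or observe a stale result; with volatile, the read of result is ordered after the write by happens-before.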
Our Solutions/Contributions
 Is my multi-threaded program correctly synchronized?
Kim K., Yavuz-Kahveci T., Sanders B. Precise Data Race Detection in a Relaxed Memory Model using Heuristic-based Model Checking [ASE Conf. 2009]
 If my multi-threaded program is not correctly synchronized, how
can I fix it?
Kim K., Yavuz-Kahveci T., Sanders B. JRF-E: Using Model Checking to give Advice
on Eliminating Memory Model-related Bugs [ASE Conf. 2010, ASE Journal 2012]
 If my program is not correctly synchronized for a good reason,
should I still be worried?
Jin H., Yavuz-Kahveci T., Sanders B. Java Path Relaxer: Extending JPF for JMM-aware model checking [JPF Workshop]
Jin H., Yavuz-Kahveci T., Sanders B. Java Memory Model-Aware Model Checking [TACAS 2012]
State/Snapshot of a Running Java Program
[Figure: inside the Java Virtual Machine, a state consists of the bytecode for the Java program, the values of static fields, the heap (objects), and the thread states.]
Model Checking Java Programs
[Figure: each state holds the values of static fields, the thread states, and the heap (objects); from a state, successor states are reached by scheduling Main Thread, Thread1, Thread2, Thread3, …]
Model Checking for Sequential Consistency
Java Path Finder (JPF)
• a model-checker for Java programs
• checks for general correctness properties
• assumes sequential consistency
• explores all possible thread interleavings
Java Race Finder (JRF)
• extends JPF’s state representation to detect data races
[Figure: flow from the multi-threaded Java application through JRF and a “Data Race?” decision (yes/no) to JPF.]
Our Approach for Detecting Data Races
Algorithm:
for each execution path EPj = <a1, a2, …, an> of program P do
    initialize happens-before relation
    for each action ai, i = 1 to n, do
        let loc be the memory location ai accesses
        if (it is not safe (i.e., there is a data race) for ai to access loc)
            generate DATA RACE error
        execute ai
        update happens-before relation
Representing Happens-Before
 We define an h-function that captures the happens-before
relation in an implicit way.
 h: SyncAddr ∪ Thread → 2^Addr.
 SyncAddr: Volatile variables and locks
 Addr: Non-volatile variables
 Is it safe for aj of thread ti to access loc?
 does h(ti ) contain loc?
 Which variables can be safely accessed if acquire on s (with a
matching release on s) is executed?
 h(s).
The h-function
 Initialization:
 At the beginning there is only the main thread:
 h0 = λz. if z = main then static(P) else ∅
 Update:
 Executing an action updates the h-function:
 action(t, x) h = h’
 h: h-function before executing action
 t: the thread the action belongs to
 x: synchronization variable (volatile or a lock)
 h’: the updated h function
Updating the h-function
action an by thread t | hn+1
write a volatile field v | release(t, v) hn
read a volatile field v | acquire(t, v) hn
lock the lock variable lck | acquire(t, lck) hn
unlock the lock variable lck | release(t, lck) hn
start thread t′ | release(t, t′) hn
join thread t′ | acquire(t, t′) hn
t′.isAlive() returns false | acquire(t, t′) hn
write a non-volatile field x | invalidate(t, x) hn
read a non-volatile field x | hn
instantiate an object containing non-volatile fields fields and volatile fields volatiles | new(t, fields, volatiles) hn
Action Semantics
 Variables that can be safely accessed from thread t are copied to the set for synchronization variable x.
release(t, x) h = h[x ↦ h(t) ∪ h(x)]
 Variables in the set of synchronization variable x can now be safely accessed by thread t.
acquire(t, x) h = h[t ↦ h(t) ∪ h(x)]
 Only thread t, which changed x, can safely access it.
invalidate(t, x) h = λz. if (t = z) then h(z) else h(z) \ {x}
 The non-volatile fields of the newly created object can be safely accessed by the thread that created it. The volatile fields are initialized to refer to empty sets.
new(t, fields, volatiles) h = λz. if (t = z) then h(t) ∪ fields else if (z ∈ volatiles) then {} else h(z)
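The four update rules translate almost directly into set operations. The following is a minimal sketch, assuming threads, synchronization variables, and memory locations are all named by strings; the class HFunction and its method names are hypothetical and do not reflect JRF's actual implementation. The demo replays a two-thread trace (Thread 1: x = 1; done = true; Thread 2: read done; read x) and reports whether the final read of x is flagged.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of h: (SyncAddr U Thread) -> set of safely accessible locations. */
class HFunction {
    private final Map<String, Set<String>> h = new HashMap<>();

    /** h0: only the main thread exists and may access the static fields. */
    HFunction(Set<String> staticFields) {
        h.put("main", new HashSet<>(staticFields));
    }

    private Set<String> get(String z) {
        return h.computeIfAbsent(z, k -> new HashSet<>());
    }

    /** release(t, x): h(x) becomes h(t) U h(x). */
    void release(String t, String x) { Set<String> src = get(t); get(x).addAll(src); }

    /** acquire(t, x): h(t) becomes h(t) U h(x). */
    void acquire(String t, String x) { Set<String> src = get(x); get(t).addAll(src); }

    /** invalidate(t, x): x is removed from every set except h(t). */
    void invalidate(String t, String x) {
        for (Map.Entry<String, Set<String>> e : h.entrySet())
            if (!e.getKey().equals(t)) e.getValue().remove(x);
    }

    /** new(t, fields, volatiles): creator gains the fields; volatiles start empty. */
    void allocate(String t, Set<String> fields, Set<String> volatiles) {
        get(t).addAll(fields);
        for (String v : volatiles) h.put(v, new HashSet<>());
    }

    /** Is it safe for thread t to access loc? */
    boolean isSafe(String t, String loc) { return get(t).contains(loc); }

    /** Returns true if Thread 2's read of x is a data race in the replayed trace. */
    static boolean readOfXRaces(boolean doneIsVolatile) {
        Set<String> statics = new HashSet<>(Arrays.asList("x"));
        if (!doneIsVolatile) statics.add("done");   // volatile done is a SyncAddr, not an Addr
        HFunction h = new HFunction(statics);
        h.release("main", "t1");                    // start thread t1
        h.release("main", "t2");                    // start thread t2
        h.invalidate("t1", "x");                    // t1: x = 1
        if (doneIsVolatile) {
            h.release("t1", "done");                // t1: done = true (volatile write)
            h.acquire("t2", "done");                // t2: reads done (volatile read)
        } else {
            h.invalidate("t1", "done");             // t1: done = true (plain write)
        }
        return !h.isSafe("t2", "x");                // t2: reads x
    }
}
```

With a plain done, x is missing from h(t2) and the read is reported; with a volatile done, the release/acquire pair copies x back into h(t2).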
Implementation of the h-function
How JRF extends JPF
Test Programs
Sources | # of examples | # of examples found to have data races
Textbook by Herlihy and Shavit | 65 | 19
Amino Concurrent Building Blocks Library | 10 | 9
Google Concurrent Data Structures Workshop | 12 | 10
Java Grande Forum Benchmark Suite | 10 | 6
Webserver Simulator (Student Projects) | 28 | 7
Time Overhead of JRF
[Figure: time in seconds (log scale, 1 to 1000) of JPF and JRF across the test programs.]
Space Overhead of JRF
[Figure: memory in MB (log scale, 1 to 1000) of JPF and JRF across the test programs.]
Finding the data race quickly
[Figure: state space of a program, rooted at the initial state, with three states marked “race”.]
Each path from initial state to a leaf state represents a separate execution.
Finding the data race using DFS
[Figure: the same state space with the counter-example path found by DFS.]
Finding the data race using BFS
[Figure: the same state space with the counter-example path found by BFS.]
Heuristic-Based Data Race Search
 Our goal is to reach a state that has a data race as quickly as possible.
 Assign a traversal priority to each program state based on
how close it may be to a racy state.
 Writes-First (WF): Prefer write statements to read statements
 Watch-Written (WW): Prefer access to memory locations
recently written by another thread
 Avoid Release/Acquire (ARA): Avoid scheduling threads that
perform proper synchronization.
 Acquire-First (AF): Prefer acquire operations that do not have a
matching release operation.
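One way to picture such a priority is a scored queue of candidate next steps. The sketch below is illustrative only: the Choice class, the numeric weights, and the way the heuristics are combined are assumptions made here, not JRF's actual scoring, and Acquire-First is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class HeuristicScheduler {
    /** A candidate next step of some thread (hypothetical, simplified). */
    static final class Choice {
        final String label;
        final boolean isWrite;          // Writes-First
        final boolean writtenByOther;   // Watch-Written
        final boolean isSync;           // Avoid Release/Acquire

        Choice(String label, boolean isWrite, boolean writtenByOther, boolean isSync) {
            this.label = label;
            this.isWrite = isWrite;
            this.writtenByOther = writtenByOther;
            this.isSync = isSync;
        }

        /** Higher score means "explore first"; the weights are arbitrary. */
        int priority() {
            int score = 0;
            if (isWrite) score += 2;
            if (writtenByOther) score += 3;
            if (isSync) score -= 4;
            return score;
        }
    }

    /** Returns the labels in the order a best-first search would try them. */
    static List<String> order(List<Choice> choices) {
        PriorityQueue<Choice> queue =
            new PriorityQueue<>((a, b) -> Integer.compare(b.priority(), a.priority()));
        queue.addAll(choices);
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) out.add(queue.poll().label);
        return out;
    }

    static List<String> demo() {
        return order(Arrays.asList(
            new Choice("read", false, false, false),
            new Choice("unlock", false, false, true),
            new Choice("write-watched", true, true, false),
            new Choice("write", true, false, false)));
    }
}
```

In the demo, a write to a location recently written by another thread is tried first and the unlock last, which mirrors the intent of WF, WW, and ARA.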
An Example: Peterson’s Mutual Exclusion
Algorithm - Version 1
Initialization: flag[0] = flag[1] = turn = shared = 0 /* all non-volatile */
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
s4: shared++; /* critical section */
s5: flag[0] = 0;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;
s8: while (flag[0] == 1 && turn == 0) { /*spin*/ }
s9: shared++; /* critical section */
s10: flag[1] = 0;
DFS vs Heuristic Search
DFS Search Path
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }
s4: shared++; /* critical section */
s5: flag[0] = 0;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;
Heuristic Search Path
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
Thread 2
s6: flag[1] = 1;
s7: turn = 0;
Both paths end at s7: Race! turn not in h(thread2)!
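Experimental Results: Heuristic Search
Code (lines of code) | Search | State | Length | Time (sec) | Memory (MB)
DisBarrier (232) | DFS | 109 | 109 | 4 | 53
DisBarrier (232) | Heuristic | 79 | 39 | 3 | 46
DisBarrier (232) | BFS | 2589 | 36 | 256 | 644
Moldyn (1252) | DFS | 2821 | 2821 | 231 | 579
Moldyn (1252) | Heuristic | 1896 | 950 | 257 | 518
Moldyn (1252) | BFS | 5127 | >574* | 1014 | 988
DEQueue (334) | DFS | 33 | 28 | 1 | 27
DEQueue (334) | Heuristic | 19 | 12 | 1 | 26
DEQueue (334) | BFS | 30 | 9 | 2 | 31
BinaryStaticTreeBarrier (1910) | DFS | 61 | 61 | 7 | 66
BinaryStaticTreeBarrier (1910) | Heuristic | 137 | 52 | 9 | 86
BinaryStaticTreeBarrier (1910) | BFS | 2275 | >18* | 2221 | 986
*: JPF ran out of memory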
What went wrong?
Thread 1
s1: flag[0] = 1;
s2: turn = 1;   (source statement: removes turn from h(thread2))
Thread 2
s6: flag[1] = 1;
s7: turn = 0;   (manifest statement: accesses turn when turn is not in h(thread2))
How to fix it?
 Data races are due to absence of happens-before relationship
 Suggest code modifications that will create happens-before
relationship between the source and manifest statements
 Change the variable to volatile
 Change the array to an atomic array
 Move the source statement to make use of existing happens-before relationships due to transitivity
 Perform the same synchronization
 Change another variable to volatile to create happens-before
relationships due to transitivity
Change to atomic array
Peterson’s ME Alg.: turn and flag are volatile
Thread 1
s1: flag[0] = 1;
s2: turn = 1;
Thread 2
s6: flag[1] = 1;   (source statement: removes flag[1] from h(thread1))
Thread 1
s3: while (flag[1] == 1 && turn == 1) { /*spin*/ }   (manifest statement: accesses flag[1] when flag[1] is not in h(thread1))
Suggestion: change flag to atomic array
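Declaring an array field volatile makes only the array reference volatile, not its elements, which is why the element accesses above still race. A sketch of the suggested fix, using java.util.concurrent.atomic.AtomicIntegerArray for flag and a volatile turn, is shown below. The class name PetersonFixed, the loop over rounds, and the Thread.yield() in the spin are assumptions added to make the example runnable.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class PetersonFixed {
    static final AtomicIntegerArray flag = new AtomicIntegerArray(2); // volatile element access
    static volatile int turn;
    static int shared;   // plain field, protected by the mutual exclusion protocol

    static void worker(int me, int rounds) {
        int other = 1 - me;
        for (int i = 0; i < rounds; i++) {
            flag.set(me, 1);                                   // s1 / s6
            turn = other;                                      // s2 / s7
            while (flag.get(other) == 1 && turn == other) {    // s3 / s8
                Thread.yield();                                // spin
            }
            shared++;                                          // critical section
            flag.set(me, 0);                                   // s5 / s10
        }
    }

    /** Runs both threads for the given number of rounds and returns shared. */
    static int run(int rounds) {
        shared = 0;
        turn = 0;
        flag.set(0, 0);
        flag.set(1, 0);
        Thread t1 = new Thread(() -> worker(0, rounds));
        Thread t2 = new Thread(() -> worker(1, rounds));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(run(10000));
    }
}
```

Because every access to flag and turn is now a synchronization action, the increments of shared are ordered by happens-before and none is lost.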
An Example for move source
Initialization: goFlag = false; volatile Data publish;
Thread 1
s1: r = new Data();
s2: publish = r;
s3: r.setDesc(e);
s4: goFlag = true;
• Updates published object after making the reference visible
• Compiler may reorder s3 and s4
Thread 2
t1: if (publish != null) {
t2: while (!goFlag);
t3: String s = publish.getDesc();
t4: assert(s.equals("e"));
}
• May use the published object when it is in an inconsistent state
Move source statement
publish is volatile; goFlag is not volatile
Thread 1
s1: r = new Data();
s2: publish = r;
s3: r.setDesc(e);
s4: goFlag = true;   (source statement: removes goFlag from h(thread2))
Thread 2
t1: if (publish != null) {
t2: while (!goFlag);   (manifest statement: accesses goFlag when goFlag is not in h(thread2))
Suggestion: move s4 before s2
An Example for perform the same synchronization operation
Initialization: int data; final Object lock = new Object();
Thread 1
s1: print (data);
Thread 2
t1: synchronized (lock) { /*lock*/
t2: data = 1;
t3: } /*unlock*/
• For every non-volatile variable v, acquireHistory(v) stores the set of safe accesses by thread t via a synchronization operation on s.
• Thread2’s safe access on data is noted as an example behavior.
Perform that synchronized block
data is not volatile
Thread 2
t1: synchronized (lock) { /*lock*/
t2: data = 1;   (source statement: removes data from h(thread1))
t3: } /*unlock*/
Thread 1
s0: synchronized (lock) { /*lock*/
s1: print (data);   (manifest statement: accesses data when data is not in h(thread1))
s2: } /*unlock*/
Suggestion: perform synchronized (lock) to access data (s0 and s2 are the added statements)
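In plain Java the suggestion amounts to wrapping the read in the same monitor the writer uses. A minimal sketch follows; the class name SameLock and the capture of the printed value in an array are assumptions made so the result can be checked.

```java
class SameLock {
    static int data;
    static final Object lock = new Object();

    /** Returns the value Thread 1 observed; 0 or 1 depending on scheduling, never a race. */
    static int run() {
        data = 0;
        final int[] printed = new int[1];
        Thread thread2 = new Thread(() -> {
            synchronized (lock) {        // t1: lock
                data = 1;                // t2
            }                            // t3: unlock
        });
        Thread thread1 = new Thread(() -> {
            synchronized (lock) {        // s0: added lock
                printed[0] = data;       // s1: stands in for print(data)
            }                            // s2: added unlock
        });
        thread2.start();
        thread1.start();
        try {
            thread2.join();
            thread1.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return printed[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The program is still nondeterministic (either thread may enter the block first), but both accesses to data are now ordered by the unlock/lock pair, so it is correctly synchronized.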
An Example for change another to volatile
Initialization: int x; boolean done = false; /* both non-volatile */
Thread 1
s1: x = 1;
s2: done = true;
Thread 2
t1: while (!done);
t2: assert(x == 1);
• Potential data races on both x and done.
• Should we really change both x and done to volatile?
• Can we get away with changing only one?
Change other to volatile
x and done are not volatile
Thread 1
s1: x = 1;   (source statement: removes x from h(thread2))
s2: done = true;
Thread 2
t1: while (!done);
t2: assert(x == 1);   (manifest statement: accesses x when x is not in h(thread2))
Suggestion: change done to volatile
JRF-E: Eliminating Data Races
 JRF is configured to produce a threshold # of counter-example paths and write them to a file
 JRF-E works on the output of JRF and analyzes the counter-example paths to generate code modification suggestions
 For each race
 reports intersection of suggestions on all the relevant counter-example paths
 For each specific code modification suggestion
 reports the frequency
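The two reporting steps can be sketched as a set intersection per race followed by a frequency count across races. The class AdviceAggregator, the suggestion strings, and the per-path suggestion sets in demo() are hypothetical; the sketch assumes every race has at least one counter-example path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class AdviceAggregator {
    /** Suggestions common to all counter-example paths of one race (paths must be non-empty). */
    static Set<String> commonAdvice(List<Set<String>> perPath) {
        Set<String> common = new LinkedHashSet<>(perPath.get(0));
        for (Set<String> suggestions : perPath) common.retainAll(suggestions);
        return common;
    }

    /** Number of races for which each suggestion was reported. */
    static Map<String, Integer> frequency(List<Set<String>> perRace) {
        Map<String, Integer> freq = new TreeMap<>();
        for (Set<String> advice : perRace)
            for (String a : advice) freq.merge(a, 1, Integer::sum);
        return freq;
    }

    static Map<String, Integer> demo() {
        String x = "change x to volatile";
        String done = "change done to volatile";
        // Hypothetical per-path suggestions for a race on x (two paths) and on done (one path).
        Set<String> pathA = new HashSet<>(Arrays.asList(x, done, "move source statement"));
        Set<String> pathB = new HashSet<>(Arrays.asList(x, done));
        Set<String> pathC = new HashSet<>(Arrays.asList(done));
        List<Set<String>> raceOnX = Arrays.asList(pathA, pathB);
        List<Set<String>> raceOnDone = Arrays.asList(pathC);
        return frequency(Arrays.asList(commonAdvice(raceOnX), commonAdvice(raceOnDone)));
    }
}
```

In the demo, the suggestion that appears on only one path is dropped by the intersection, and changing done to volatile is counted for both races.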
JRF-E RESULT
====================================================== data race #1
jrf.hbset.util.HBDataRaceException
. . .
______________________________________________________ analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:64 : "x = 1;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:74: "assert (x==1);"
Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________ advice from acquiring history
NONE
====================================================== data race #2
jrf.hbset.util.HBDataRaceException . . .
______________________________________________________ analyze counter example
data race source statement : "putstatic" at simple/SimpleRace.java:65 : "done = true;"
data race manifest statement : "getstatic" at simple/SimpleRace.java:73: "while(!done) { /*spin*/ }"
Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________ advice from acquiring history
NONE
______________________________________________________ frequency of advice
[1times] Change the field "simple.SimpleRace.x from INITIALIZER" to volatile.
[2times] Change the field "simple.SimpleRace.done from INITIALIZER" to volatile.
______________________________________________________ statistic
JRF takes 0:0:1 to find 2 equivalent races with 9 counterexample traces.
JRF-E takes 0:0:0 in 9 races analysis.
Callouts on the slide:
• How did it happen? (the source and manifest statements)
• How to fix it? (the suggested changes)
• How many times a suggestion has been made considering all the races? (frequency of advice)
• feedback on a single race (data race #1), feedback on another race (data race #2), feedback on all races (frequency of advice and statistic)
JRF-E - Analyzing threshold # of races
[Figure: time in seconds (log scale, 0.1 to 1000) for Threshold=1, Threshold=10, and Threshold=100.]
In all except MCSLock, the right suggestion was made when Threshold <= 10.
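Suggestions that worked
The slide's suggestion columns are Change (other) to volatile, Change to atomic array, and Use synchronized block; only the Webserver Sim. entry can be attributed to a column.
Program | Length | Threshold | # of Racy Fields | Suggestions that worked
DisBarrier | 40 | 1 | 2 | 1
LockFreeHashSet | 50 | 1 | 4 | 1
OptimisticList | 42 | 1 | 3 | 1
MCSLock | 65 | 100 | 3 | 2
LinearSenseBarrier | 61 | 10 | 2 | 1
Iterator_EBDeque | 11 | 1 | 1 | 1
Lufact | 19 | 1 | 1 | 1
Sor | 44 | 1 | 2 | 1
Webserver Sim. | 68 | 1 | 2 | 1 (Use synchronized block)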
Conclusion
 Even experts can benefit from tool support for detecting data
races.
 JRF can also analyze synchronization idioms that do not use
locking.
 Has become an official extension of Java Path Finder
 http://babelfish.arc.nasa.gov/trac/jpf
 JRF-E makes working suggestions for most of the data races
in our experiments.
 JRF-E can teach programmers the intricacies of the Java Memory Model.
Thank You
Questions?