Data Structure Analysis: A Fast and Scalable Context-Sensitive Heap Analysis
Authors: Chris Lattner, Vikram Adve
University of Illinois at Urbana-Champaign
Presenter: Cheng Li
Static Program Analysis Seminar

Outline
● Background
● Data Structure Analysis Overview
● Data Structure Graph
● Construction Algorithm
● Summary

Background
● Data Structure Analysis enables
  – analyses of logical data structures
  – transformations of logical data structures
● Related analysis tools
  – Alias analysis
  – Pointer analysis
  – Shape analysis

Limitations of Existing Tools
● Rarely apply to entire data structures
  – lists, heaps or graphs
● Do not support type-unsafe programs
  – incompatible accesses to an object
● Rarely handle recursion
● Do not correctly handle function pointers
● Not available for incomplete programs
● Scale poorly to programs with a large number of global variables
● Not practical for use in a commercial compiler

Shape vs. Data Structure Analysis
● Representation of a list with 4 elements
[Figure: the list as represented by shape analysis and by Data Structure Analysis; labels: X, "List: H", "List *", "int"]

Desired Analysis Tool
● Full context-sensitivity:
  – Identifying disjoint instances of data structures
  – Keeping track of the calling path
  – i.e., objects are named by their entire acyclic call path
[Figure: functions A, B and C linked by function calls, each annotated "Create a list L"; the resulting objects are named A.L, A.B.L and A.B.C.L]
● Field-sensitivity:
  – Identifying the internal connectivity pattern
  – Distinguishing different structure fields
[Figure: list L with elements 1, 2, 3 and 4: 1 points to 2, 2 points to 3, 3 points to 4]

Challenges
● Full context-sensitivity
  – The entire call path may be very large
  – Recursion is difficult
● Full field-sensitivity
  – Not efficient in non-type-safe languages

Overview of Data Structure Analysis Algorithm
● Compute a Data Structure Graph for each function in a program
● Identify memory objects
● Capture connectivity patterns
  – Caller and callee
[Figure: function A ("Create a list L") calls function B ("Create a list L")]
  – With respect to B, A is the caller
  – With respect to A, B is the callee

Overview of Data Structure Analysis Algorithm
● Field-sensitivity
  – Type-safe until an inconsistency appears
● Full context-sensitivity
  – Unification-based approach
● Efficient and scalable

Favorite Code
[Figure: slide content not recovered]

DSG Notation
[Figures: slide content not recovered]

Data Structure Graph
● A DS graph is built for each function F in a program
● It summarizes the memory objects with respect to:
  – Inter-procedural: F is the caller
  – Intra-procedural: F is the callee
● DS node:
  – Represents a set of distinct memory objects
  – e.g., lists, graphs, etc.

Data Structure Graph
● Assumption:
  – A single type system with integer, float, pointer, structure, array, and function types
● For any type T, fields(T) returns the set of field names of T
  – An array of size k can be represented either as a structure with k fields or as a single field
  – An array of unknown size is represented as a single field

Data Structure Graph
● Virtual registers
  – Represent scalar variables: integer, float and pointer
  – All arithmetic operations are performed on virtual registers
● Memory locations
  – Heap objects: malloc
  – Stack objects: alloca
  – Global objects: global variables or functions

Data Structure Graph
● The DS graph for a function F is a finite directed graph represented as a tuple DSG(F) = <N, E, Ev, C>, where:
  – N : a set of DS nodes
    ● variables, data structures, functions
  – E : a set of edges
    ● from one field to another field
    ● <ns, fs> → <nd, fd>
      – ns, nd are DS nodes
      – fs in fields(T(ns)), fd in fields(T(nd))
  – Ev : a set of edges
    ● from a virtual register v to the target field <n, f> pointed to by v
  – C : a set of call nodes, each a tuple (r, f, a1, a2, ..., ak)
    ● r : return value
    ● f : function being called
    ● a1, ..., ak : arguments of the call

Graph Nodes and Fields
● A DS node n represents a set of memory objects and is associated with:
  – T(n) : type information
  – G(n) : a set of global objects
  – flags(n) : property flags
    ● H, S, G, U, M, R, C and O
  – fields(T(n)) : the set of fields of the type T(n)

Flags for each node
● Memory allocation classes
  – H : Heap objects allocated by malloc
  – S : Stack objects allocated by alloca
  – G : Global variables or functions
  – U : Unknown objects with incomplete information
  – C : nodes that have been analyzed completely

Flags for each node
● Heap, Stack, Global, Unknown examples
    typedef struct list { struct list *Next; int Data; } list;
    int size = 10;
    int F1(list *L, void (*F)(int *, int)) {
        F(&L->Data, size);
    }
    int main() {
        list *X = malloc(sizeof(list));
        list *Y = alloca(sizeof(list));
    }
[Figure: X points to node "List: H" and Y points to node "List: S", each with fields "List *" and "int"; size is node "int: G"; F points to node "Void: U"]

Flags for each node
● Modify (M) and Read (R)
    void Next(list *L) { L = L->Next; }
    void add(int *X) { (*X) += size; }
[Figure: X points to a node flagged MR; L points to node "List: R" with fields "List *" and "int"]

Type Safe vs. Unsafe
● Type-safe language
  – All accesses to all objects at node n use a consistent type t
  – T(n) = t
● Type-unsafe: challenging
  – Incompatible types
  – Field-sensitivity is expensive
● Solution:
  – Collapse the node if an incompatible type occurs

Type Safe vs. Unsafe
● MergeCells(<n1,f1>, <n2,f2>)
  – Merge two cells pointing to a field.
  – Two cases:
    ● Compatible accesses: n1 = n2, f1 = f2
      – Merge compatible cells
        ● Union the flags of n1 into the flags of n2
        ● Merge the out-edges of <n1, f> with the corresponding fields of n2
        ● Move the in-edges of n1 to the corresponding fields of n2
        ● Discard n1
        ● Return <n2, f2>
    ● Incompatible accesses: n1 != n2 or f1 != f2
      – Collapse n2
    int *a = malloc(10 * sizeof(int));
    ...
    void *b = a;
    b = 0;
[Figure annotation: collapse a]

Flags for each node
● Collapse (O): a node is collapsed when a type-safety violation is found
● Collapse(node a):
  a) represent a by a single field
  b) reset the type T(a) to void*
  c) flags(a) = flags(a) ∪ {O}

Outline
● Background
● Data Structure Analysis Overview
● Data Structure Graph
● Construction Algorithm
  – Local analysis
  – Bottom-Up
  – Top-Down
● Summary

Construction Algorithm
● DS graphs are created and refined in three steps:
  – Local analysis phase:
    ● Construct the DS graph from intra-procedural connectivity
  – Global analysis phases:
    ● Bottom-Up and Top-Down
    ● Refine the DS graph using inter-procedural connectivity
    ● Caller and callee

Local Analysis Phase: Example
    void addGToList(list *L) { do_all(L, addG); }
[Figure: local graph for the function addGToList: a call node with fields r and f; node "Void (list *L, void (int *)*): GC" for do_all; node "void (int *): G" for addG; virtual register L]

Local Analysis Phase
● Construct a rough DS graph for each function using only local information
  – Start with an empty graph
  – Make nodes for
    ● virtual registers (local variables)
    ● global variables
    ● data structures
  – Create a call site (an entry in C) for each function call
  – Merge cells pointing to the same objects

Local Graph
    int Global = 10;
    void addG(int *X) { (*X) += Global; }
    void do_all(list *L, void (*FP)(int *)) {
        do {
            FP(&L->Data);
            L = L->Next;
        } while (L);
    }
    void addGToList(list *L) { do_all(L, addG); }
    list *makeList(int Num) {
        list *New = malloc(sizeof(list));
        New->Next = Num ? makeList(Num - 1) : 0;
        New->Data = Num;
        return New;
    }
    int main() {
        list *X = makeList(10);
        addGToList(X);
        Global = 20;
    }
[Figure: local graph for function addG: global variable Global is node "int: MR"; virtual register X points to node "int: MR"]

Local Graph
[Figure: local graph for do_all under construction]
● Virtual registers: L points to node "List: R" with fields "List *" and "int"; &L->Data points to node "int: R"
● Function pointer: FP points to a "Void" node
● Call site: a call node with fields r, f and a; a points to node "int: R"; f points to a "Void" node

Local Graph: Merge
[Figure sequence: local graph for do_all being constructed. The "Void" nodes reached through f and FP are merged, and the "int: R" nodes reached through &L->Data and the call argument a are merged into the int field of node "List: R".]
[Figure: final local graph for do_all: call node with fields r, f and a; FP and f point to the "Void" node; L points to node "List: R" with fields "List *" and "int"; a and &L->Data point to its int field]

Local Graph
    void addGToList(list *L) { do_all(L, addG); }
[Figure: local graph for addGToList: a call node with fields r and f; node "Void (list *L, void (int *)*): GC" for do_all; node "void (int *): G" for addG; virtual register L]
● Limitation: the two callee functions are not resolved, since the information is incomplete.
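The cell merging used while a local graph is built (MergeCells, with Collapse as the fallback) might be sketched in C as follows. This is only an illustrative model, not the authors' implementation: the names DSNode, Cell, resolve, collapse, fits and merge_cells, the MAX_FIELDS limit, and the three-value FieldType are all assumptions made for the sketch. Compatibility is approximated by checking that the field types of n1 line up with those of n2 when f1 is overlaid on f2, and "discard n1" is modeled with a forwarding pointer so that in-edges of n1 follow it to n2.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FIELDS 4   /* hypothetical limit, for the sketch only */

enum { FLAG_H = 1, FLAG_S = 2, FLAG_G = 4, FLAG_U = 8,
       FLAG_M = 16, FLAG_R = 32, FLAG_C = 64, FLAG_O = 128 };
typedef enum { T_INT, T_PTR, T_VOID } FieldType;

typedef struct DSNode DSNode;
typedef struct { DSNode *node; int field; } Cell;   /* a <node, field> pair */

struct DSNode {                   /* nodes are expected to be zero-initialized */
    DSNode   *forward;            /* set once this node has been merged away */
    int       delta;              /* field i now lives at forward's field i + delta */
    unsigned  flags;
    int       num_fields;
    FieldType type[MAX_FIELDS];
    Cell      out[MAX_FIELDS];    /* out-edge of each field; node == NULL if none */
};

Cell merge_cells(Cell c1, Cell c2);

/* Follow forwarding links to the node that currently represents the cell. */
Cell resolve(Cell c) {
    assert(c.node != NULL);
    while (c.node->forward) {
        c.field += c.node->delta;
        c.node = c.node->forward;
    }
    if (c.node->flags & FLAG_O) c.field = 0;   /* collapsed nodes have one field */
    return c;
}

/* Collapse: a single void field, flag O set, all out-edges unified. */
void collapse(DSNode *n) {
    Cell edges[MAX_FIELDS];
    int count = 0;
    if (n->flags & FLAG_O) return;
    for (int i = 0; i < n->num_fields; i++) {
        if (n->out[i].node) edges[count++] = n->out[i];
        n->out[i].node = NULL;
    }
    n->flags |= FLAG_O;
    n->num_fields = 1;
    n->type[0] = T_VOID;
    if (count > 0) n->out[0] = edges[0];
    for (int j = 1; j < count; j++) merge_cells(edges[j], edges[0]);
}

/* Can n1 be overlaid on n2 so that field f1 lands on field f2? */
static int fits(const DSNode *n1, int f1, const DSNode *n2, int f2) {
    for (int i = 0; i < n1->num_fields; i++) {
        int j = i + (f2 - f1);
        if (j < 0 || j >= n2->num_fields || n1->type[i] != n2->type[j]) return 0;
    }
    return 1;
}

Cell merge_cells(Cell c1, Cell c2) {
    c1 = resolve(c1);
    c2 = resolve(c2);
    if (c1.node == c2.node) {                  /* same object already */
        if (c1.field != c2.field) collapse(c2.node);
        return resolve(c2);
    }
    if (!fits(c1.node, c1.field, c2.node, c2.field)) {
        collapse(c1.node);                     /* incompatible accesses */
        collapse(c2.node);
        c1 = resolve(c1);
        c2 = resolve(c2);
        if (c1.node == c2.node) return c2;
    }
    DSNode *n1 = c1.node, *n2 = c2.node;
    n2->flags |= n1->flags;                    /* union flags of n1 into n2 */
    n1->forward = n2;                          /* discard n1, redirect in-edges */
    n1->delta = c2.field - c1.field;
    for (int i = 0; i < MAX_FIELDS; i++) {     /* merge out-edges field by field */
        if (!n1->out[i].node) continue;
        Cell dst = resolve((Cell){ n1, i });
        if (!dst.node->out[dst.field].node) dst.node->out[dst.field] = n1->out[i];
        else merge_cells(n1->out[i], dst.node->out[dst.field]);
    }
    return resolve(c2);
}
```

Under these assumptions, merging the "int: R" node reached through the call argument with cell <List, Data> leaves the list node field-sensitive, whereas merging an int node onto the pointer field of the list node collapses it and sets the O flag.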
Global Analysis Phase
● BU analysis
  – Eliminate incomplete information in the DS graph for a function F
  – Inline operation
    ● Clone the callees' graphs into F
    ● Merge arguments
  – Four cases:
    ● Simplest one: no recursion, no function pointers
    ● Function pointers, but no recursion
    ● Recursion without function pointers
    ● Recursion with function pointers

No function pointers, no recursion
Code:
    void do_add(int *a) { (*a)++; }
    void add(int *b) { do_add(b); }
[Figure: do_add local graph: *a is node "int: MR"]
[Figure: add local graph: call node with fields r, f and arg; f points to "Void do_add(int *a)"; b and arg point to node "int: R"]
BU analysis:
● Copy (inline) the callee's graph into the caller's graph
● Merge formal arguments with actual arguments
[Figure: add graph after inlining: call node whose f field points to "Void do_add(int *a): C"; b and arg point to node "int: MR"]
[Figure: add BU graph: b points to node "int: MR"]

Function pointers, no recursion
Code:
    int Global = 10;
    void addG(int *X) { (*X) += Global; }
    void do_all(list *L, void (*FP)(int *)) {
        do {
            FP(&L->Data);
            L = L->Next;
        } while (L);
    }
    void addGToList(list *L) { do_all(L, addG); }
BU phase:
● Inline through all function pointers whose targets are known
● The TD phase handles all unknown function pointers
Examples:
● FP cannot be resolved in do_all, because there is no caller information.
● However, addG can be resolved in addGToList, since addG is known.
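The inline operation of the BU phase (clone the callee's graph into the caller, then merge formal with actual arguments) could be sketched in C roughly as below. This is a deliberately reduced model and not the algorithm as implemented by the authors: nodes carry only M/R flags, there are no fields, edges or call nodes, and merging is done with a plain union-find. The names Graph, graph_add, graph_find, graph_merge and inline_call, and the MAX_NODES limit, are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 32               /* hypothetical capacity for the sketch */
enum { FLAG_M = 1, FLAG_R = 2 };   /* Modified, Read */

typedef struct {
    int      parent[MAX_NODES];    /* union-find parent; a root is its own parent */
    unsigned flags[MAX_NODES];
    int      count;
} Graph;

/* Add a fresh node and return its index. */
int graph_add(Graph *g, unsigned flags) {
    assert(g->count < MAX_NODES);
    g->parent[g->count] = g->count;
    g->flags[g->count] = flags;
    return g->count++;
}

/* Return the representative of the set that node n belongs to. */
int graph_find(Graph *g, int n) {
    while (g->parent[n] != n) n = g->parent[n];
    return n;
}

/* Unify two nodes; the survivor keeps the union of both flag sets. */
int graph_merge(Graph *g, int a, int b) {
    a = graph_find(g, a);
    b = graph_find(g, b);
    if (a != b) {
        g->flags[b] |= g->flags[a];
        g->parent[a] = b;
    }
    return b;
}

/* Inline one call: copy every callee node into the caller, then merge each
 * formal argument node (callee index) with its actual argument node (caller
 * index). The callee graph is left untouched, so it can be cloned again at
 * another call site without the two copies sharing nodes. */
void inline_call(Graph *caller, const Graph *callee,
                 const int *formals, const int *actuals, int nargs) {
    int base = caller->count;      /* callee node i becomes caller node base + i */
    for (int i = 0; i < callee->count; i++) {
        graph_add(caller, callee->flags[i]);
        caller->parent[base + i] = base + callee->parent[i];
    }
    for (int k = 0; k < nargs; k++)
        graph_merge(caller, base + formals[k], actuals[k]);
}
```

Applied to the do_add/add example, the callee's "int: MR" node is cloned into the graph of add and merged with the "int: R" node that b points to, so that node ends up flagged MR, as in the add BU graph. Cloning instead of sharing the callee's nodes is what keeps different call sites separate.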
Function pointers, no recursion
[Figure: do_all local graph: call node with fields r, f and a; FP and f point to the "Void" node; L points to node "List: R" with fields "List *" and "int"; a and &L->Data point to its int field]
[Figure: addGToList local graph: call node whose f field points to do_all, node "Void (list *L, void (int *)*): GC"; addG is node "void (int *): G"; virtual register L]
[Figure: addGToList BU graph after inlining do_all: call node with fields r, f and a; f points to addG, labeled "Void (int*: GC)"; L points to node "List: R" with fields "List *" and "int"]
[Figure: addG graph: X points to node "int: MR"; Global is node "int: MR"]
[Figure: final addGToList BU graph after inlining addG: L points to node "List: MR" with fields "List *" and "int"; Global is node "int: GR"]

Recursion
[Figure: call graph over functions A, B, C and D that contains recursion]
● Problem:
  – Strongly Connected Components (SCCs)
  – Infinite call paths in an SCC
● Solution:
  – Ignore context-sensitivity within an SCC
  – Merge arguments for each call

Recursion
    int Global = 10;
    list *makeList(int Num) {
        list *New = malloc(sizeof(list));
        New->Next = Num ?
                    makeList(Num - 1) : 0;
        New->Data = Num;
        return New;
    }
    int main() {
        list *X = makeList(10);
        list *Y = makeList(100);
        addGToList(X);
        Global = 20;
    }
[Figure: main BU graph: X and Y point to two separate "List: MRHC" nodes, each with fields "List *" and "int"; Global is node "int: GRMC"]

Global Analysis Phase
● Top-Down phase
  – Similar to the BU phase, but in reverse order
  – Inline the caller's graph into the callee's
  – Eliminate incomplete information

Global Graph
● Final result after three steps:
[Figure: main final graph: X and Y point to two separate "List: MRHC" nodes, each with fields "List *" and "int"; Global is node "int: GMRC"]

Summary
● Data Structure Analysis
  – Analyze and transform entire data structures
  – Identify disjoint memory objects
  – Full context-sensitivity and field-sensitivity
● Data Structure Graph
  – Definition
  – Construction
  – Examples

Reference
● [1] Chris Lattner, Vikram Adve, Data Structure Analysis: A Fast and Scalable Context-Sensitive Heap Analysis.

End
Thanks for your attention! Questions?