Survey
Pointer Analysis Survey. Rupesh Nasre. Aug 24, 2007.

Outline.
● The problem.
● Background.
● Representative papers.
● Discussion: trends, similarities, differences.
● Directions for research.

The problem.
Statically find out the groups of program variables, such that all variables in a group may point to the same memory block during the program execution.

Background (1 of 7).
● Static analysis.
➢ done on static representation of a program.
➢ does not require program execution.
➢ is conservative by definition.
● Dynamic analysis.
➢ done on traces of program executions.
➢ does not cover all possible behaviors.
➢ precise for a run of the program.

Background (2 of 7).
● Clients.
➢ program transformations that depend on pointer analysis.
➢ for instance, queries related to pointers and compiler optimizations.
➢ typically, query resolution time for clients is inversely proportional to pointer analysis time.

Background (3 of 7).
● Precision.
➢ a measure of correctness for getting the required information from pointer analysis.
➢ for pointer analysis, the required information is: whether two pointers are aliases or non-aliases.
➢ dynamic analysis is precise with respect to that execution.

Background (4 of 7).
● Efficiency.
➢ amount of time taken by an algorithm.
● Scalability.
➢ asymptotic time complexity of an algorithm.
An algorithm can be efficient, but not scalable.

Background (5 of 7).
● Flow-sensitivity.
➢ algorithm considers control flow in the program.
● Context-sensitivity.
➢ algorithm considers calling context of a function.
● Field-sensitivity.
➢ algorithm separates individual fields of an aggregate, from each other and from the aggregate itself.

Background (6 of 7).
● Unification-based.
➢ algorithm merges equivalence classes of variables in an assignment.
➢ less storage requirement.
➢ fast.
➢ low precision.

Background (7 of 7).
● Inclusion-based (or subset-based or constraint-based).
➢ algorithm processes assignments directionally and each symbol is represented by a single node.
➢ more storage requirement.
➢ slower.
➢ high precision.

Representative papers (1 of 4).
● Choi et al, Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993.
● Andersen, PhD Thesis, 1994.
● Burke et al, Flow-insensitive interprocedural alias analysis in the presence of pointers, LCPC 1995.
● Reps et al, Precise interprocedural dataflow analysis via graph reachability, POPL 1995.

Representative papers (2 of 4).
● Steensgaard, Points-to analysis in almost linear time, POPL 1996.
● Ghiya et al, Is it a tree, DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C, PLDI 1996.
● Hind et al, Which pointer analysis should I use?, ISSTA 2000.

Representative papers (3 of 4).
● Cheng et al, Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation, PLDI 2000.
● Liang et al, Evaluating the precision of static reference analysis using profiling, ISSTA 2002.
● Whaley et al, Cloning-based context-sensitive pointer alias analysis using binary decision diagrams, PLDI 2004.

Representative papers (4 of 4).
● Raman et al, Recursive data structure profiling, MSP 2005.
● Lattner et al, Making context sensitive points-to analysis with heap-cloning practical for the real world, PLDI 2007.

Discussion: similarities, differences.
● Flow-sensitive: Choi93, Ghiya96, Reps95, Whaley04.
● Context-sensitive: Andersen94, Cheng00, Ghiya96, Lattner07, Whaley04.
● Field-sensitive: Cheng00, Lattner07, Whaley04.
● Unification-based: Steensgaard96, Lattner07.
● Inclusion-based: Andersen94, Cheng00, Whaley04.

Discussion: trends (1 of 2).
● Recursion is handled using strongly-connected components.
● A recursive data structure is represented using a single representative node.
● Stack pointers are often treated in a different manner than heap pointers.
● For better precision, inclusion-based analyses are preferred. For better efficiency, unification-based analyses are preferred.

Discussion: trends (2 of 2).
● Flow-sensitivity does not improve precision to a significant extent: pointers are typically not reassigned, and when they are, they point to another part of the same data structure, which is represented as a whole by a single node.
● Graph algorithms typically involve three phases: intraprocedural, bottom-up, and top-down.
● A single level of context-sensitivity proves sufficiently precise and efficient.

Discussion.
● Most of the papers differ in the techniques used to solve the pointer analysis problem.
● Representation of alias information differs a lot across techniques.
➢ matrices: Ghiya96.
➢ graphs: Das00, Lattner07, Raman05, Reps95, Steensgaard96.
➢ access-paths: Cheng00.
➢ ordered binary decision diagrams: Whaley04.

Directions for research (1 of 4).
● Complex data structures.
➢ most algorithms do not handle them well.
➢ occur when large hash tables, dictionaries, symbol tables form the main data structure of a program.
➢ need to characterize complexity of a data structure.
➢ adaptive algorithm depending on the complexity.

Directions for research (2 of 4).
● Out-of-order execution for multithreaded programs.
➢ some research done for multithreaded programs.
➢ none of the papers talk about the effect of out-of-order execution of instructions on aliases in multithreaded programs.
➢ instructions may be reordered by compiler or hardware.

Directions for research (3 of 4).
● Combination of techniques.
➢ no single existing technique is best in all aspects.
➢ hybrid approaches are necessary.
➢ one way is to combine static pointer analysis with dynamic profile information.
➢ another way is to use an adaptive algorithm that internally uses different existing sub-algorithms.

Directions for research (4 of 4).
● Representation of alias information.
➢ history tells us that differences in the alias information representation have often led to new algorithms.
➢ research on finding novel ways to represent aliases can be an interesting area to explore.
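
The precision trade-off between unification-based and inclusion-based analyses described in the Background slides can be illustrated with a small sketch. This is not code from any of the surveyed papers. It is a toy, flow-insensitive model that assumes only two statement forms (`p = &x` and `p = q`); the statement encoding and the names `PROGRAM`, `inclusion_based`, `unification_based`, and `may_alias` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import count

# Hypothetical encoding of a tiny program, one tuple per statement:
#   ("addr", "p", "x") models  p = &x
#   ("copy", "p", "q") models  p = q
PROGRAM = [
    ("addr", "p", "x"),
    ("addr", "q", "y"),
    ("copy", "r", "p"),
    ("copy", "r", "q"),
]

def inclusion_based(stmts):
    """Andersen-style sketch: p = q adds the one-way constraint pts(q) ⊆ pts(p)."""
    pts = defaultdict(set)
    changed = True
    while changed:  # naive fixpoint iteration, adequate for a toy input
        changed = False
        for kind, lhs, rhs in stmts:
            new = {rhs} if kind == "addr" else pts[rhs]
            if not new <= pts[lhs]:
                pts[lhs] |= new
                changed = True
    return {v: set(s) for v, s in pts.items() if s}

def unification_based(stmts):
    """Steensgaard-style sketch: p = q merges the pointee classes of p and q."""
    parent = {}    # union-find forest over variables and placeholders
    pointee = {}   # class representative -> some member of the class it points to
    fresh = count()

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        a, b = find(a), find(b)
        if a == b:
            return
        parent[b] = a
        tb = pointee.pop(b, None)
        if tb is None:
            return
        if a in pointee:
            union(pointee[a], tb)  # merged classes must share one pointee class
        else:
            pointee[a] = tb

    def target(v):
        """Pointee class of v, creating a placeholder if v has none yet."""
        r = find(v)
        if r not in pointee:
            pointee[r] = find(("$tmp", next(fresh)))
        return find(pointee[r])

    for kind, lhs, rhs in stmts:
        union(target(lhs), rhs if kind == "addr" else target(rhs))

    names = [v for v in parent if isinstance(v, str)]  # skip placeholders
    result = {}
    for v in names:
        r = find(v)
        if r in pointee:
            t = find(pointee[r])
            members = {m for m in names if find(m) == t}
            if members:
                result[v] = members
    return result

def may_alias(pts, a, b):
    """Two pointers may alias if their points-to sets overlap."""
    return bool(pts.get(a, set()) & pts.get(b, set()))

if __name__ == "__main__":
    for name, analysis in [("inclusion", inclusion_based),
                           ("unification", unification_based)]:
        pts = analysis(PROGRAM)
        print(name, {v: sorted(s) for v, s in sorted(pts.items())},
              "p/q may alias:", may_alias(pts, "p", "q"))
```

On this input the inclusion-based sketch keeps `p` and `q` apart, because the copies into `r` only flow in one direction, whereas the unification-based sketch merges the targets of `p`, `q`, and `r` into one class and so reports `p` and `q` as possible aliases. Loads and stores through pointers, fields, and calling contexts are deliberately left out to keep the example short.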