Pointer Analysis Survey.
Rupesh Nasre.
Aug 24, 2007.
Outline.
● The problem.
● Background.
● Representative papers.
● Discussion: trends, similarities, differences.
● Directions for research.
The problem.
Statically find groups of program variables such that all variables in a group may point to the same memory block during program execution.
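As a rough illustration of this problem statement, the sketch below assumes some pointer analysis has already produced a points-to map, and derives the variable groups from it. The map contents and the helper names `alias_groups` and `may_alias` are hypothetical and not taken from any of the surveyed papers.

```python
from collections import defaultdict

def alias_groups(points_to):
    """Group pointer variables by the memory block they may point to.

    points_to maps each pointer variable to the set of memory blocks it
    may point to (assumed to be the output of some pointer analysis).
    """
    groups = defaultdict(set)
    for var, blocks in points_to.items():
        for block in blocks:
            groups[block].add(var)
    return dict(groups)

def may_alias(points_to, p, q):
    """Two pointers may alias if their points-to sets overlap."""
    return bool(points_to.get(p, set()) & points_to.get(q, set()))

# Hypothetical analysis result for four pointers and three memory blocks.
points_to = {"p": {"a"}, "q": {"a", "b"}, "r": {"b"}, "s": {"c"}}
groups = alias_groups(points_to)
```

A variable can belong to more than one group (here `q`), because the analysis only states what may happen at run time.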
Background (1 of 7).
● Static analysis.
  ➢ done on static representation of a program.
  ➢ does not require program execution.
  ➢ is conservative by definition.
● Dynamic analysis.
  ➢ done on traces of program executions.
  ➢ does not cover all possible behaviors.
  ➢ precise for a run of the program.
Background (2 of 7).
● Clients.
  ➢ program transformations that depend on pointer analysis.
  ➢ for instance, queries related to pointers and compiler optimizations.
  ➢ typically, query resolution time for clients is inversely related to the time spent on pointer analysis.
Background (3 of 7).
● Precision.
  ➢ a measure of correctness for getting the required information from pointer analysis.
  ➢ for pointer analysis, the required information is: whether two pointers are aliases or non-aliases.
  ➢ dynamic analysis is precise with respect to the execution it observes.
Background (4 of 7).
● Efficiency.
  ➢ amount of time taken by an algorithm.
● Scalability.
  ➢ asymptotic time complexity of an algorithm.
An algorithm can be efficient, but not scalable.
Background (5 of 7).
● Flow-sensitivity.
  ➢ algorithm considers control flow in the program.
● Context-sensitivity.
  ➢ algorithm considers calling context of a function.
● Field-sensitivity.
  ➢ algorithm separates individual fields of an aggregate, from each other and from the aggregate itself.
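To make the flow-sensitivity dimension concrete, here is a minimal sketch that compares the two treatments on straight-line code. It assumes a toy statement encoding (`"addr"` for `p = &a`, `"copy"` for `p = q`) and handles neither branches, loops, nor dereferences; the function names are hypothetical.

```python
def flow_insensitive(stmts):
    """Ignore statement order: apply all statements until nothing changes."""
    pts = {}
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            new = {rhs} if kind == "addr" else pts.get(rhs, set())
            old = pts.setdefault(lhs, set())
            if not new <= old:
                old |= new
                changed = True
    return pts

def flow_sensitive(stmts):
    """Follow statement order, overwriting the target on each assignment."""
    pts = {}
    for kind, lhs, rhs in stmts:
        pts[lhs] = {rhs} if kind == "addr" else set(pts.get(rhs, set()))
    return pts

# p = &a; q = p; p = &b
stmts = [("addr", "p", "a"), ("copy", "q", "p"), ("addr", "p", "b")]
insensitive = flow_insensitive(stmts)
sensitive = flow_sensitive(stmts)
```

At the end of this snippet the flow-sensitive result keeps `p` and `q` apart, while the flow-insensitive result lets both point to `a` and `b`.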
Background (6 of 7).
● Unification-based.
  ➢ algorithm merges equivalence classes of variables in an assignment.
  ➢ less storage requirement.
  ➢ fast.
  ➢ low precision.
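The merging step can be sketched with a union-find structure. This is a simplified illustration in the spirit of unification-based analysis, not the algorithm of any specific paper: it assumes only two statement forms (`p = &a` and `p = q`), and the class `Unifier` and its method names are hypothetical.

```python
class Unifier:
    def __init__(self):
        self.parent = {}   # union-find parent links
        self.pointee = {}  # class representative -> node that class points to

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Node that x's class points to; a placeholder is created if absent."""
        r = self.find(x)
        if r not in self.pointee:
            self.pointee[r] = ("tmp", r)
        return self.find(self.pointee[r])

    def unify(self, x, y):
        """Merge two classes and, recursively, whatever they point to."""
        x, y = self.find(x), self.find(y)
        if x == y:
            return
        self.parent[y] = x
        px, py = self.pointee.get(x), self.pointee.pop(y, None)
        if px is None and py is not None:
            self.pointee[x] = py
        elif px is not None and py is not None:
            self.unify(px, py)

    def address_of(self, p, a):  # p = &a
        self.unify(self.target(p), a)

    def copy(self, p, q):        # p = q
        self.unify(self.target(p), self.target(q))

    def points_to(self, p):
        r = self.find(p)
        if r not in self.pointee:
            return set()
        t = self.find(self.pointee[r])
        return {v for v in list(self.parent)
                if isinstance(v, str) and self.find(v) == t}

# p = &a; q = &b; p = q
u = Unifier()
u.address_of("p", "a")
u.address_of("q", "b")
u.copy("p", "q")
```

Because `p = q` merges the two target classes, `q` ends up pointing to both `a` and `b` even though nothing was ever assigned to `q` from `p`. That is the precision traded for speed and storage.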
Background (7 of 7).
● Inclusion-based (or subset-based or constraint-based).
  ➢ algorithm processes assignments directionally and each symbol is represented by a single node.
  ➢ more storage requirement.
  ➢ slower.
  ➢ high precision.
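Directional processing can be sketched as a fixed-point computation over subset constraints. The version below is a deliberately naive solver written for readability, assuming a toy statement encoding with four forms; practical implementations use worklists and constraint graphs. The function name `inclusion_based` is hypothetical.

```python
from collections import defaultdict

def inclusion_based(stmts):
    """Naive fixed-point solver for inclusion (subset) constraints."""
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":      # lhs = &rhs
                updates = [(lhs, {rhs})]
            elif kind == "copy":    # lhs = rhs
                updates = [(lhs, set(pts[rhs]))]
            elif kind == "load":    # lhs = *rhs
                updates = [(lhs, set(pts[t])) for t in list(pts[rhs])]
            elif kind == "store":   # *lhs = rhs
                updates = [(t, set(pts[rhs])) for t in list(pts[lhs])]
            else:
                raise ValueError(kind)
            for dst, new in updates:
                if not new <= pts[dst]:
                    pts[dst] |= new
                    changed = True
    return {v: s for v, s in pts.items() if s}

# p = &a; q = &b; p = q
result = inclusion_based(
    [("addr", "p", "a"), ("addr", "q", "b"), ("copy", "p", "q")]
)
```

The assignment `p = q` only adds the targets of `q` to `p`, so `q` still points to `b` alone.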
Representative papers (1 of 4).
● Choi et al, Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993.
● Andersen, PhD Thesis, 1994.
● Burke et al, Flow-insensitive interprocedural alias analysis in the presence of pointers, LCPC 1995.
● Reps et al, Precise interprocedural dataflow analysis via graph reachability, POPL 1995.
Representative papers (2 of 4).
● Steensgaard, Points-to analysis in almost linear time, POPL 1996.
● Ghiya et al, Is it a tree, DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C, POPL 1996.
● Hind et al, Which pointer analysis should I use?, ISSTA 2000.
Representative papers (3 of 4).
● Cheng et al, Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation, PLDI 2000.
● Liang et al, Evaluating the precision of static reference analysis using profiling, ISSTA 2002.
● Whaley et al, Cloning-based context-sensitive pointer alias analysis using binary decision diagrams, PLDI 2004.
Representative papers (4 of 4).
● Raman et al, Recursive data structure profiling, MSP 2005.
● Lattner et al, Making context sensitive points-to analysis with heap-cloning practical for the real world, PLDI 2007.
Discussion: similarities, differences.
● Flow-sensitive: Choi93, Ghiya96, Reps95, Whaley04.
● Context-sensitive: Andersen94, Cheng00, Ghiya96, Lattner07, Whaley04.
● Field-sensitive: Cheng00, Lattner07, Whaley04.
● Unification-based: Steensgaard96, Lattner07.
● Inclusion-based: Andersen94, Cheng00, Whaley04.
Discussion: trends (1 of 2).
● Recursion is handled using strongly-connected components.
● A recursive data structure is represented using a single representative node.
● Stack pointers are often treated in a different manner than heap pointers.
● For better precision, inclusion-based analyses are preferred. For better efficiency, unification-based analyses are preferred.
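The first trend, handling recursion through strongly-connected components, can be illustrated by collapsing cycles in a call graph. The sketch below uses Tarjan's algorithm; the call graph and function names are hypothetical, and the recursive formulation is only suitable for small graphs.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each function to the functions it calls."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop its members off the stack.
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# f and g are mutually recursive, so they form one component.
call_graph = {"main": ["f"], "f": ["g"], "g": ["f", "h"], "h": []}
sccs = strongly_connected_components(call_graph)
```

The components come out callees first, so an analysis can treat each group of mutually recursive functions as one unit and process the units in that order.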
Discussion: trends (2 of 2).
● Flow-sensitivity does not improve precision significantly because pointers are typically not reassigned, and when they are, they point to another part of the same data structure, which is represented as a whole by a single node.
● Graph algorithms typically involve three phases: intraprocedural, bottom-up, and top-down.
● A single level of context-sensitivity proves sufficiently precise and efficient.
Discussion.
● Most of the papers differ in the techniques used to solve the pointer analysis problem.
● Representation of alias information differs a lot across techniques.
  ➢ matrices: Ghiya96.
  ➢ graphs: Das00, Lattner07, Raman05, Reps95, Steensgaard96.
  ➢ access-paths: Cheng00.
  ➢ ordered binary decision diagrams: Whaley04.
Directions for research (1 of 4).
● Complex data structures.
  ➢ most algorithms do not handle them well.
  ➢ occur when large hash tables, dictionaries, symbol tables form the main data structure of a program.
  ➢ need to characterize complexity of a data structure.
  ➢ need an adaptive algorithm that depends on that complexity.
Directions for research (2 of 4).
● Out-of-order execution for multithreaded programs.
  ➢ some research done for multithreaded programs.
  ➢ none of the papers discusses the effect of out-of-order execution of instructions on aliases in multithreaded programs.
  ➢ instructions may be reordered by compiler or hardware.
Directions for research (3 of 4).
● Combination of techniques.
  ➢ no single existing technique is best in all aspects.
  ➢ hybrid approaches are necessary.
  ➢ one way is to combine static pointer analysis with dynamic profile information.
  ➢ another way is to use an adaptive algorithm that internally uses different existing sub-algorithms.
Directions for research (4 of 4).
● Representation of alias information.
  ➢ history tells us that a different representation of alias information has often led to new algorithms.
  ➢ finding novel ways to represent aliases can be an interesting area of research.