An Introduction to Topological Data Analysis
Elizabeth Munch
Duke University :: Dept of Mathematics
Tuesday, June 11, 2013
Elizabeth Munch (Duke)
CompTop-SUNYIT
Tuesday, June 11, 2013
1 / 43
“There is no branch of mathematics, however abstract, which may not
some day be applied to phenomena of the real world.”
-Nikolai Lobachevsky
Finding Structure in Data Sets
[Figure: point cloud data set plotted in the plane]
Outline
1. Homology & Persistent Homology
2. Statistics
3. Behavioral Clustering and Future Work
Homology & Persistent Homology
What is Homology?
A topological invariant which assigns a sequence of vector spaces, $H_k(X)$,
to a given topological space $X$.
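As a concrete illustration of the definition, the following sketch computes the dimensions of the spaces $H_k$ (the Betti numbers) for a small simplicial complex, using coefficients in Z/2. The helper names `rank_mod2` and `betti_numbers` and the triangle example are illustrative assumptions and do not come from the talk.

```python
from itertools import combinations


def rank_mod2(rows):
    """Rank over Z/2 of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)


def betti_numbers(simplices):
    """Betti numbers over Z/2 of a complex given as a list of vertex tuples.

    The list is assumed to be closed under taking faces, without repeats.
    """
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))
    top = max(by_dim)
    # ranks[k] = rank of the boundary map C_k -> C_{k-1}; zero at both ends
    ranks = {0: 0, top + 1: 0}
    for k in range(1, top + 1):
        index = {face: i for i, face in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):  # faces with k vertices
                mask |= 1 << index[face]
            rows.append(mask)
        ranks[k] = rank_mod2(rows)
    # dim H_k = dim C_k - rank(d_k) - rank(d_{k+1})
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]


hollow_triangle = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
filled_triangle = hollow_triangle + [(0, 1, 2)]
print(betti_numbers(hollow_triangle))  # [1, 1]
print(betti_numbers(filled_triangle))  # [1, 0, 0]
```

The hollow triangle has one connected component and one loop; filling it in kills the loop.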
What is Persistent Homology?
A way to watch how the homology of a filtration (sequence) of topological
spaces changes so that we can understand something about the space.
Filtration
Given a topological space $K$, a filtration
\[ K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n \]
gives a sequence of maps on homology
\[ H_2(K_0) \to H_2(K_1) \to H_2(K_2) \to \cdots \to H_2(K_n). \]
Birth and Death
Birth
A class $\gamma \in H_p(K_i)$ is born at $K_i$ if it is not in the image of
$H_p(K_{i-1}) \to H_p(K_i)$.
Death
A class $\gamma$ dies entering $K_j$ if it merges with an older class.
Simplicial Complex
[Figure: two panels, "Simplices" and "Simplicial Complex"]
Boundary Matrix Data Structure
Large Data Sets
[Figure: point cloud data set plotted in the plane]
Rips complex vs Čech complex
Rips Complex
Set of points $\chi \subset \mathbb{R}^n$
Rips complex $R$
\[ \sigma \in R \iff \|v_i - v_j\| \le r \text{ for all } v_i, v_j \in \sigma \]
Čech Complex
Set of points $\chi \subset \mathbb{R}^n$
Čech complex $C$
\[ \sigma \in C \iff \bigcap_{v_i \in \sigma} B_r(v_i) \neq \emptyset \]
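The Rips condition only involves pairwise distances, which makes it easy to compute directly. The sketch below builds the Rips complex of a small planar point set up to a chosen dimension. The function name `rips_complex`, the `max_dim` cutoff, and the unit-square example are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def rips_complex(points, r, max_dim=2):
    """Simplices (as index tuples) of the Rips complex at scale r.

    A simplex is included when every pair of its vertices is at
    distance <= r, following the condition stated on the slide.
    """
    n = len(points)
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= r
    }
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k = number of vertices in the simplex
        for sigma in combinations(range(n), k):
            if all(pair in close for pair in combinations(sigma, 2)):
                simplices.append(sigma)
    return simplices


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(len(rips_complex(square, r=1.0)))  # 8: four vertices, four sides
print(len(rips_complex(square, r=1.5)))  # 14: adds two diagonals, four triangles
```

Enumerating all vertex subsets is only practical for small inputs; it is meant to mirror the definition, not to scale.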
Persistence Algorithm
[Figure: three matrices with rows and columns labeled 1 to 7]
R = D
V = I
for j = 1 to m:
    while ∃ j0 < j with low(j0) = low(j):
        add column j0 to column j
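The reduction loop above can be sketched in Python as follows. This is a minimal version under stated assumptions: coefficients are in Z/2, each column of D is stored as a set of row indices, indices are 0-based, and the bookkeeping matrix V is omitted for brevity. The function name and the seven-simplex filtration of a filled triangle are illustrative choices, not taken from the slides.

```python
def reduce_boundary_matrix(columns):
    """Standard column reduction over Z/2.

    columns[j] is the set of row indices holding a 1 in column j of D.
    Returns the reduced matrix R and the (birth, death) index pairs.
    """
    R = [set(col) for col in columns]
    owner = {}  # lowest row index -> column that already has that low
    pairs = []
    for j in range(len(R)):
        # low(j) is the largest row index in column j, if the column is nonzero
        while R[j] and max(R[j]) in owner:
            R[j] ^= R[owner[max(R[j])]]  # add column j0 to column j, mod 2
        if R[j]:
            low = max(R[j])
            owner[low] = j
            pairs.append((low, j))
    return R, pairs


# Filtration order: vertices a, b, c (0, 1, 2), edges ab, bc, ac (3, 4, 5),
# then the triangle abc (6).
columns = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
R, pairs = reduce_boundary_matrix(columns)
print(pairs)  # [(1, 3), (2, 4), (5, 6)]
```

In this example vertex 0 is never paired, so it represents the connected component that lives forever, while the loop born with edge 5 dies when triangle 6 enters.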
Visualizing Birth and Death
[Figure: birth and death of classes, with indices 1 to 7 on the axes]
Large Data Sets
[Figure: point cloud data set alongside a second plot with axes running from 0.0 to 2.5]
Wasserstein Distance on $D_p$
[Figure: diagram points labeled a, b, c, d and x, y, z]
p-Wasserstein distance for diagrams
Given diagrams $X$ and $Y$, the distance between them is
\[ W_p[L_q](X, Y) = \left( \inf_{\varphi : X \to Y} \sum_{x \in X} \left( \|x - \varphi(x)\|_q \right)^p \right)^{1/p}. \]
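For very small diagrams the infimum over matchings can be evaluated by brute force. The sketch below assumes the usual convention that any point may also be matched to the diagonal, which it models by padding each diagram with diagonal slots. The function name `wasserstein` and its defaults are illustrative, and the factorial running time makes it suitable only for toy inputs.

```python
from itertools import permutations


def wasserstein(X, Y, p=2, q=2):
    """Brute-force W_p[L_q] between two small persistence diagrams.

    Diagrams are lists of (birth, death) points. None stands for a
    diagonal slot, so any point may be matched to the diagonal.
    """
    def norm(u, v):
        if q == float("inf"):
            return max(abs(u[0] - v[0]), abs(u[1] - v[1]))
        return (abs(u[0] - v[0]) ** q + abs(u[1] - v[1]) ** q) ** (1.0 / q)

    def cost(a, b):
        if a is None and b is None:
            return 0.0  # diagonal matched to diagonal costs nothing
        if a is None or b is None:
            pt = a if b is None else b
            mid = (pt[0] + pt[1]) / 2.0  # nearest point on the diagonal
            return norm(pt, (mid, mid))
        return norm(a, b)

    A = list(X) + [None] * len(Y)
    B = list(Y) + [None] * len(X)
    best = min(
        sum(cost(a, B[k]) ** p for a, k in zip(A, perm))
        for perm in permutations(range(len(B)))
    )
    return best ** (1.0 / p)


X = [(0.0, 2.0)]
Y = [(0.0, 2.0), (1.0, 1.2)]
print(wasserstein(X, Y))  # the short-lived point of Y goes to the diagonal
```

A practical implementation would replace the permutation search with an assignment solver.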
Stability
Theorem (Cohen-Steiner, Edelsbrunner, Harer, Mileyko)
Let $X$ be a triangulable metric space that implies bounded degree $k$ total
persistence, for $k \ge 1$, and let $f, g : X \to \mathbb{R}$ be two tame Lipschitz
functions. Then
\[ W_p(f, g) \le C^{1/p} \cdot \|f - g\|_\infty^{1 - k/p} \]
for all $p \ge k$, where $C = C_X \max\{\mathrm{Lip}(f)^k, \mathrm{Lip}(g)^k\}$.
Properties of $D_p$
Mileyko, Mukherjee, Harer:
    Complete
    Separable
    Characterization of Compact Sets
Turner, Mileyko, Mukherjee, Harer:
    Non-negatively curved Alexandrov space
    Geodesics
Statistics
How do we give a summary of the data?
Will it play nicely with time varying persistence diagrams?
Fréchet mean
Given a probability space $(D_p, \mathcal{B}(D_p), P)$, the quantity
\[ \mathrm{Var}_P = \inf_{X \in D_p} \left[ F_P(X) = \int_{D_p} W_p(X, Y)^2 \, dP(Y) < \infty \right] \]
is the Fréchet variance of $P$.
The set at which the value is obtained,
\[ E(P) = \{ X \mid F_P(X) = \mathrm{Var}_P \}, \]
is the Fréchet expectation, also called Fréchet mean.
Properties of the Fréchet mean on $D_p$
Theorem (Mileyko et al.)
Let $P$ be a probability measure on $(D_p, \mathcal{B}(D_p))$ with a finite second
moment. If $P$ has compact support, then $E(P) \neq \emptyset$.
Theorem (Mileyko et al.)
Let $P$ be a tight probability measure on $(D_p, \mathcal{B}(D_p))$ with the rate of
decay at infinity $q > \max\{2, p\}$. Then $E(P) \neq \emptyset$.
Goal
Given diagrams $X_1, \ldots, X_N$, find a diagram $Y$
which minimizes the Fréchet function.
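For a finite sample, one natural reading of this goal is to take $P$ to be the empirical measure that puts mass $1/N$ on each $X_i$. That choice is an assumption made here for illustration. Under it, the Fréchet function $F_P$ from the definition of the Fréchet mean reduces to a finite sum:

```latex
\[
  F(Y) \;=\; \int_{D_p} W_p(Y, Z)^2 \, dP(Z)
       \;=\; \frac{1}{N} \sum_{i=1}^{N} W_p(Y, X_i)^2,
  \qquad
  Y^{*} \in \operatorname*{arg\,min}_{Y \in D_p} F(Y).
\]
```

The minimizer need not be unique, which is why $E(P)$ is defined as a set.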
Example
[Figure: example persistence diagrams with points labeled a, b, c, f, g, h, x, y, z]
Algorithm for Computation - Outline
Algorithm - Kate Turner et al.
1. Persistence diagrams $X_1, \ldots, X_N$
2. Pick one at random to start, $Y = X_i$
3. Repeat:
    (a) Find best matching for Wasserstein distance $W_p(Y, X_i)$
    (b) Create matching $G$ with a selection for each point $y_j \in Y$: the points $x_i \in X_i$, for each $i$, paired with $y_j$
    (c) Replace $Y$ with $\mathrm{mean}_X(G)$
[Figure: diagrams with points labeled a, b, c, f, g, h, x, y, z]
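The averaging step can be made concrete for a single selection. The sketch below assumes $p = 2$ with the Euclidean norm in the plane, and treats a selection as $k$ off-diagonal points together with $N - k$ diagrams that matched the point to the diagonal. The closed form is obtained here by minimizing the sum of squared distances to those points and to the diagonal; it is a worked derivation under these assumptions and is not quoted from the slides. The function name `selection_mean` is hypothetical.

```python
def selection_mean(points, n_diagrams):
    """Mean of one selection: k off-diagonal points plus (N - k) diagonal copies.

    Minimizes the sum of squared Euclidean distances to the k given
    (birth, death) points and to (N - k) copies of the diagonal.
    """
    k = len(points)
    if k == 0:
        return None  # every diagram matched this point to the diagonal
    mean_birth = sum(b for b, _ in points) / k
    mean_death = sum(d for _, d in points) / k
    n = n_diagrams
    birth = ((n + k) * mean_birth + (n - k) * mean_death) / (2 * n)
    death = ((n - k) * mean_birth + (n + k) * mean_death) / (2 * n)
    return (birth, death)


# Two diagrams: one contributes (0, 2), the other matches to the diagonal.
print(selection_mean([(0.0, 2.0)], 2))  # (0.5, 1.5)
# Both diagrams contribute a point: the ordinary arithmetic mean.
print(selection_mean([(0.0, 2.0), (1.0, 3.0)], 2))  # (0.5, 2.5)
```

In the first example the result is the midpoint between (0, 2) and its projection (1, 1) onto the diagonal, as expected when half of the diagrams send the point to the diagonal.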
Theorem (Turner et al.)
The algorithm terminates at a local
minimum of the Fréchet function.
The Problem with Pointwise Fréchet means
[Figure: panels labeled 1 and 2, with diagram points a, b, u, v, x, y]
The Solution
Instead of a single diagram, return a distribution on
diagrams!
Main Idea:
1. Perturb the diagrams and compute the matching.
2. Do this repeatedly to get a distribution on matchings.
3. Associate the probability of getting a particular
   matching to the mean of the matching.
Limit to $S_{M,K} \subset D_p$: inside triangle of height $M$ with at most $K$
off-diagonal points.
The Drawing Procedure - Make $X_i'$
Pick $\alpha$.
Pick distribution $\eta_x$ with mean at $x$ and support contained in $B_\alpha(x)$.
For each $x \in X_i$, make $X_i'$ by:
1. Draw point from $\eta_x$.
2. If contained in $B_{\|x - \Delta\|}(x)$, add it to $X_i'$.
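The two steps of the drawing procedure can be sketched as follows. The slide leaves the distribution $\eta_x$ open, so this sketch assumes the uniform distribution on the disk of radius $\alpha$ around $x$, which has mean $x$ and support inside $B_\alpha(x)$. It also assumes the Euclidean norm, so that $\|x - \Delta\|$ is the distance from $x$ to the diagonal. The function name `perturb_diagram` is hypothetical.

```python
import math
import random


def perturb_diagram(diagram, alpha, rng=random):
    """One draw of X_i' from a diagram X_i given as (birth, death) points."""
    perturbed = []
    for b, d in diagram:
        # Assumed eta_x: uniform on the disk of radius alpha centered at x.
        radius = alpha * math.sqrt(rng.random())
        angle = 2.0 * math.pi * rng.random()
        nb = b + radius * math.cos(angle)
        nd = d + radius * math.sin(angle)
        # ||x - Delta||: Euclidean distance from x to the diagonal.
        to_diagonal = (d - b) / math.sqrt(2.0)
        # Keep the draw only if it lies in the open ball B_{||x - Delta||}(x).
        if math.hypot(nb - b, nd - d) < to_diagonal:
            perturbed.append((nb, nd))
    return perturbed


rng = random.Random(0)  # seeded generator so the draw is reproducible
print(perturb_diagram([(0.0, 10.0), (1.0, 1.1)], alpha=0.5, rng=rng))
```

Points far from the diagonal always survive, while points whose distance to the diagonal is smaller than $\alpha$ are sometimes dropped, so low-persistence points appear in only a fraction of the draws.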
The Drawing Procedure - Determine Matching
[Figure: the original diagrams (points a, b, c, f, g, h, x, y, z) and their perturbed copies (b', f', g', h', x', y', z'), shown with distance matrices whose rows and columns are indexed by diagram points and by copies of the diagonal Δ]
The Drawing Procedure - Create Distribution
[Figure: panels labeled 1 and 2, with diagram points a, b, u, v, x, y]
Definition
\[ \mu_X = \sum_G P(G) \, \delta_{\mathrm{mean}_X(G)} \]
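One way to approximate this distribution in practice is to repeat the drawing procedure many times and count how often each mean diagram appears. The sketch below shows only that counting step. Estimating $P(G)$ by relative frequency is an assumption made here, and the function name and the string placeholders standing in for mean diagrams are hypothetical.

```python
from collections import Counter


def empirical_distribution(sampled_means):
    """Estimate mu_X from repeated draws.

    sampled_means holds one hashable value mean_X(G) per draw. The weight
    of each distinct value estimates the total probability P(G) of the
    matchings that produce it.
    """
    counts = Counter(sampled_means)
    total = sum(counts.values())
    return {mean: count / total for mean, count in counts.items()}


# Placeholder labels stand in for the mean diagrams of two matchings.
draws = ["mean of G1", "mean of G2", "mean of G1", "mean of G1"]
print(empirical_distribution(draws))  # {'mean of G1': 0.75, 'mean of G2': 0.25}
```

Each key plays the role of a point mass $\delta_{\mathrm{mean}_X(G)}$ and each value the role of its weight.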
Continuity
Theorem (Munch et al.)
The map
\[ S_{M,K} \times \cdots \times S_{M,K} \to \mathcal{P}(S_{M,K}), \qquad (X_1, \ldots, X_N) \mapsto \mu_X \]
is continuous.
Examples
Behavioral Clustering and Future Work
How do we automate behavioral analysis?
Image: Wikipedia
Behavior Vectors
Main Idea:
Quantify a particular behavior as a vector in $\mathbb{R}^D$ and cluster these
points.
Example:
Loop time vector: Put an entry into a list for each loop in a track, sort,
and compare.
Black: (6.52, 0, 0, 0)
Purple: (6.63, 0, 0, 0)
Orange: (5.01, 4.85, 0, 0)
Red: (8.20, 6.46, 0, 0)
Blue: (21.17, 17.60, 13.26, 7.54)
Green: (15.40, 11.78, 8.28, 7.12)
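The loop time vectors listed above can be built and compared with a few lines of Python. The descending sort and zero padding are inferred from the listed examples; truncating to a fixed dimension and comparing with Euclidean distance are assumptions made for this sketch, and the function name `loop_time_vector` is hypothetical.

```python
from math import dist


def loop_time_vector(loop_times, dimension):
    """Sort loop times from longest to shortest and pad with zeros."""
    ordered = sorted(loop_times, reverse=True)[:dimension]
    return tuple(ordered + [0.0] * (dimension - len(ordered)))


orange = loop_time_vector([4.85, 5.01], 4)
red = loop_time_vector([6.46, 8.20], 4)
black = loop_time_vector([6.52], 4)
print(orange)  # (5.01, 4.85, 0.0, 0.0)
# Orange has two loops, like Red, so it sits closer to Red than to Black.
print(dist(orange, red) < dist(orange, black))  # True
```

The resulting tuples are points in $\mathbb{R}^D$ and can be passed to any standard clustering routine.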
Dendrograms
Experimental Results
Experiment
Tracks generated using open-source SUMO software.
Additional tracks forced to make small circles, larger circles, and
figure eights.
Loops computed by:
1. checking for crossings, looking at time needed to complete loop.
2. computing persistence, using lifetime of class as score.
Experimental Results - NonPersistence
[Figure: results for four track types: Random, Small Loop, Large Loop, Figure 8]
Experimental Results - Persistence
[Figure: results for four track types: Random, Small Loop, Large Loop, Figure 8]
Group Formation
Image: Wikipedia
Acknowledgements
Funding:
DDR&E
Honors:
Jo Rae Wright Fellowship for Outstanding Women in Science, 2012-2013
Collaborators:
John Harer
Paul Bendich
Kate Turner (UChicago)
Sayan Mukherjee
Jonathan Mattingly
Ingrid Daubechies
Robert Calderbank
Peter Sang Chin (APL/JHU)