Applied Topology
Instructor: Sara Kališnik Verovšek
Office Hours: Room 304, Tu/Th 4-5 pm or by appointment.
We will not follow a single textbook for the entire course.
For point-set topology, see Notes on Introductory Point-Set Topology by A.
For algebraic topology, see Algebraic Topology by A. Hatcher.
For algebra, see Abstract Algebra by Dummit and Foote.
For various topics from applied topology, see either H. Edelsbrunner & J.
Harer’s Computational Topology or Robert Ghrist’s Elementary Applied
Topology. More than these, we will use Gunnar Carlsson’s writeup, Topological
Pattern Recognition for Point Cloud Data.
I will post materials on the course website:
Your course grade will be based on:
• Problem sets assigned every other week (50%);
• Final Project (40%);
• Class Participation (10%).
Homework is due at the beginning of class. Late homework will not be
accepted without an official note. You are expected to hand in your own
write-up of each homework assignment, even if you worked with others.
Course Schedule (topics subject to change)
Week of 9/5: Introductory lecture.
Week of 9/12: Homeomorphisms, closed and open sets in $\mathbb{R}^n$, compactness,
metric spaces.
Week of 9/19: Simplicial complexes. Problem set 1 due 9/22.
Week of 9/26: Groups, rings. Homotopy groups, homology groups.
Week of 10/3: Vietoris-Rips, Čech, witness, and α-complexes, persistence
vector spaces. Problem set 2 due 10/6.
Week of 10/10: Classification of persistence vector spaces, algorithm to
compute barcodes and persistence diagrams.
Week of 10/17: Examples: Image processing, neuroscience, viral evolution.
Final project proposal due by the end of the week. Problem set 3 due 10/20.
Week of 10/24: Stability theorems, metrics on barcode spaces, coordinates on
barcode spaces.
Week of 11/1: Zigzag persistence. Problem set 4 due 11/3.
Week of 11/7: Sensor networks and levelset zigzag persistence.
Week of 11/14: Multidimensional persistence. Problem set 5 due 11/17.
Week of 11/21: Mapping methods, connection with machine learning. Final
project presentation draft is due on Tuesday, 11/22.
Week of 11/28, 12/5: Presentations of final projects. Final project due 12/7 at
This is largely a survey talk inspired by the Introductory Lecture for Math 149
taught at Stanford in 2014 and the following survey papers:
• Gunnar Carlsson, Topology and Data, 2008
• Robert Ghrist, Barcodes: The Persistent Topology of Data, 2008
• Gunnar Carlsson, Topological Pattern Recognition for Point Cloud Data,
Motivation: Data analysis
An important feature of modern science and engineering is that data of various
kinds is being produced at an unprecedented rate (gene expression data,
Twitter’s/Facebook’s ‘social graph’).
It is often given in the form of point clouds in $\mathbb{R}^n$.
We have problems analyzing this data because it is often
• given in the form of very long vectors, where not all coordinates are
• very high-dimensional,
• noisy.
Goal of topological data analysis:
Leverage machinery of algebraic topology to develop tools for
studying ‘qualitative’ features of data.
Shape of Data
Linear Regression
Breast Cancer Study [Nicolau, Levine, Carlsson 2011]
A pure branch of mathematics that dates back to the 1700s.
Euler in Königsberg
Königsberg was a city in Prussia situated on the Pregel river (modern-day
Kaliningrad, a major industrial center of western Russia). Seven bridges
spanned the various branches of the river as depicted in the picture.
Is it possible to cross all seven bridges exactly once and return to the starting
point in a single stroll?
What is topology?
Why Topology?
Three key ideas:
• Invariance under deformation
• Coordinate freeness
• Compressed representations
How to deal with shape?
Two tasks:
• Measure Shape
• Represent Shape
Persistent Homology
Homology is a formalism for measuring shape...
Figure: three shapes. The first has $b_1 = 1$ and $b_2 = 0$; what are $b_1$ and $b_2$
for the other two?
$b_i$ is the $i$-th Betti number; it counts the number of ‘$i$-dimensional holes’.
Figure: the three shapes have $(b_1, b_2) = (1, 0)$, $(0, 1)$, and $(2, 1)$ respectively.
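As a rough illustration of how Betti numbers can be computed, here is a minimal sketch for a simplicial complex of dimension at most 2, using ranks of boundary matrices over the rationals. The function name `betti_numbers` and the two example complexes (a hollow triangle as a model of the circle, a hollow tetrahedron as a model of the sphere) are illustrative choices, not part of the lecture.

```python
import numpy as np

def betti_numbers(n_vertices, edges, triangles):
    """Return (b0, b1, b2) of a simplicial complex of dimension <= 2."""
    edge_index = {tuple(sorted(e)): j for j, e in enumerate(edges)}
    # Boundary matrix d1 sends an edge [u, v] (u < v) to v - u.
    d1 = np.zeros((n_vertices, len(edges)))
    for j, e in enumerate(edges):
        u, v = sorted(e)
        d1[u, j], d1[v, j] = -1.0, 1.0
    # Boundary matrix d2 sends [a, b, c] to [b, c] - [a, c] + [a, b].
    d2 = np.zeros((len(edges), len(triangles)))
    for j, tri in enumerate(triangles):
        a, b, c = sorted(tri)
        d2[edge_index[(b, c)], j] = 1.0
        d2[edge_index[(a, c)], j] = -1.0
        d2[edge_index[(a, b)], j] = 1.0
    rank1 = np.linalg.matrix_rank(d1) if len(edges) else 0
    rank2 = np.linalg.matrix_rank(d2) if len(triangles) else 0
    b0 = n_vertices - rank1                 # connected components
    b1 = (len(edges) - rank1) - rank2       # dim ker d1 - rank d2
    b2 = len(triangles) - rank2             # dim ker d2
    return int(b0), int(b1), int(b2)

# Hollow triangle: a combinatorial circle.
circle = betti_numbers(3, [(0, 1), (1, 2), (0, 2)], [])
# Hollow tetrahedron: a combinatorial sphere.
sphere = betti_numbers(
    4,
    [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)],
    [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)],
)
print(circle, sphere)
```

The sketch also returns $b_0$, the number of connected components, which the figure does not list.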
The extension of homology to more general settings, including point clouds,
is called persistent homology.
The concept emerged independently in the work of Frosini, Ferri, and
collaborators in Bologna, Italy, of Robins at Boulder, Colorado, and of
Edelsbrunner, Letscher and Zomorodian at Duke, North Carolina.
A finite metric space X has no interesting topology.
Naive Idea
Let $U(X, R)$ be the union of balls of radius $R$ centered at the points of $X$. For
any $R > 0$ and $i \ge 0$, the $i$-th Betti number of $U(X, R)$ gives us a qualitative
descriptor of $X$.
Figure: $U(X, R)$ at two radii, with $b_0 = 1$, $b_1 = 2$ and $b_0 = 1$, $b_1 = 1$ respectively.
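To see how strongly this descriptor depends on the radius, the following sketch computes only $b_0$ of $U(X, R)$, using the fact that two balls of radius $R$ meet exactly when their centers are at distance at most $2R$. The function name `b0_union_of_balls` and the four sample points are made up for illustration.

```python
import numpy as np

def b0_union_of_balls(points, radius):
    """Count connected components of the union of closed balls of a given radius."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Balls around points i and j overlap iff the centers are within 2 * radius.
            if np.linalg.norm(points[i] - points[j]) <= 2 * radius:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
counts = [b0_union_of_balls(X, R) for R in (0.25, 0.6, 2.5)]
print(counts)  # [4, 2, 1]
```

The same four points look like four, two, or one component depending on $R$.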
Problems with this descriptor
• No canonical choice of R.
• Invariant is unstable with respect to perturbation of data or small changes
in R.
• Does not distinguish ‘small’ holes from ‘big’ ones.
• Consider not only a single reconstruction $U(X, R)$ of $X$, but a 1-parameter
family of reconstructions
$F(X) = \{U(X, r)\}_{r \in [0, \infty)}$
and inclusion maps $U(X, r) \hookrightarrow U(X, r')$ whenever $r \le r'$.
• Apply the $i$-dimensional homology functor $H_i$ with field coefficients.
• Obtain a family of vector spaces $\{V_r\}_r$ and linear maps between them.
Call such algebraic structures persistence vector spaces.
Can we classify persistence vector spaces that arise from filtrations up to
isomorphism?
Yes, by barcodes.
(Computing Persistent Homology, Gunnar Carlsson and Afra J. Zomorodian)
Barcode for $H_1$:
For each interval:
• The left endpoint is the index at which the hole is born.
• The right endpoint is the index at which the hole dies.
• The length of the interval is the lifetime of the hole in the filtration.
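The birth/death bookkeeping is easiest to see in dimension 0, where the ‘holes’ are connected components. The sketch below computes the $H_0$ barcode of the union-of-balls filtration with a union-find structure: every component is born at radius 0, and a bar dies at half the distance at which two components merge. This is not the general algorithm from the cited paper (which handles all dimensions via matrix reduction); the function name `h0_barcode` and the sample points are illustrative.

```python
import itertools
import math

def h0_barcode(points):
    """Return the H0 bars (birth, death) of the union-of-balls filtration."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process pairs in order of increasing distance (Kruskal-style).
    pairs = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    bars = []
    for d, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            # Balls of radius r around two points meet once r >= d / 2.
            bars.append((0.0, d / 2))
    bars.append((0.0, math.inf))  # one component survives forever
    return bars

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
bars = h0_barcode(points)
print(bars)  # [(0.0, 0.5), (0.0, 0.5), (0.0, 2.0), (0.0, inf)]
```

The two short bars record the merging of nearby points into two clusters; the longer bar of length 2.0 records that the two clusters stay separate over a wide range of radii.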
Applications of Persistent Homology
Natural Scene Statistics/Image Processing
(Local structure of spaces of natural images by G. Carlsson, Vin de Silva, T.
Ishkhanov and A. Zomorodian)
A long time ago in a country far far away (the Netherlands) J. van Hateren and
A. van der Schaaf were taking photos in a town called Groningen and in the
surrounding countryside.
An image taken by a black-and-white digital camera can be viewed as a vector,
with one coordinate for each pixel.
A typical camera uses tens of thousands of pixels, so images lie in a very
high-dimensional pixel space, $\mathbb{R}^P$.
David Mumford: What can be said about the set of images $I \subseteq P$ lying within
$\mathbb{R}^P$? Can it be modeled as a submanifold or a subspace of $\mathbb{R}^P$?
David Mumford gave a great deal of thought to questions such as this one
concerning natural image statistics, and he came to the conclusion that
although the above argument indicates that the whole manifold of images is
not accessible in a useful way, a space of small image patches might in fact
contain quite useful information.
Solution: observe 3 × 3 patches.
Pre-processing the Dataset:
A preliminary observation is that patches which are constant, or rather nearly
constant, will predominate among these patches.
These do not carry interesting structure, so Lee, Mumford, and Pedersen focus
on high-contrast patches. They
• Mean-center the data. This means that if a patch is obtained from another
patch by adding a constant value, i.e. ‘turning up the brightness knob’, then
the two patches will be regarded as the same.
• Normalize the D-norm. This means that if one patch is obtained from another
by ‘turning the contrast knob’, then the two patches will be regarded as
identical.
The result of this construction is a database $M$ of ca. $4.5 \times 10^6$ points on a
7-sphere in $\mathbb{R}^8$.
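The pre-processing steps can be sketched as follows. Several things here are assumptions rather than details from the slides: the D-norm is taken to be the square root of the summed squared differences between horizontally and vertically adjacent pixels, the cutoff `keep_fraction=0.2` for ‘high contrast’ is a hypothetical parameter, a random array stands in for a natural image, and any log-intensity transform is omitted.

```python
import numpy as np

def d_norm(patches):
    """Contrast norm of 3x3 patches given as rows of length 9 (assumed definition)."""
    grids = patches.reshape(-1, 3, 3)
    horizontal = np.diff(grids, axis=2) ** 2   # differences between left/right neighbors
    vertical = np.diff(grids, axis=1) ** 2     # differences between up/down neighbors
    return np.sqrt(horizontal.sum(axis=(1, 2)) + vertical.sum(axis=(1, 2)))

def preprocess(image, keep_fraction=0.2):
    """Extract 3x3 patches, keep high-contrast ones, mean-center and normalize."""
    h, w = image.shape
    patches = np.array([
        image[r:r + 3, c:c + 3].ravel()
        for r in range(h - 2) for c in range(w - 2)
    ])
    # 'Brightness knob': subtract each patch's mean.
    patches = patches - patches.mean(axis=1, keepdims=True)
    norms = d_norm(patches)
    cutoff = np.quantile(norms, 1.0 - keep_fraction)
    keep = (norms >= cutoff) & (norms > 0)
    # 'Contrast knob': rescale each kept patch to unit D-norm.
    return patches[keep] / norms[keep, None]

rng = np.random.default_rng(0)
image = rng.random((32, 32))   # synthetic stand-in for a natural image
M = preprocess(image)
print(M.shape)
```

Each row of `M` has mean zero, so it lies in an 8-dimensional subspace of $\mathbb{R}^9$, and it has unit D-norm. In these raw pixel coordinates the unit D-norm surface is an ellipsoid; a change of basis would be needed to make it a round sphere.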
to obtain some understanding of how this set sits within $S^7$.
(A large $k$ corresponds to a smoothed-out notion of density, and a small $k$
corresponds to a version which carries more of the detailed structure of the
data set.)
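One common density proxy with such a parameter $k$ is the distance from a point to its $k$-th nearest neighbor, which is small in dense regions and large in sparse ones. The sketch below assumes that this is the estimator meant here; the function name `knn_codensity` and the toy data are illustrative.

```python
import numpy as np

def knn_codensity(X, k):
    """Distance from each point to its k-th nearest neighbor (large = sparse region)."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # After sorting each row, column 0 is the point itself at distance 0.
    return np.sort(dists, axis=1)[:, k]

X = np.array([[0.0], [0.1], [0.2], [5.0]])
codensity = knn_codensity(X, k=1)
print(codensity)
```

The isolated point at 5.0 receives by far the largest value, so thresholding on this quantity keeps only the dense core of the data.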
Explanation: the highest-density patches consist of discrete versions of
linear functions in two variables.
Is there a larger 2-dimensional space containing the three circle model,
occurring with substantial density?
Klein Bottle
J. Perea, G. Carlsson: Compression based on the Klein bottle model (Kleinlets).
Baraniuk, Donoho, et al. did compression based on the primary circle
Tree of Life
• 1970s molecular phylogenetic analysis based on nucleotide and protein
• 1977 Carl Woese identifies archaea as a new domain of life
• since the 1990s a true revolution in genomic sequencing techniques, providing
hard data for evolutionary biology
How can we find out what the relationships between genomes are?
Viral Evolution (Topology of viral evolution by J.M. Chan, G. Carlsson, and R.
Representing Shape
Can one extend topological mapping methods (compressed representations)
from idealized shapes to data?
Yes. The resulting method is called Mapper and was developed by G. Singh, F.
Mémoli and G. Carlsson.
Different ways in which we can approach this problem:
• The projection pursuit method determines the linear projection onto a two- or
three-dimensional space which optimizes a certain heuristic criterion. It is
frequently very successful, and when it succeeds it produces a set in
$\mathbb{R}^2$ or $\mathbb{R}^3$.
• Multidimensional scaling begins from an arbitrary point cloud and
attempts to embed it isometrically in Euclidean space of various
dimensions with minimum distortion of the metric. Related methods include
Isomap and locally linear embedding.
In all cases, the methodologies result in a point cloud in $\mathbb{R}^2$ or
$\mathbb{R}^3$, which can be visualized by the investigator.
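For concreteness, here is a minimal sketch of classical (Torgerson) multidimensional scaling, one standard variant of the method: it double-centers the squared distance matrix and uses the top eigenvectors as coordinates. The function names and the random test data are illustrative, and other MDS variants minimize different distortion measures.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2))

def classical_mds(D, dim=2):
    """Embed points with pairwise distance matrix D into R^dim."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # Gram matrix of the centered points
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dim]       # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))                      # a point cloud in R^5
Y = classical_mds(pairwise_distances(X), dim=2)   # its 2-dimensional picture
print(Y.shape)  # (10, 2)
```

When the input distances already come from points in the plane, this procedure recovers them up to a rigid motion; for genuinely higher-dimensional data some distortion is unavoidable.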
Suppose we have a covering of a circle:
We assign a vertex to each connected component of this covering
When precisely two connected components intersect, we connect the
corresponding vertices with an edge.
When more than two intersect, we add a face of the appropriate dimension.
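The construction just described is the nerve of the covering. A small sketch, with the circle discretized as 12 points and covered by three overlapping arcs (both choices are made up for illustration):

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Return simplices (tuples of cover indices) whose sets share a common point."""
    simplices = [(i,) for i in range(len(cover))]
    for size in range(2, max_dim + 2):
        for idx in combinations(range(len(cover)), size):
            if set.intersection(*(cover[i] for i in idx)):
                simplices.append(idx)
    return simplices

# Points 0..11 placed around a circle, covered by three overlapping arcs.
arcs = [set(range(0, 5)), set(range(4, 9)), {8, 9, 10, 11, 0}]
simplices = nerve(arcs)
print(simplices)  # [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
```

Each pair of arcs overlaps but no point lies in all three, so the nerve is a hollow triangle, which has the same shape as the circle.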
Topological version of Mapper
We are given a space $X$ equipped with a continuous map $f : X \to Z$ to a
parameter space $Z$, and the space $Z$ is equipped with an open covering
$\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ for some finite indexing set $A$.
• Since $f$ is continuous, the sets $f^{-1}(U_\alpha)$ form an open covering of $X$.
• We write $f^{-1}(U_\alpha) = \bigcup_{i=1}^{j_\alpha} V(\alpha, i)$, where $j_\alpha$ is the number of connected
components of $f^{-1}(U_\alpha)$. We write $\overline{\mathcal{U}}$ for the covering of $X$ obtained by
taking these connected components.
• Represent the topological space by the nerve of $\overline{\mathcal{U}}$.
The Statistical version of Mapper
• Define a reference map $f : X \to Z$, where $X$ is the given point cloud and
$Z$ is the reference metric space.
• Select a covering $\mathcal{U}$ of $Z$.
• If $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$, then construct the subsets $X_\alpha = f^{-1}(U_\alpha)$.
• The analog of taking connected components in the point cloud world is
clustering. Clusters form a covering of $X$ parametrized by pairs $(\alpha, c)$,
where $\alpha \in A$ and $c$ is one of the clusters of $X_\alpha$.
• Construct a graph whose vertex set is the set of all such pairs
$(\alpha, c)$, and where an edge connects $(\alpha_1, c_1)$ and $(\alpha_2, c_2)$ if and only if the
corresponding clusters have a point in common.
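The steps above can be sketched in a few lines for a real-valued filter. This is a toy version under stated assumptions, not the reference implementation: the covering consists of `n_intervals` equal-length overlapping intervals, clustering is single linkage with a fixed threshold `eps`, and the names `cluster` and `mapper_graph` and all parameter values are hypothetical. The data is a slightly noisy circle with the filter $f(x) = \|x - p\|_2$, where $p$ is the leftmost sample point.

```python
import numpy as np
from itertools import combinations

def cluster(X, indices, eps):
    """Single-linkage clusters: connected components of the eps-neighborhood graph."""
    unvisited, clusters = set(int(i) for i in indices), []
    while unvisited:
        stack = [unvisited.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if np.linalg.norm(X[i] - X[j]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters

def mapper_graph(X, f, n_intervals=4, overlap=0.3, eps=0.3):
    """Return nodes (alpha, cluster) and edges between clusters sharing a point."""
    lo, hi = f.min(), f.max()
    length = (hi - lo) / n_intervals
    nodes = []
    for alpha in range(n_intervals):
        start = lo + (alpha - overlap) * length
        stop = lo + (alpha + 1 + overlap) * length
        X_alpha = np.flatnonzero((f >= start) & (f <= stop))   # f^{-1}(U_alpha)
        nodes += [(alpha, c) for c in cluster(X, X_alpha, eps)]
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u][1] & nodes[v][1]]
    return nodes, edges

rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta)])
X = X + rng.normal(scale=0.01, size=X.shape)   # small noise
p = X[np.argmin(X[:, 0])]                      # leftmost sample point
f = np.linalg.norm(X - p, axis=1)              # filter value of each point
nodes, edges = mapper_graph(X, f)
print(len(nodes), len(edges))
```

With these settings the two middle intervals each split into an upper and a lower arc, so the output graph is a single cycle: a compressed picture of the circle.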
Consider point cloud data sampled from a noisy circle in $\mathbb{R}^2$,
and the filter $f(x) = \|x - p\|_2$,
where $p$ is the leftmost point in the
Vertices are colored by the average
filter value.
The outcome of Mapper is highly dependent on the function or functions
chosen to partition the data set. Here are some important examples:
• Density
Consider any density estimator applied to a point cloud $X$. It will produce a
non-negative function on $X$, which reflects useful information about the
data set. Often, it is exactly the nature of this function which is of interest.
• Eccentricity
The basic idea is to identify points which are, in an intuitive sense, far
from the center, without actually identifying a center point. Given
$p$ with $1 \le p < \infty$, we set
$$E_p(x) = \left( \frac{\sum_{y \in X} d(x, y)^p}{N} \right)^{1/p},$$
where $x, y \in X$ and $N$ is the number of points in $X$. This function tends to
take larger values on points which are far removed from a ‘center’.
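A short numerical illustration of eccentricity, assuming $E_p(x)$ is the $p$-th power mean of the distances from $x$ to all points of $X$ and that $d$ is the Euclidean distance; the function name `eccentricity` and the data are illustrative.

```python
import numpy as np

def eccentricity(X, p=1.0):
    """E_p(x) = (mean over y of d(x, y)^p)^(1/p), with Euclidean d."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return (dists ** p).mean(axis=1) ** (1.0 / p)

X = np.array([[0.0], [1.0], [2.0], [10.0]])
E = eccentricity(X, p=1)
print(E)  # [3.25 2.75 2.75 6.75]
```

The outlying point at 10 has the largest eccentricity, and the two points in the middle of the main group have the smallest, even though no center was ever computed.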
The Miller-Reaven diabetes study
G.M. Reaven and R.G. Miller conducted a diabetes study at Stanford in the
145 patients were included and six quantities were measured: age, relative
weight, fasting plasma glucose, area under the plasma glucose curve for the
three-hour oral glucose tolerance test (OGTT), area under the plasma insulin
curve for the OGTT, and steady-state plasma glucose response.
If we take the filter to be a density estimator, we get the following
representations for two different resolutions:
Red is indicative of high density, and blue of low. The size of the node and the
number indicate the size of the cluster.
Breast cancer data
What should the filter be?
• Take linear combinations of normal expression data and denote the
subspace they span by $N$.
• Decompose the original data vector $\vec{T}$ into the normal-like expression $Nc.\vec{T}$,
which is the projection onto $N$.
• The disease, i.e. the deviation $Dc.\vec{T}$ from normal-like expression, is defined to be
the difference between diseased tissue expression and normal-like
The family of functions we take as filters is
$$f_{p,k}(\vec{V}) = \left[ \sum_{r=1}^{s} |g_r|^p \right]^{k/p},$$
where $\vec{V} = \langle g_1, g_2, \ldots, g_s \rangle$ and the coordinates $g_r$ are individual genes.
If $k = 1$ and $p = 2$, the function computes the standard (Euclidean) norm of a vector.
Essentially, all these different filter functions, fp,k , measure the overall amount
of deviation from the normal state.
The effect of the different choices of p determining the choice of Lp norm is
that, for larger values of p the weight of genes with larger expression levels is
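A toy sketch of this filter construction follows. It assumes that ‘projection onto $N$’ means the orthogonal (least-squares) projection onto the span of the normal samples and that the filter is $f_{p,k}(\vec{V}) = (\sum_r |g_r|^p)^{k/p}$; the function names and the four-gene example are invented for illustration and are not data from the study.

```python
import numpy as np

def normal_component(normal_samples, T):
    """Least-squares projection of T onto the span of the normal samples (rows)."""
    coeffs, *_ = np.linalg.lstsq(normal_samples.T, T, rcond=None)
    return normal_samples.T @ coeffs

def f_pk(V, p=2.0, k=1.0):
    """Filter value: (sum of |g_r|^p) raised to the power k / p."""
    return (np.abs(V) ** p).sum() ** (k / p)

# Two hypothetical normal-tissue profiles over four genes, and one tumor profile.
normal_samples = np.array([[1.0, 0.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0, 0.0]])
T = np.array([3.0, 4.0, 1.0, 2.0])

Nc = normal_component(normal_samples, T)   # normal-like part of T
Dc = T - Nc                                # deviation from normal
value = f_pk(Dc, p=2, k=1)
print(Nc, Dc, value)
```

Here the normal-like part is $(3, 4, 0, 0)$, the deviation is $(0, 0, 1, 2)$, and the filter with $p = 2$, $k = 1$ returns its Euclidean length $\sqrt{5}$.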
Both ER+ tumors (Estrogen Receptor positive) showed a 100% survival rate,
with no recurrence or death from the disease.
Clustering versus Mapper