SAMSI
UNC, Stat & OR
OODA of Tree Structured Objects:
Folded Euclidean Approach
S. Skwerer, J. S. Marron, S. Provan
Dept. of Statistics and Operations Research
May 25, 2017
Outline
Motivation
Modeling and Analysis
Phylogenetics
Anatomy
n-trees and Folded Euclidean tree space
Averaging Trees
Smoothing brain artery system data
Open Problem
Biology Application:
Phylogenetic Trees
(figure: two FISH GENUS tree panels, with numeric labels 7, 5 and 6, 6)
phylogenetic history of species
edge lengths represent genetic change
Medical Application:
Blood Vessel Tree Data
Motivating Example: Blood Vessel Trees in Brains
From Dr. Elizabeth Bullitt, Dept. of Neurosurgery, UNC
Graph
Tree
Labeled n-Trees
Leaves: fixed a priori, with labels {0, 1, ..., n}
Internal (non-leaf) vertices: degree ≥ 3
Each edge e has a nonnegative length |e|
(figure: a 5-tree with root and edge lengths |e1| = 3, |e2| = 4, |e3| = 6)
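The conditions on this slide can be checked mechanically. The following is a minimal sketch, assuming a tree is given as a list of `(u, v, length)` edges in which the integers 0..n are the leaves and any other value names an internal vertex. The function name `is_valid_n_tree` and this input format are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict


def is_valid_n_tree(edges, n):
    """Check the labeled n-tree conditions for a list of (u, v, length) edges.

    Assumed encoding (hypothetical): leaves are the integers 0..n, and any
    other hashable value names an internal vertex.
    """
    adjacency = defaultdict(set)
    for u, v, length in edges:
        if length < 0:  # edge lengths must be nonnegative
            return False
        adjacency[u].add(v)
        adjacency[v].add(u)

    vertices = set(adjacency)
    leaves = set(range(n + 1))
    if not leaves <= vertices:  # every label 0..n must appear
        return False

    # A tree is connected and has exactly |V| - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    seen, stack = {0}, [0]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    if seen != vertices:
        return False

    # Leaves have degree 1; internal vertices have degree >= 3.
    return all(
        len(adjacency[v]) == 1 if v in leaves else len(adjacency[v]) >= 3
        for v in vertices
    )


# A 3-tree with two internal vertices "a" and "b" (made-up lengths).
three_tree = [(0, "a", 1.0), (1, "a", 2.0), ("a", "b", 3.0), (2, "b", 1.0), (3, "b", 1.0)]
# Internal vertex "a" has degree 2 here, which the definition rules out.
degree_two = [(0, "a", 1.0), ("a", "b", 1.0), (1, "b", 1.0), (2, "b", 1.0)]

print(is_valid_n_tree(three_tree, 3))
print(is_valid_n_tree(degree_two, 2))
```

The degree condition is what excludes the second example: a degree-2 internal vertex could be removed by merging its two edges.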
Same Tree
(figure: two drawings of the same tree, joined by "=")
Different Trees
(figure: comparison of two different trees)
Valid Tree
Folded Euclidean Tree Space
Introduced by Billera, Holmes, and Vogtmann (2001), "Geometry of the Space of Phylogenetic Trees"
Points are n-trees
Path = deformation between trees
Globally nonpositively curved (NPC) space
Positive Orthants in R^3
(figure: the nonnegative x, y, z orthant)
Folded Euclidean Tree Space
Edge $i$ is present if $e_i > 0$ and absent if $e_i = 0$
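The present/absent rule can be illustrated in a few lines. This is a sketch under the assumption that a tree is stored as a dictionary from edge names to lengths. The edge names "12" and "13" are hypothetical labels for two of the three possible interior edges of a tree with leaves {0, 1, 2, 3}, each naming the pair of leaves the edge separates from the root 0.

```python
def present_edges(tree):
    """Return the edges with positive length; they determine the tree's orthant."""
    return frozenset(edge for edge, length in tree.items() if length > 0)


tree_a = {"12": 2.0}  # interior of the orthant for edge "12"
tree_b = {"12": 0.0}  # edge "12" shrunk to length zero
tree_c = {"13": 0.0}  # edge "13" shrunk to length zero

print(present_edges(tree_a))
# tree_b and tree_c describe the same star tree, reached from different orthants.
print(present_edges(tree_b) == present_edges(tree_c))
```

Because a zero-length edge is absent, boundary points of different orthants can represent the same tree, which is where the orthants are joined.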
Geodesic = shortest path between two points
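For the smallest nontrivial case the geodesic distance has a closed form. The sketch below assumes trees with leaves {0, 1, 2, 3} and ignores the lengths of edges ending at leaves, so that each tree is a point on one of three rays glued at the origin (the star tree). The `(ray, length)` encoding and the name `spider_distance` are assumptions made for illustration. Larger trees need a real algorithm such as the one by Owen and Provan listed in the related papers.

```python
def spider_distance(p, q):
    """Geodesic distance between p = (ray, length) and q = (ray, length).

    Same ray: the path stays in one orthant, so the distance is |a - b|.
    Different rays: the shortest path passes through the origin, giving a + b.
    """
    (ray_p, len_p), (ray_q, len_q) = p, q
    if ray_p == ray_q:
        return abs(len_p - len_q)
    return len_p + len_q


print(spider_distance(("12", 3.0), ("12", 1.0)))  # same topology
print(spider_distance(("12", 3.0), ("13", 4.0)))  # different topologies
```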
Global NPC -> the geodesic between any two trees is unique
Nonpositive curvature -> triangles are skinnier than in Euclidean space
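One standard way to make "skinnier triangles" precise is the CAT(0) comparison inequality, which is not stated on the slides and is given here only as background. For any points $x, y, z$, with $m$ the midpoint of the geodesic from $y$ to $z$:

```latex
d(x, m)^2 \;\le\; \tfrac{1}{2}\, d(x, y)^2 + \tfrac{1}{2}\, d(x, z)^2 - \tfrac{1}{4}\, d(y, z)^2
```

In Euclidean space this holds with equality, so in a nonpositively curved space the midpoint of one side is at least as close to the opposite vertex as it would be in the flat comparison triangle.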
Only trees with leaf set {0, 1, 2, 3, 4}
Fréchet Mean
$$\bar{x} = \operatorname*{arg\,min}_{\theta} \sum_{i} w_i \, d(\theta, x_i)^2$$
$d(x, y)$ = geodesic distance
$w_i$ = weight of observation $i$
Folded Euclidean tree space is a nonpositive curvature space -> the Fréchet mean is unique
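To make the weighted sum-of-squares objective concrete, here is a brute-force sketch on the smallest tree space: trees with leaves {0, 1, 2, 3}, lengths of edges ending at leaves ignored, so each tree is a `(ray, length)` point on one of three rays glued at the origin. The grid search, the function names, and the sample points are all illustrative assumptions. This is not the method used in the talk, and it does not scale beyond toy cases.

```python
def spider_distance(p, q):
    """Geodesic distance on three rays glued at the origin."""
    (ray_p, len_p), (ray_q, len_q) = p, q
    if ray_p == ray_q:
        return abs(len_p - len_q)
    return len_p + len_q


def frechet_objective(theta, points, weights):
    """Weighted sum of squared geodesic distances from theta to the data."""
    return sum(w * spider_distance(theta, x) ** 2 for x, w in zip(points, weights))


def frechet_mean_grid(points, weights, rays, step=0.01):
    """Approximate the Fréchet mean by evaluating the objective on a grid per ray."""
    max_len = max(length for _, length in points)
    n_steps = int(round(max_len / step))
    candidates = [(ray, k * step) for ray in rays for k in range(n_steps + 1)]
    return min(candidates, key=lambda theta: frechet_objective(theta, points, weights))


rays = ["12", "13", "23"]  # hypothetical names for the three topologies

# Two trees with the same topology: the mean is the usual average of lengths.
same_ray = [("12", 1.0), ("12", 3.0)]
print(frechet_mean_grid(same_ray, [0.5, 0.5], rays))

# One tree per topology, equal lengths and weights: the mean is the star tree.
spread = [("12", 3.0), ("13", 3.0), ("23", 3.0)]
print(frechet_mean_grid(spread, [1 / 3, 1 / 3, 1 / 3], rays))
```

The second example shows a feature of averaging in this space: when the data disagree on topology, the mean can sit at the origin, where no interior edge is present.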
Labeled Trees
n-tree
Leaves: fixed a priori, with labels {0, 1, ..., n}
Internal (non-leaf) vertices: degree ≥ 3
Edge e has length |e|
(figure: a 5-tree with root and edge lengths |e1| = 3, |e2| = 4, |e3| = 10)
Consensus Trees
Fréchet mean for a set of phylogenetic trees
Medical Application:
Blood Vessel Tree Data
Cerebral Cortex/Landmarks
Landmarks: locations on the cerebral cortex, comparable across the data set
Landmark/Vessel Tree
Kernel Smoothing
‘running weighted means’
(figure: a sequence of trees arranged along an AGE axis)
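A running weighted mean needs one set of weights per target age. The sketch below computes normalized kernel weights that could serve as the $w_i$ in a weighted Fréchet mean. The Gaussian kernel, the `bandwidth` parameter, the function name, and the ages are assumptions for illustration, since the slides do not say which kernel or bandwidth was used.

```python
import math


def kernel_weights(ages, target_age, bandwidth):
    """Normalized Gaussian kernel weights for a running weighted mean at target_age."""
    raw = [math.exp(-0.5 * ((age - target_age) / bandwidth) ** 2) for age in ages]
    total = sum(raw)
    return [r / total for r in raw]


ages = [20.0, 30.0, 40.0, 50.0, 60.0]  # made-up subject ages

# Sliding the target age moves the weight from younger to older subjects.
for target in (25.0, 40.0, 55.0):
    weights = kernel_weights(ages, target, bandwidth=10.0)
    print(target, [round(w, 3) for w in weights])
```

Evaluating the weighted mean tree at each target age, with these weights attached to the subjects' trees, gives a smoothed curve of trees over age.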
Open Problem
Principal component analysis in Folded Euclidean tree space
Approach: formulate an appropriate optimization problem
Not clear what a 'component' should be
Questions, comments, related papers
Billera, Holmes, Vogtmann. Geometry of the Space of Phylogenetic Trees.
Owen, Provan. A Fast Algorithm for Computing Geodesic Distances in Tree Space.
Sturm. Probability Measures on Metric Spaces of Nonpositive Curvature.
PEOPLE
• J.S. Marron
• Scott Provan
• Sean Skwerer
• Megan Owen
• Ezra Miller
• Martin Styner
• Ipek Oguz