Bio 208 Morphometric Analysis HS11
Ch. Zollikofer, M. Ponce de León
aims
•  variability of complex organismic forms
–  quantify: 2D, 3D, 4D data
–  analyze
–  visualize
–  interpret
[Figure: Homo sapiens, Pan troglodytes, Gorilla gorilla]
course schedule december 2011
01 – 08: theory and practical exercises
08 – 09: project planning
09 – 16: own projects
20 – 21: preparation of presentation
22: final presentation
HW / SW / data / exercises
•  hardware:
–  Macs and PCs
–  2D digitizer (graphics tablet)
–  3D digitizer
–  3D laser and light scanners
•  software:
–  Mac: JMP, Parallels Desktop (for Windows on Mac)
–  Windows: JMP, morphometrics software, 3D digitizing software
•  web sources:
–  http://www.aim.uzh.ch/morpho/wiki/Teaching/MorphoAnalyse
–  pwd: bio208
data acquisition, measuring, and statistics
the logics of statistics
pattern versus process
pattern(s) of variability:
•  species morphotype (process I: speciation)
•  sexual dimorphism (process II: sexual differentiation)
•  allometry (process III: growth)
pattern versus process
[Diagram labels: statistical models/hypotheses; empirical statistics; inference; process (not observable!); pattern; biological hypotheses; observations, measurements (morphometry)]
the logics of statistical analyses in morphometrics
[Diagram labels: processes; patterns; measurements: pattern info; biological and statistical hypotheses]
variables, spaces, distributions
statistical spaces
example of a discontinuous random variable: face number d of a die
•  the space of all possible outcomes (d): 1, 2, 3, 4, 5, 6
•  the space of all possible events; examples of events:
–  d > 3: {4, 5, 6}
–  d is an even number: {2, 4, 6}
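The outcome space and the two example events can be written down directly as sets. This is a minimal sketch; it assumes a fair die (equal probability for every face), which the slide does not state, and the helper name `probability` is only illustrative.

```python
from fractions import Fraction

# Space of all possible outcomes for one throw of a six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}

# Events are subsets of the outcome space.
event_gt3 = {d for d in outcomes if d > 3}        # d > 3
event_even = {d for d in outcomes if d % 2 == 0}  # d is an even number


def probability(event):
    """Probability of an event, ASSUMING all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


print(sorted(event_gt3), probability(event_gt3))
print(sorted(event_even), probability(event_even))
```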
statistical spaces in morphometry
example of a continuous random variable: body mass w
•  the space of all possible outcomes: 0 < w < ∞
•  the space of all possible events; examples of events:
–  w > 1500 g
–  1500 g ≤ w ≤ 2000 g
[Figure credit: F. Botero]
empirical example
•  measurement variable: x
•  number of observations: N
•  $x_1, x_2, \dots, x_i, \dots, x_N$
[Figure: observations x against index i]
distribution functions
[Figure: discrete absolute frequency distribution (n against x) and cumulative absolute frequency distribution (i against x, rising to N)]
normalization
[Figure: discrete relative frequency distribution (n/N against x) and discrete cumulative relative frequency distribution (i/N against x, rising to 1)]
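The step from absolute to relative frequencies can be illustrated with a small sketch. The sample values are made up for illustration; the variable names follow the slide labels (n, i, N).

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample of a discrete measurement variable x (made-up values).
x = [2, 3, 3, 4, 4, 4, 5, 5, 6]
N = len(x)

counts = Counter(x)
values = sorted(counts)

abs_freq = [counts[v] for v in values]         # n: absolute frequency per value
cum_abs_freq = list(accumulate(abs_freq))      # i: cumulative absolute frequency, ends at N
rel_freq = [n / N for n in abs_freq]           # n/N: relative frequency
cum_rel_freq = [i / N for i in cum_abs_freq]   # i/N: cumulative relative frequency, ends at 1

for v, n, i in zip(values, abs_freq, cum_abs_freq):
    print(v, n, i, n / N, i / N)
```

Dividing by N is what makes samples of different size comparable.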
from empirical data to model data: N → ∞
[Figure: probability density distribution (ρ against x) and integral of the probability density distribution (i/N against x, rising to 1)]
statistical models I
basic statistical models I: normal (random) distribution
$$x = \mu + \varepsilon$$
•  x: an empirical measurement
•  µ: "true" value (mean)
•  ε: random deviation (error term)
normal (Gaussian) distribution
model probability density (ρ) distribution (= density function) of a Gaussian random variable x:
$$\rho(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(\mu - x)^2}{2\sigma^2}}$$
[Figure: bell-shaped curve of ρ against x from −∞ to +∞, centered at µ]
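normal distribution
probability p of x1 ≤ x ≤ x2: integral of ρ(x) from x1 to x2
$$p(x_1 \le x \le x_2) = \int_{x_1}^{x_2} \rho(x)\,dx, \qquad \int_{-\infty}^{+\infty} \rho(x)\,dx = 1$$
$$x = \mu + \varepsilon$$
[Figure: ρ against x; area under the curve between x1 and x2, with µ marked]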
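As a hedged illustration of the integral, the sketch below evaluates p for the body-mass event 1500 g ≤ w ≤ 2000 g. The parameters µ = 1750 g and σ = 250 g are assumptions chosen for the example, not values from the course. The closed form uses the error function; the midpoint-rule sum is only a cross-check.

```python
import math


def density(x, mu, sigma):
    """Gaussian probability density rho(x)."""
    return math.exp(-((mu - x) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def cumulative(x, mu, sigma):
    """Integral of rho from -infinity to x, expressed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))


def prob_between(x1, x2, mu, sigma):
    """p(x1 <= x <= x2) as a difference of cumulative values."""
    return cumulative(x2, mu, sigma) - cumulative(x1, mu, sigma)


def prob_between_numeric(x1, x2, mu, sigma, steps=10_000):
    """Same probability by midpoint-rule integration of the density."""
    h = (x2 - x1) / steps
    return h * sum(density(x1 + (k + 0.5) * h, mu, sigma) for k in range(steps))


# ASSUMED model parameters (illustrative only).
mu, sigma = 1750.0, 250.0
p = prob_between(1500.0, 2000.0, mu, sigma)
print(p, prob_between_numeric(1500.0, 2000.0, mu, sigma))
```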
normal distribution: model parameters and empirical estimates
model: expectation
empirical data: observation (measurement) of a sample, giving parameter estimates
•  µ: mean; estimate: arithmetic mean (1st moment)
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{1}{N}\sum_{i=1}^{N} (x_i - 0)^1$$
•  σ²: variance; estimate s² (2nd moment)
$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
•  σ: standard deviation; estimate: s
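The two estimates can be computed in a few lines. The sample values are made up; the results are compared with Python's `statistics` module, whose `variance` function also divides by N − 1.

```python
import statistics

# Hypothetical sample of N measurements (made-up values).
x = [4.1, 3.8, 4.4, 4.0, 3.7]
N = len(x)

x_mean = sum(x) / N                                   # arithmetic mean (estimate of mu)
s2 = sum((xi - x_mean) ** 2 for xi in x) / (N - 1)    # variance estimate s^2
s = s2 ** 0.5                                         # standard deviation estimate s

print(x_mean, s2, s)
print(statistics.mean(x), statistics.variance(x), statistics.stdev(x))
```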
statistical models II
basic statistical models II: analysis of variance
measuring random variable x_ij for individual i and process j:
$$x_{ij} = \mu + \tau_j + \varepsilon_{ij}$$
•  µ: overall mean value
•  τ_j: effects of process(es) j
•  ε_ij: error term
Null hypothesis: τ = 0
test: probability that τ ≠ 0?
statistical testing
measure variability with sums of squares (SS)
[Figure: measurements in groups A, B, C with group means and grand mean; SS_explained and SS_error indicated]
statistical testing
estimate parameters from empirical data x_ij (i: group count, j: within-group individual count)
grand mean:
$$\bar{x} = \frac{1}{N}\sum_{i,j} x_{ij}$$
group means:
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$$
statistical testing
measure variability with sums of squares (SS):
$$SS_{total} = SS_{explained} + SS_{error}$$
deviation of measurements from grand mean:
$$SS_{total} = \sum_{i,j} (x_{ij} - \bar{x})^2$$
deviation of group means from grand mean:
$$SS_{explained} = \sum_{i} n_i (\bar{x}_i - \bar{x})^2$$
deviation of measurements from group means:
$$SS_{error} = \sum_{i} \left( \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \right)$$
statistical testing
F-test: ratio between mean square (explained) and mean square (error)
$$F = \frac{MS_{explained}}{MS_{error}} = \frac{SS_{explained}/DF_{explained}}{SS_{error}/DF_{error}}$$
$$DF_{explained} = k - 1, \qquad DF_{error} = N - k, \qquad DF_{total} = N - 1$$
k: number of groups; N: total number of measurements
F_crit: critical value for null hypothesis (no model effect, only random fluctuations)
DF: degrees of freedom
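The sums of squares and the F ratio can be followed on a small example. The three groups A, B, C and their values are made up for illustration. The sketch stops at F: comparing it with F_crit needs an F-distribution table or a statistics package, which is not shown here.

```python
# Hypothetical measurements in three groups (made-up values).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [9.0, 10.0, 11.0],
}

all_values = [x for values in groups.values() for x in values]
N = len(all_values)   # total number of measurements
k = len(groups)       # number of groups

grand_mean = sum(all_values) / N
group_means = {name: sum(values) / len(values) for name, values in groups.items()}

# Deviation of measurements from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)
# Deviation of group means from the grand mean, weighted by group size n_i.
ss_explained = sum(len(values) * (group_means[name] - grand_mean) ** 2
                   for name, values in groups.items())
# Deviation of measurements from their own group mean.
ss_error = sum((x - group_means[name]) ** 2
               for name, values in groups.items() for x in values)

df_explained, df_error = k - 1, N - k
F = (ss_explained / df_explained) / (ss_error / df_error)

print(ss_total, ss_explained, ss_error, F)
```

For these values SS_explained = 38, SS_error = 6 and SS_total = 44, so F = (38/2)/(6/6) = 19.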
statistical models III
basic statistical models III: bivariate analysis
measuring random variables x and y for individual i
y = f(x) + error term
e.g.: $y_i = \alpha + \beta x_i + \varepsilon_i$
probability that α, β ≠ 0?
→ regression analysis
regression model:
$$y = ax + b + \varepsilon$$
minimize: $(y_{residual})^2 = (y_{measured} - y_{predicted})^2$
regression coefficient:
$$a = \frac{ss_{xy}}{ss_{xx}} = \frac{\mathrm{cov}(xy)}{\mathrm{var}(x)}$$
correlation
coefficient of determination: proportion of variance "explained" by regression
$$r^2 = \frac{ss_{xy}^2}{ss_{xx}\, ss_{yy}}$$
coefficient of correlation: "scaled" regression coefficient
$$r = \frac{\mathrm{cov}(xy)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}$$
regression model: partitioning the sum of squares
model: $y = ax + b + \varepsilon$
$$ss_{total} = \sum_i (y_i - \bar{y})^2 = ss_{regression} + ss_{error}$$
$$ss_{error} = \sum_i (y_i - y_{i,predicted})^2$$
$$ss_{regression} = \sum_i (y_{i,predicted} - \bar{y})^2 = a^2 \sum_i (x_i - \bar{x})^2$$
[Figure: y against x with regression line; predicted value and error marked]
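A short sketch of the regression quantities on made-up data. The intercept formula b = ȳ − a·x̄ is the standard least-squares result and is added here as an assumption, since the slides give only the slope a.

```python
# Hypothetical paired measurements (made-up values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
N = len(x)

x_mean = sum(x) / N
y_mean = sum(y) / N

ss_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
ss_xx = sum((xi - x_mean) ** 2 for xi in x)
ss_yy = sum((yi - y_mean) ** 2 for yi in y)

a = ss_xy / ss_xx          # regression coefficient (slope)
b = y_mean - a * x_mean    # intercept (standard least-squares result, not on the slide)
y_pred = [a * xi + b for xi in x]

ss_total = ss_yy
ss_error = sum((yi - ypi) ** 2 for yi, ypi in zip(y, y_pred))
ss_regression = sum((ypi - y_mean) ** 2 for ypi in y_pred)

r2 = ss_xy ** 2 / (ss_xx * ss_yy)   # coefficient of determination

print(a, b, r2)
print(ss_total, ss_regression + ss_error)
```

With these definitions r² equals ss_regression / ss_total, which is the sense in which it measures the proportion of variance "explained" by the regression.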
MVA
2. multivariate analysis
example of a measurement protocol
           body mass   stature   sex   ...
indiv 1
indiv 2
indiv 3
...
multivariate analysis
•  N objects (cases, specimens, ...)
•  P variables (features, ...)
•  → X = (N × P) matrix of measurements
$$X = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1P} \\ \dots & \dots & \dots & \dots \\ x_{i1} & x_{i2} & x_{ij} & x_{iP} \\ \dots & \dots & \dots & \dots \\ x_{N1} & \dots & \dots & x_{NP} \end{pmatrix}$$
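covariance, variance and standard deviation
ss: sums of square deviations from mean
var: variance
cov: covariance
s: standard deviation
$$ss_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y})$$
$$ss_{xx} = \sum_i (x_i - \bar{x})(x_i - \bar{x})$$
$$ss_{yy} = \sum_i (y_i - \bar{y})(y_i - \bar{y})$$
$$\mathrm{cov}(x, y) = \frac{1}{N}\, ss_{xy}$$
$$\mathrm{var}(x) = \frac{1}{N}\, ss_{xx} = s_x^2$$
$$\mathrm{var}(y) = \frac{1}{N}\, ss_{yy} = s_y^2$$
$$\mathrm{var}(x + y) = \mathrm{var}(x) + \mathrm{var}(y) + 2\,\mathrm{cov}(x, y)$$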
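The rule for the variance of a sum, var(x + y) = var(x) + var(y) + 2 cov(x, y), can be checked numerically. This sketch uses the 1/N normalization given on this slide and made-up data; the helper names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def cov(u, v):
    """Covariance with the 1/N normalization: cov(u, v) = ss_uv / N."""
    u_mean, v_mean = mean(u), mean(v)
    return sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v)) / len(u)


def var(u):
    """Variance is the covariance of a variable with itself."""
    return cov(u, u)


# Hypothetical measurements of two variables on the same four objects.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 6.0]
x_plus_y = [xi + yi for xi, yi in zip(x, y)]

lhs = var(x_plus_y)
rhs = var(x) + var(y) + 2 * cov(x, y)
print(lhs, rhs)
```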
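two perspectives
N objects, P variables
$$X = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1P} \\ x_{21} & x_{22} & \dots & x_{2P} \\ \dots & \dots & x_{ij} & x_{iP} \\ x_{N1} & \dots & x_{Nj} & x_{NP} \end{pmatrix}$$
•  rows: subj 1, subj 2, ..., subj i, ..., subj N; each row is a P-dimensional vector of variables
•  columns: var 1, var 2, ..., var j, ..., var P; each column is an N-dimensional vector of subjects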
two types of spaces
•  variable space: space axes are variables (variable X1, variable X2, variable X3)
•  subject space: space axes are subjects (subject 1, subject 2, subject 3)
[Figure: subject 1, subject 2, subject 3 plotted in variable space; variable X1, X2, X3 plotted in subject space]
two types of spaces: example
•  variable space (feature space): space axes are variables (age, body mass, body height)
•  subject space (specimen space): space axes are subjects (Anna, Berta, Clara)
[Figure: Anna, Berta, Clara plotted in variable space; age, body mass, body height plotted in subject space]
multivariate analysis: pattern extraction
aims of MVA:
•  define "natural" system of reference
•  find trends (covariation)
•  reduce number of variables to a few significant ones
•  discriminate between groups
[Figure label: variable space]
Principal Component Analysis (PCA)
•  orthogonal basis vectors (= eigenvectors) of data matrix X:
–  statistically independent variables
–  capture significant patterns of correlation in the sample
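subject space
geometry: angle α between variable a and variable b:
$$\cos\alpha = \frac{a \cdot b}{\sqrt{a^2\, b^2}}$$
(scalar product of two vectors)
statistics: correlation r:
$$r = \frac{s_{ab}}{\sqrt{s_{aa}\, s_{bb}}} = \frac{\mathrm{cov}(ab)}{\sqrt{\mathrm{var}(a)\,\mathrm{var}(b)}}$$
[Figure: Variable a and Variable b as vectors in the space spanned by subject 1, subject 2, subject 3]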
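The link between the angle and the correlation can be checked numerically. The sketch assumes the variable vectors are mean-centered before the scalar product is taken (the usual convention, not spelled out on the slide), and uses made-up values for three subjects.

```python
import math


def centered(values):
    """Subtract the mean so the vector describes deviations from the mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical measurements of two variables on three subjects.
a = [170.0, 182.0, 165.0]   # e.g. body height
b = [65.0, 80.0, 58.0]      # e.g. body mass
N = len(a)

a_c, b_c = centered(a), centered(b)

# Geometry: cosine of the angle between the centered variable vectors.
cos_alpha = dot(a_c, b_c) / math.sqrt(dot(a_c, a_c) * dot(b_c, b_c))

# Statistics: correlation from covariance and variances (1/N normalization).
cov_ab = dot(a_c, b_c) / N
var_a = dot(a_c, a_c) / N
var_b = dot(b_c, b_c) / N
r = cov_ab / math.sqrt(var_a * var_b)

print(cos_alpha, r)
```

The factors 1/N cancel in r, which is why the two expressions agree.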
how to find orthogonal basis vectors
$$V E = E \Lambda$$
•  V: data matrix
•  E: eigenvector matrix
•  Λ: eigenvalues
→ eigenvectors are orthonormal to each other (statistically independent of each other)
→ eigenvectors are the invariants of the data set
3. Geometric Morphometrics
size, shape and shape space
GM intro
physical space and feature space
[Figure: measurements d1, d2, d3, d4 shown in physical Euclidean space (3D) and as axes of a multivariate Euclidean space (n-D)]
loss of geometry
[Figure: measurements d1, d2, d3, d4 in physical space and in feature space; the way back is marked "?"]
loss of geometry
•  different specimens are at the same position in feature space
•  no way back from feature space to physical space
aims of geometric morphometrics
•  explicit inclusion of physical geometric properties into multivariate analysis
•  direct correspondence between biological homology and geometric homology
•  methods to visualize homology transformations
size and shape
•  form = size (extension) [Grösse] and shape (geometry) [Gestalt]
•  size: a scalar (number)
•  shape: a multidimensional vector
form = size + shape
[Figure labels: Form; Size; Shape; scaling; rotation; translation; reference shape; shape space]
measuring size: Centroid Size S
$$S = \sqrt{\sum_i \lVert x_i - x_{Centroid} \rVert^2}$$
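Centroid size is commonly defined as the square root of the summed squared distances of all landmarks from their centroid. A minimal sketch under that definition, with four made-up 2D landmarks at the corners of a 4 × 3 rectangle:

```python
import math

# Hypothetical 2D landmark configuration (corners of a 4 x 3 rectangle).
landmarks = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def centroid(points):
    """Mean position of all landmarks."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def centroid_size(points):
    """Square root of the summed squared landmark distances from the centroid."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points))


print(centroid(landmarks), centroid_size(landmarks))
```

Each corner lies 2.5 units from the centroid (2, 1.5), so S = sqrt(4 × 6.25) = 5.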
measuring shape
1.  deviation of an object from a reference shape in "linearized Procrustes space"
2.  spatial deformation of a reference shape: Thin Plate Spline interpolation function
[Figure label: shape space]
GM example
example: calculating size and shape
[Figure: cave art from Niaux]
separation of size and shape
mirroring (scaling with factor −1)
normalization to size = 1
rotation
superposition: generalized least-squares fitting (GLS)
reference shape: consensus
shape = deviation from consensus
LM1: x1 y1
LM2: x2 y2
LM3: x3 y3
LM4: x4 y4
LM5: x5 y5
→ classical multivariate analysis
[Figure: landmarks LM1 to LM5]
aim: multivariate analysis in Shape Space
[Figure: shape component 2 against shape component 1, origin at (0.0, 0.0); f(age, size)]
GM theory
superposition
•  calculate consensus
•  translate, rotate all objects until squared deviations from consensus are minimized
•  → generalized least-squares fitting (GLS)
•  → cf. multidimensional regression
generalized least-squares analysis (GLA) (Procrustes Superposition)
•  scale all specimens to S = 1
•  translate centroids of all specimens to coordinate origin
•  take specimen 1 as provisional consensus
•  rotate all specimens to consensus (least-squares fit: D minimal)
•  mean of all specimens = new consensus
•  repeat until D reaches minimum
$$D = \sum_i \lVert x_i - \bar{x} \rVert^2 \qquad (\bar{x}\text{: consensus})$$
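The iteration listed above can be sketched for 2D landmarks, where each landmark is stored as a complex number so that a rotation is a multiplication by a unit complex number. This is an illustrative sketch, not the course software: the function names, the tolerance and the example triangles are assumptions, and mirroring (reflection) is not handled.

```python
import cmath
import math


def center_and_scale(shape):
    """Translate the centroid to the origin and scale to centroid size S = 1."""
    c = sum(shape) / len(shape)
    centered = [z - c for z in shape]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def rotate_to(shape, reference):
    """Rotate shape (no scaling) so that D = sum |z_i - ref_i|^2 is minimal."""
    # The optimal rotation is the unit complex number with the argument of
    # sum(ref_i * conj(z_i)); assumes this sum is not zero.
    inner = sum(r * z.conjugate() for z, r in zip(shape, reference))
    return [z * (inner / abs(inner)) for z in shape]


def deviation(shape, reference):
    """Sum of squared landmark deviations D."""
    return sum(abs(z - r) ** 2 for z, r in zip(shape, reference))


def gls_superposition(shapes, tol=1e-12, max_iter=100):
    aligned = [center_and_scale(s) for s in shapes]
    consensus = aligned[0]                      # specimen 1 as provisional consensus
    previous = float("inf")
    for _ in range(max_iter):
        aligned = [rotate_to(s, consensus) for s in aligned]
        consensus = [sum(col) / len(aligned) for col in zip(*aligned)]  # mean = new consensus
        total = sum(deviation(s, consensus) for s in aligned)
        if previous - total < tol:              # D no longer decreases
            break
        previous = total
    return aligned, consensus


# Hypothetical example: a triangle and a scaled, rotated, translated copy of it.
base = [0 + 0j, 4 + 0j, 0 + 3j]
copy = [z * cmath.rect(2.5, 0.7) + (10 - 5j) for z in base]
aligned, consensus = gls_superposition([base, copy])
print(aligned[0])
print(aligned[1])
```

Because the second triangle differs from the first only in size, orientation and position, both end up at the same coordinates after superposition.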
issues of shape analysis
•  geometric properties of shape space?
•  statistical properties of shape space?
•  centroid size and shape space: biological significance?
geometry: plane and sphere
example: shape space of all possible triangles
•  triangular forms: coordinates x1, y1, x2, y2, x3, y3: 6 degrees of freedom (DF) → 6 dimensions
•  removed: size (1 DF), translation (2 DF), rotation (1 DF)
•  triangular shapes: 2 DF → 2 dimensions
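The same bookkeeping generalizes to K landmarks in m physical dimensions: subtract the degrees of freedom of translation (m), rotation (m(m − 1)/2) and size (1) from the K·m coordinates. A small sketch; the function name is illustrative.

```python
def shape_space_dimensions(k, m):
    """Degrees of freedom of shape for k landmarks in m physical dimensions."""
    form_df = k * m                    # all landmark coordinates
    translation_df = m                 # one shift per axis
    rotation_df = m * (m - 1) // 2     # 1 in 2D, 3 in 3D
    size_df = 1                        # overall scale
    return form_df - translation_df - rotation_df - size_df


print(shape_space_dimensions(3, 2))    # triangles in 2D: 6 - 2 - 1 - 1
print(shape_space_dimensions(10, 3))   # 3D case: 3K - 7 with K = 10
```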
geometry of the shape space of triangles
•  corresponds to unit sphere
•  distance metric: great-circle segment (angle)
•  non-linear space!
[Figure labels on the sphere: equiangular triangle; isosceles; isosceles; flat; inverted equiangular triangle]
Kendall, D. G. (1989). A survey of the statistical theory of shape. Statistical Science 4, 87-120.
Linearized Procrustes Shape Space
[Figure labels: consensus shape; specimen shape]
shape: linearized deviation from local reference point (consensus)
•  sphere (non-linear geographic space): reference point, linearized map
•  n-dimensional hypersphere (non-linear shape space): reference point (consensus), specimen, linearized Procrustes space
linearization
•  works well for small distances in shape space
•  rule for biological samples: linearization works well for largely similar objects
linearization of shape space ???
[Figure labels: giraffe; aurochs; Consensus]
superposition: which criteria?
•  criteria of superposition determine:
–  the shape of the consensus
–  the linearized system of coordinates of shape space
[Figure labels: biologically optimized; statistically optimized (GLS)]
GM analysis
data analysis in shape space
•  classical multivariate analysis in linearized shape space:
–  PCA of "Procrustes residuals" (deviation of specimens from consensus, landmark coordinate by landmark coordinate) → reduction of dimensionality
–  back-projection from PCA space to physical space → visualization of results in terms of patterns of shape variation
visualization
[Figure: juvenile to adult]
geometric morphometrics
[Figure: physical space (3D) → GLS → shape space (nD; n = 3K − 7, K = number of landmarks) → PCA → PC space (PC1, PC2, PC3): reduction of dimensions]
PCA in linearized shape space
covariance matrix of data matrix V:
$$S_v = \frac{1}{n}\sum_{i=1}^{n} (v_i - v_c)(v_i - v_c)^T; \qquad v_c\text{: consensus}$$
Singular Value Decomposition of S_v:
$$S_v = E \Lambda E^T$$
Λ: diagonal matrix of eigenvalues
E: matrix of eigenvectors
PCA scores of specimens:
$$P = E^T V'; \qquad V' = (v_i - v_c)$$
virtual specimen v* from PCA score p*:
$$v^* = v_c + E\, p^*$$
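The chain from covariance matrix to scores and back to a virtual specimen can be traced on a toy case. The sketch below uses made-up specimen vectors with only two components, so that the eigen-decomposition of the symmetric 2 × 2 matrix S_v has a closed form; real Procrustes residuals have many more components and would need a numerical eigen-solver. The closed form assumes a non-zero off-diagonal element.

```python
import math

# Hypothetical data: n = 4 specimens, each a 2-component vector v_i.
V = [(2.0, 1.0), (4.0, 3.0), (6.0, 5.0), (8.0, 9.0)]
n = len(V)

v_c = tuple(sum(col) / n for col in zip(*V))              # consensus (mean vector)
V_prime = [(v[0] - v_c[0], v[1] - v_c[1]) for v in V]     # deviations v_i - v_c

# Covariance matrix S_v = (1/n) * sum (v_i - v_c)(v_i - v_c)^T, symmetric 2 x 2.
s11 = sum(d[0] * d[0] for d in V_prime) / n
s12 = sum(d[0] * d[1] for d in V_prime) / n
s22 = sum(d[1] * d[1] for d in V_prime) / n

# Closed-form eigenvalues of a symmetric 2 x 2 matrix, largest first.
half_trace = (s11 + s22) / 2
radius = math.hypot((s11 - s22) / 2, s12)
eigenvalues = (half_trace + radius, half_trace - radius)


def unit_eigenvector(lam):
    """(s12, lam - s11) solves (S_v - lam*I) e = 0 when s12 != 0 (assumed here)."""
    ex, ey = s12, lam - s11
    norm = math.hypot(ex, ey)
    return (ex / norm, ey / norm)


E = [unit_eigenvector(lam) for lam in eigenvalues]        # eigenvectors = columns of E

# PCA scores P = E^T V': projection of each deviation onto each eigenvector.
scores = [tuple(e[0] * d[0] + e[1] * d[1] for e in E) for d in V_prime]


def virtual_specimen(p):
    """Back-projection v* = v_c + E p* from a score vector p*."""
    return tuple(v_c[j] + sum(pk * e[j] for pk, e in zip(p, E)) for j in range(2))


print(eigenvalues)
print(virtual_specimen(scores[0]), V[0])
```

Back-projecting the scores of a real specimen returns that specimen; feeding in other score values (for example only the first component) yields a virtual specimen along that pattern of variation.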