L13 Primer workshop by Colin Bates

Multivariate analysis of community structure data
Colin Bates
UBC
Bamfield Marine Sciences Centre
[Figure: samples labelled by sampling date (May 2000 to Oct 2001) plotted against a Similarity axis running from 20 to 100]
Goals
1) To understand the ideas behind multivariate community structure analysis.
2) To understand how to perform these analyses in PRIMER.
3) To be prepared to analyse and interpret your class data later today.
What are multivariate statistics?
Statistics that allow us to look at how multiple variables change together:
e.g. How do 50 species in a community react to an environmental perturbation?
50 ANOVAs? No…
Multivariate stats allow us to “condense” information for simplicity
When might I use this type of analysis?
For a multi-species community, you may wish to:
- pull order from complex systems
- visualize these patterns
- make comparisons over time and space
- test hypotheses
The vehicle:
Example: Seaweed Communities at Cape Beale
- Is the flora different at two close sites, each exposed to different wave intensity?
Data collection:
2. Data Analysis
Step 1: Entering your data into PRIMER
How to analyze this type of data?
1. Diversity indices
Yet, most diversity indices do not consider species identity…
Multivariate community structure analyses
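The point about species identity can be illustrated with a small sketch. It uses the Shannon index as one example of a diversity index; the species names and abundances are hypothetical and chosen only for illustration.

```python
import math

def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over species with non-zero abundance."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical abundances: the two sites share no species at all
site_1 = {"species_A": 10, "species_B": 30, "species_C": 60}
site_2 = {"species_X": 60, "species_Y": 10, "species_Z": 30}

h1 = shannon(site_1.values())
h2 = shannon(site_2.values())
shared = set(site_1) & set(site_2)

# The two index values agree even though no species is shared
print(round(h1, 3), round(h2, 3), "shared species:", len(shared))
```

Both sites get the same index value because only the relative abundances enter the formula, not which species they belong to.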
Analysis flow
[Diagram: samples x species data -> sample similarities -> ordination of groups a, b and c -> are sites different? -> How?]
Calculate Bray-Curtis Similarity
-> gives a triangular similarity matrix
[Figure: triangular similarity matrix with within-site and between-site comparisons marked]
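A minimal sketch of this step, assuming raw abundance counts and the usual Bray-Curtis formula, similarity = 100 * (1 - sum|x - y| / sum(x + y)). The sample names and abundances are hypothetical, and this is not PRIMER's own code; any transformation applied before the calculation is left out.

```python
def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity (0 to 100) between two samples of species abundances."""
    diff = sum(abs(a - b) for a, b in zip(x, y))
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        raise ValueError("both samples are empty")
    return 100.0 * (1.0 - diff / total)

def similarity_matrix(samples):
    """Lower triangle only: each sample maps to its similarities with the samples before it."""
    names = list(samples)
    matrix = {}
    for i, name in enumerate(names):
        matrix[name] = {
            other: bray_curtis_similarity(samples[name], samples[other])
            for other in names[:i]
        }
    return matrix

# Hypothetical abundances of four species in four samples
samples = {
    "exposed_1": [10, 0, 5, 1],
    "exposed_2": [8, 1, 6, 0],
    "sheltered_1": [0, 12, 1, 7],
    "sheltered_2": [1, 10, 0, 9],
}

tri = similarity_matrix(samples)
for name, row in tri.items():
    print(name, {other: round(value, 1) for other, value in row.items()})
```

Only the lower triangle is stored because the similarity of sample i to sample j equals that of j to i.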
Visualizing similarities
Ordination “maps” similarity relationships between samples
[Figure: ordination plot with samples from groups a, b and c]
nMDS ordination example
Distance between points reflects relative similarity!
Nonmetric multidimensional scaling (nMDS)
“the future of ordination is in nonmetric multidimensional scaling” – McCune & Grace, 2002
Nonmetric: uses rank order only, so the axes have no scale or meaning of their own
Multidimensional: represents relationships between multiple variables in two or three dimensions
Scaling: the ratio between reality and representation
How does nMDS work?
nMDS uses the RANK ORDER of similarity relationships between samples:
Sample   Sample   % similarity   rank
A1       A2       99%            1
A1       A3       96%            2
A2       A3       95%            3
A1 is closer to A2 than it is to A3
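The ranking step can be sketched in a few lines, using the three similarities listed above. The second dictionary holds made-up values with the same ordering, to show that only the order matters.

```python
# Percent similarities for the three pairs of samples
similarities = {("A1", "A2"): 99, ("A1", "A3"): 96, ("A2", "A3"): 95}

# Rank 1 = most similar pair
ordered = sorted(similarities, key=similarities.get, reverse=True)
ranks = {pair: rank for rank, pair in enumerate(ordered, start=1)}

for pair in ordered:
    print(pair, f"{similarities[pair]}%", "rank", ranks[pair])

# Hypothetical values with very different magnitudes but the same ordering
rescaled = {("A1", "A2"): 80, ("A1", "A3"): 50, ("A2", "A3"): 10}
ordered_rescaled = sorted(rescaled, key=rescaled.get, reverse=True)

# The rank order, which is all that nMDS uses, is unchanged
print(ordered_rescaled == ordered)
```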
How does nMDS work?
Then, nMDS tries to place points in 2 (or 3) dimensional space to represent this ranked order:
[Figure: points A1, A2 and A3 placed so that A1 is closer to A2 than it is to A3]
How accurate is the nMDS map?
- Sometimes the nMDS can’t represent all relationships accurately
- this is reflected by a high STRESS value
[Figure: scatter plot of similarity in the similarity matrix against distance on the nMDS]
If Stress Value =
0.0 : perfect map
0.1 : decent map
0.2 : ok map
0.3 : don’t bother
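As a rough illustration of what a stress value measures, the sketch below scores a given 2-D map against the rank order of the dissimilarities. It assumes Kruskal's stress formula 1 with a simple pool-adjacent-violators fit and no special handling of ties; the coordinates are hypothetical, the dissimilarities are 100 minus the percent similarities of samples A1 to A3, and whether PRIMER computes stress in exactly this way is not confirmed by the slides.

```python
import math
from itertools import combinations

def isotonic_fit(values):
    """Pool-adjacent-violators: the closest non-decreasing sequence in the least-squares sense."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # merge backwards while the previous block's mean exceeds the last block's mean
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted

def stress(dissimilarity, coords):
    """Kruskal stress-1 of a map: 0 means map distances follow the rank order exactly."""
    pairs = sorted(combinations(sorted(coords), 2), key=lambda p: dissimilarity[p])
    dist = [math.dist(coords[a], coords[b]) for a, b in pairs]
    fitted = isotonic_fit(dist)
    numerator = sum((d - f) ** 2 for d, f in zip(dist, fitted))
    denominator = sum(d ** 2 for d in dist)
    return math.sqrt(numerator / denominator)

# Dissimilarity = 100 - % similarity for the pairs A1-A2, A1-A3, A2-A3
dissimilarity = {("A1", "A2"): 1, ("A1", "A3"): 4, ("A2", "A3"): 5}

# Hypothetical maps: the first respects the rank order, the second does not
good_map = {"A1": (0.0, 0.0), "A2": (1.0, 0.0), "A3": (0.0, 3.0)}
bad_map = {"A1": (0.0, 0.0), "A2": (5.0, 0.0), "A3": (1.0, 0.0)}

print(round(stress(dissimilarity, good_map), 3))
print(round(stress(dissimilarity, bad_map), 3))
```

A full nMDS would repeatedly move the points to lower this value; the sketch only scores a map that is already given.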
Main points about ordination!
- Ordination is a way to visualize how similar your samples are
- nMDS tries to represent visually the rank order within the underlying similarity matrix
- all that matters is the relative distance between points.
- the stress value allows you to estimate the ‘quality’ of the nMDS
Obviously distinct groups
Less obvious! Are they really different?
Are groups different?
Analysis of Similarities – a statistical approach
[Figure: samples from the exposed and sheltered sites]
Ho: sites are the same
Ha: sites are different
If Ho (sites the same) is true:
Similarity within = Similarity between
If Ha (sites different) is true:
Similarity within > Similarity between
The ANOSIM test statistic:
R = (r_between - r_within) / standardizing factor
(r_between and r_within are the average ranks of the between-site and within-site comparisons)
The standardizing factor scales R so that its maximum is ~1
If Ho (sites the same) is true:
Similarity within = Similarity between
R = ~0
If Ha (sites different) is true:
Similarity within > Similarity between
R = ~1
To simulate null distribution
(Similarity within = Similarity between)
Randomly reshuffle the site labels among the samples
Calculate R, and repeat many times
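The R statistic and the permutation procedure can be sketched together. This is a minimal illustration, not PRIMER's implementation: it assumes ranks of dissimilarities (100 minus similarity, rank 1 = most similar pair), average ranks for ties, a standardizing factor of M/2 where M is the number of sample pairs, and the common (hits + 1) / (permutations + 1) convention for the p-value. The sample names and positions are hypothetical.

```python
import random
from itertools import combinations

def anosim_r(dissimilarity, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2)."""
    pairs = list(combinations(sorted(groups), 2))
    ordered = sorted(pairs, key=lambda p: dissimilarity[p])
    rank = {}
    i = 0
    while i < len(ordered):
        # tied dissimilarities share their average rank
        j = i
        while j + 1 < len(ordered) and dissimilarity[ordered[j + 1]] == dissimilarity[ordered[i]]:
            j += 1
        for p in ordered[i:j + 1]:
            rank[p] = (i + j) / 2 + 1
        i = j + 1
    within = [rank[p] for p in pairs if groups[p[0]] == groups[p[1]]]
    between = [rank[p] for p in pairs if groups[p[0]] != groups[p[1]]]
    m = len(pairs)
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)

def anosim_test(dissimilarity, groups, n_perm=999, seed=0):
    """Observed R plus a permutation p-value from reshuffled site labels."""
    rng = random.Random(seed)
    observed = anosim_r(dissimilarity, groups)
    names = sorted(groups)
    labels = [groups[name] for name in names]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if anosim_r(dissimilarity, dict(zip(names, labels))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical samples placed on a line; dissimilarity = distance between positions
position = {"E1": 0, "E2": 1, "E3": 2, "S1": 10, "S2": 12, "S3": 15}
groups = {"E1": "exposed", "E2": "exposed", "E3": "exposed",
          "S1": "sheltered", "S2": "sheltered", "S3": "sheltered"}
dissimilarity = {(a, b): abs(position[a] - position[b])
                 for a, b in combinations(sorted(position), 2)}

r_obs, p_value = anosim_test(dissimilarity, groups)
print(round(r_obs, 3), round(p_value, 3))
```

With only three samples per site there are just 20 distinct ways to assign the labels, so the p-value cannot become very small even though the groups are completely separated.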
Phyc 2003 Practice data set
[Figure: frequency histogram of R values from the permutations (R axis from -0.20 to 0.40), with the observed R = .477 marked]
P = 1/999 ≈ 0.001
Sites are different – why?
• We will use the SIMPER routine:
- Similarity Percentages
It indicates which species are responsible for the patterns that we see.
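A rough sketch of the idea behind a similarity-percentages breakdown follows. It assumes each species' contribution to the Bray-Curtis dissimilarity between two samples is 100 * |x - y| / sum(x + y), averaged over all between-site pairs of samples. The species names and abundances are hypothetical, and PRIMER's SIMPER output contains more than is shown here.

```python
def simper(group_a, group_b, species):
    """Average contribution of each species to between-group Bray-Curtis dissimilarity.

    Returns rows of (species, average contribution, percent of total), largest first.
    Assumes no pair of samples is empty and that the groups are not identical.
    """
    contribution = [0.0] * len(species)
    n_pairs = 0
    for x in group_a:
        for y in group_b:
            total = sum(a + b for a, b in zip(x, y))
            for i, (a, b) in enumerate(zip(x, y)):
                contribution[i] += 100.0 * abs(a - b) / total
            n_pairs += 1
    average = [c / n_pairs for c in contribution]
    overall = sum(average)
    rows = [(name, avg, 100.0 * avg / overall) for name, avg in zip(species, average)]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical abundances: species_3 is equally common at both sites
species = ["species_1", "species_2", "species_3"]
exposed = [[10, 0, 5], [8, 1, 6]]
sheltered = [[0, 12, 5], [1, 10, 6]]

result = simper(exposed, sheltered, species)
for name, avg, percent in result:
    print(name, round(avg, 1), f"{percent:.1f}%")
```

Species near the top of the list differ most in abundance between the two sites.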
Data analysis summary
samples x species data -> sample similarities -> ordination (nMDS)
are sites different? -> ANOSIM
How? -> SIMPER