IB Biology Topic 1: Statistical Analysis

http://www.patana.ac.th/Secondary/Science/c4b/1/stat1.htm
An investigation of shell length variation in a mollusc species
• A marine gastropod (Thersites bipartita) has been sampled from two different locations:
– Sample A: Shells found in full marine conditions
– Sample B: Shells found in brackish water conditions.
• sample size = 10 shells
• length of the shell measured as shown
Analysis of Gastropod Data
• measured height of shells (ruler)
• Units: mm +/- 1 mm (error)
• Significant digits
• Uncertainty must be consistent.
– applies to all measuring devices!
– reflects the precision of the measurement
• There should be no variation in the precision of raw data
1.1.1 Error bars and the representation of variability in data.
• Biological systems are subject to a genetic program and environmental variation
• collect a set of data → it shows variation
• Graphs: show variation using error bars
– show range of the data or
– standard deviation
Mean & Range for each group
• Marine
• Brackish
Graph Mean & Range for each group
• Quick comparison of the 2 data sets
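The mean and range for each group can be worked out in a few lines. This is only a sketch: the shell lengths below are made-up values for illustration, not the measurements behind the slides, and the function name `mean_and_range` is hypothetical.

```python
# Hypothetical shell lengths in mm; NOT the measurements from the slides.
marine = [30, 34, 27, 38, 41, 29, 33, 36, 31, 35]
brackish = [40, 45, 38, 47, 42, 44, 39, 46, 43, 41]


def mean_and_range(values):
    """Return (mean, lowest value, highest value) for one sample."""
    mean = sum(values) / len(values)
    return mean, min(values), max(values)


for name, sample in (("Marine", marine), ("Brackish", brackish)):
    mean, low, high = mean_and_range(sample)
    # A range error bar runs from the lowest to the highest value.
    print(f"{name}: mean = {mean:.1f} mm, range = {low} to {high} mm")
```

A range bar depends only on the two most extreme shells, so a single unusual shell can stretch it a long way.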
1.1.2 Calculation of Mean and Std Dev
• 3 classes of data
• Mean
– arithmetic mean (avg): measure of the central tendency (a typical central value)
• Std Dev
– Measures spread around the mean
– Measure of variation or precision of measurement
• Std Dev of sample = s
• s is for the sample, not the total population
• Pop 1. Mean = 31.4, s = 5.7
• Pop 2. Mean = 41.6, s = 4.3
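Because s describes a sample rather than the whole population, the usual calculation divides by n - 1 instead of n. A minimal sketch follows, using hypothetical shell lengths (the raw data behind the slide values are not reproduced here), with Python's standard `statistics` module as a cross-check.

```python
import math
import statistics

# Hypothetical shell lengths in mm (not the slide data).
sample = [30, 34, 27, 38, 41, 29, 33, 36, 31, 35]

n = len(sample)
mean = sum(sample) / n

# Sample standard deviation: divide the summed squared deviations by n - 1.
squared_deviations = sum((x - mean) ** 2 for x in sample)
s = math.sqrt(squared_deviations / (n - 1))

# statistics.stdev uses the same n - 1 formula; pstdev divides by n instead.
assert math.isclose(s, statistics.stdev(sample))
print(f"mean = {mean:.1f} mm, s = {s:.1f} mm")
```

Dividing by n (the population formula, `statistics.pstdev`) would give a slightly smaller value for the same data.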
Graphing Mean and Std Dev: Error Bars
• Mean +/- 1 std dev
• no overlap between these two populations
• The question being considered is:
– Is there a significant difference between the two samples from different locations?
• or
– Are the differences in the two samples just due to chance selection?
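The "no overlap" observation can be checked directly from the means and standard deviations quoted on the slides. The sketch below is illustrative only; the helper names `error_bar` and `bars_overlap` are hypothetical.

```python
# Summary statistics quoted on the slides (shell measurements in mm).
pop1_mean, pop1_s = 31.4, 5.7
pop2_mean, pop2_s = 41.6, 4.3


def error_bar(mean, s, k=1):
    """Return the (lower, upper) ends of a mean +/- k*s error bar."""
    return mean - k * s, mean + k * s


def bars_overlap(bar_a, bar_b):
    """True if two (lower, upper) intervals share any values."""
    return bar_a[0] <= bar_b[1] and bar_b[0] <= bar_a[1]


bar1 = error_bar(pop1_mean, pop1_s)
bar2 = error_bar(pop2_mean, pop2_s)
print("Pop 1 bar:", bar1)
print("Pop 2 bar:", bar2)
print("Overlap at +/- 1 s:", bars_overlap(bar1, bar2))
```

The gap is narrow: the Pop 1 bar ends near 37.1 and the Pop 2 bar starts near 37.3. Non-overlapping bars are a visual hint only and do not replace a significance test.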
• StdDev graph: compares the central 68% of each population and begins to show that they look different.
• Range graph: misleads us into thinking the data may be similar.
1.1.3 Standard deviation and the spread of values around the mean.
1. StdDev is a measure of how spread out the data values are from the mean.
2. Assume:
a. normal distribution of values around the mean
b. data not skewed to either end
3. About 68% of all the data values in a sample can be found within the mean +/- 1 standard deviation
http://www.patana.ac.th/Secondary/Science/c4b/1/stat1.htm#gastro
• Animation of mean and standard deviation
4. About 95% of all the data values in a sample can be found between the mean - 2s and the mean + 2s.
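The 68% and 95% figures follow from the normal distribution assumed above. As a quick check, the sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `fraction_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


print(f"within 1 s: {fraction_within(1):.1%}")  # 68.3%
print(f"within 2 s: {fraction_within(2):.1%}")  # 95.4%
```

These fractions hold only when the data really are close to normally distributed; skewed data can depart from them noticeably.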
1.1.4 Comparing means and standard deviations of 2 or more samples.
Sample w/ small StdDev suggests narrow variation
Sample w/ larger StdDev suggests wider variation
Example: molluscs
Pop 1. Mean = 31.4, Standard deviation (s) = 5.7
Pop 2. Mean = 41.6, Standard deviation (s) = 4.3
Pop 2 has a greater mean shell length but slightly narrower variation.
Why this is the case would require further observation and experiment on environmental and genetic factors.
1.1.5 Comparing 2 samples with t-Test
Null Hypothesis:
There is no significant difference between the two samples except as caused by chance selection of data.
OR
Alternative hypothesis:
There is a significant difference between the height of shells in sample A and sample B.
For the examples you'll use in biology, tails is always 2, and type can be:
1 = paired
2 = two samples, equal variance
3 = two samples, unequal variance
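The three types correspond to three ways of computing the t statistic. The sketch below makes several assumptions: the function names are hypothetical, n = 10 shells is taken to apply to each sample, and the critical value 2.101 is the standard table value for a two-tailed test at P = 0.05 with 18 degrees of freedom. It uses the summary statistics from the slides and compares |t| with the table value instead of computing P.

```python
import math


def t_paired(first, second):
    """Type 1: t statistic for paired measurements; returns (t, df)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    s_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (s_d / math.sqrt(n)), n - 1


def t_equal_variance(mean1, s1, n1, mean2, s2, n2):
    """Type 2: two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def t_unequal_variance(mean1, s1, n1, mean2, s2, n2):
    """Type 3: Welch's t statistic; returns (t, approximate df)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df


# Slide summary statistics; n = 10 per sample is an assumption.
t, df = t_equal_variance(31.4, 5.7, 10, 41.6, 4.3, 10)

# Two-tailed critical t for df = 18 at P = 0.05, from standard t tables.
T_CRITICAL_DF18 = 2.101
print(f"t = {t:.2f}, df = {df}")
print("Significant at P = 0.05:", abs(t) > T_CRITICAL_DF18)
```

The two shell samples come from different locations, so they are not paired; `t_paired` is included only to show what type 1 computes.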
Good idea to graph it
• Bar chart
• Error bars
• Stats
T-test: Are the mollusc shells from the two locations significantly different?
• The t-test gives a probability (P): how likely a difference at least as large as the one observed would be if the 2 sets really came from the same population (null hypothesis).
• P varies from 0 to 1.
– higher P = the data are consistent with the two sets being the same; any differences could be just due to random chance.
– lower P = the differences are unlikely to be due to chance alone, so the two sets are more likely to be significantly different.
• In biology the critical P is usually 0.05 (5%) (biology experiments are expected to produce quite varied results)
– If P > 5% then there is no significant difference between the two sets
• (i.e. do not reject the null hypothesis).
– If P < 5% then the two sets are significantly different
• (i.e. reject the null hypothesis).
• For a t-test, use as many replicates as possible
– more than 5 at a minimum
Drawing Conclusions
1. State the null hypothesis & alternative hypothesis (based on the research question)
2. Set the critical P level at P = 0.05 (5%)
3. Write the decision rule:
If P > 5% then there is no significant difference between the two sets (i.e. do not reject the null hypothesis).
If P < 5% then the two sets are significantly different (i.e. reject the null hypothesis).
4. Write a summary statement based on the decision.
The null hypothesis is rejected since calculated P = 0.003 (< 0.05; two-tailed test).
5. Write a statement of results in standard English.
There is a significant difference between the height of shells in sample A and sample B.
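The decision rule in step 3 can be written as a small function. This is a sketch with a hypothetical function name; it treats P exactly equal to the critical level as not significant, which is an assumption because the slides do not cover that case.

```python
def t_test_conclusion(p_value, critical_p=0.05):
    """Apply the decision rule: compare P with the critical level."""
    if not 0 <= p_value <= 1:
        raise ValueError("P must lie between 0 and 1")
    if p_value < critical_p:
        return "reject the null hypothesis: the difference is significant"
    return "do not reject the null hypothesis: no significant difference shown"


# P = 0.003 is the value quoted in the worked summary statement.
print(t_test_conclusion(0.003))
```

Keeping the critical level as a parameter makes it explicit that 0.05 is a convention chosen before the test, not something the data produce.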
1.1.6 Correlation & Causation
• Sometimes you’re looking for an association between variables.
• Correlations see if 2 variables vary together
– +1 = perfect positive correlation
– 0 = no correlation
– -1 = perfect negative correlation
• Relations see how 1 variable affects another
Pearson correlation (r)
• Data are continuous & normally distributed
Spearman’s rank-order correlation (r_s)
• Data are not continuous or not normally distributed
• Usually a scatterplot for either type of correlation
• Both correlation coefficients indicate a strong positive correlation
– large females pair with large males
– Don’t know why, but it shows there is a correlation to investigate further.
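Both coefficients can be computed by hand to see how they differ: Pearson's r works on the measured values, while Spearman's r_s applies the same formula to their ranks. The sketch below is illustrative; the function names and the female and male body sizes are hypothetical, not data from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rs(xs, ys):
    """Spearman's rank-order correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical body sizes (mm) of paired females and males.
female = [10, 12, 13, 15, 18, 20]
male = [9, 10, 12, 13, 15, 19]
print(f"Pearson r = {pearson_r(female, male):.2f}")
print(f"Spearman r_s = {spearman_rs(female, male):.2f}")
```

In this made-up data every larger female is paired with a larger male, so the ranks agree perfectly even though the points do not lie exactly on a straight line.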
Causative: Use linear regression
• Fits a straight line to data
• Gives slope & intercept
– m and c in the equation y = mx + c
Doesn’t PROVE causation, but suggests it... need further investigation!
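The slope m and intercept c come from a least-squares fit. A minimal sketch follows, assuming ordinary least squares on hypothetical x and y values; the function name `fit_line` is also hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data: x is some measured factor, y a response in mm.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
m, c = fit_line(xs, ys)
print(f"y = {m:.2f}x + {c:.2f}")
```

The fit only describes how y changes with x in the data collected; by itself it says nothing about why.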