Statistics for A2 Biology
Standard deviation
Student’s t-test
Chi squared
Spearman’s rank
Why?
Statistical tests allow us to draw conclusions
about data based on statistical significance.
e.g. Is there a significant difference between
the mean heights of students in different
year groups?
e.g. Is there significant correlation between
the height and age of students in a school?
Standard Deviation
• A measure of how ‘spread out’ data is
around a mean.
• Allows us to compare two or more sets of
data to see whether the means are
significantly different.
Limitations:
- doesn’t give you the range of data
- can be affected by outliers/chance
results
Calculating Standard Deviation
1. Subtract the mean of the data from each
data point.
2. Square your answers.
3. Add them all together.
4. Divide your answer by the number of data
points minus 1.
5. Take the square root.
OR use the calculator method!
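The five steps above can be sketched in Python as follows. This is only an illustration: the function name `sample_sd` and the height values are made up for the example, and the final check against the standard library stands in for the "calculator method".

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation, following the five steps listed above."""
    mean = sum(values) / len(values)
    deviations = [x - mean for x in values]   # 1. subtract the mean from each data point
    squared = [d ** 2 for d in deviations]    # 2. square the answers
    total = sum(squared)                      # 3. add them all together
    variance = total / (len(values) - 1)      # 4. divide by (number of data points - 1)
    return math.sqrt(variance)                # 5. take the square root


# Hypothetical heights in cm, invented for illustration only.
heights = [150, 152, 155, 149, 154]
sd = sample_sd(heights)
print(round(sd, 2))

# The standard library's stdev() uses the same n - 1 formula.
assert math.isclose(sd, statistics.stdev(heights))
```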
What next?
• Compare the mean ± standard deviation for
the data sets.
• If these ranges overlap, there is no
significant difference between the means
of the data sets.
• If the ranges do not overlap, there is a
significant difference between the means.
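A minimal sketch of this overlap check, assuming "range" means mean − SD to mean + SD for each data set. The helper names and both sets of heights are hypothetical.

```python
import statistics


def mean_sd_range(values):
    """Return (mean - SD, mean + SD) for one data set."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return (m - s, m + s)


def ranges_overlap(a, b):
    """True if the two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical heights in cm for two year groups, invented for illustration.
year_7 = [140, 145, 150, 142, 148]
year_11 = [165, 170, 168, 172, 166]

overlap = ranges_overlap(mean_sd_range(year_7), mean_sd_range(year_11))
if overlap:
    print("Ranges overlap: no significant difference between the means")
else:
    print("Ranges do not overlap: significant difference between the means")
```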
Statistical Tests
• Student’s t-test – are two mean results
significantly different?
• Spearman’s Rank Correlation Coefficient –
is there significant correlation between
data sets?
• χ² (Chi-squared) – is there a significant
difference between observed and
expected results?
Null hypothesis
Results of an experiment could be due to
random chance.
Only way to support your hypothesis is to
reject a null hypothesis.
Null hypothesis states there is no
link/correlation/difference between results.
→ Depends on statistical test used.
Statistics
- State null hypothesis
- Which test will you use?
- Why?
- Calculate test statistic
- Interpret the test statistic in relation to your
null hypothesis. Use the words probability
and chance in your answer.
Probabilities
• We normally work at the 5% probability
level (P=0.05).
• To reject the null hypothesis (and accept
your own hypothesis), you must be sure
that there is ≤5% probability that the
results are due to chance.
William Gosset (aka ‘Student’) (1876-1937)
Worked in quality control at the Guinness brewery
and could not publish under his own name.
Former student of Karl Pearson
The t-test
What can this test tell you?
If there is a statistically significant
difference between two means, when:
• The sample size is less than 25.
• The data is normally distributed.
BETTER THAN STANDARD DEVIATION!
t-test
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
$\bar{x}_1$ = mean of first sample
$\bar{x}_2$ = mean of second sample
$s_1$ = standard deviation of first sample
$s_2$ = standard deviation of second sample
$n_1$ = number of measurements in first sample
$n_2$ = number of measurements in second sample
Worked example
Does the pH of soil affect seed
germination of a specific plant species?
• Group 1: eight pots with soil at pH 5.5
• Group 2: eight pots with soil at pH 7.0
• 50 seeds planted in each pot and the
number that germinated in each pot was
recorded.
What is the null hypothesis (H0)?
H0 = there is no statistically significant difference
between the germination success of seeds in two
soils of different pH
HA = there is a significant difference between the
germination of seeds in two soils of different pH
If the value for t exceeds the critical value (P =
0.05), then you can reject the null hypothesis.
Construct the following table…
Pot     Group 1 (pH 5.5)    (x – x̄)²    Group 2 (pH 7.0)    (x – x̄)²
1       38                  1.27         39                  20.25
2       41                  3.52         45                  2.25
3       43                  15.02        41                  6.25
4       39                  0.02         46                  6.25
5       37                  4.52         48                  20.25
6       38                  1.27         39                  20.25
7       41                  3.52         46                  6.25
8       36                  9.77         44                  0.25
Mean    39.1                             43.5
Σ                           38.88                            82.0
Calculate standard deviation for both groups
Group 1:
$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{38.88}{8 - 1}} = 2.36$$
Group 2:
$$SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{82.0}{8 - 1}} = 3.42$$
Using your means and SDs, calculate value
for t
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{39.1 - 43.5}{\sqrt{\dfrac{2.36^2}{8} + \dfrac{3.42^2}{8}}} = \frac{-4.4}{\sqrt{0.696 + 1.462}}$$
t = -2.99 BUT we can ignore the - sign
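As a cross-check, the same calculation can be run on the raw germination counts. This is a sketch only: the function name `t_statistic` is made up here, and `statistics.stdev` is assumed to stand in for the hand-calculated SD (it uses the same n – 1 formula).

```python
import math
import statistics

# Germination counts per pot, taken from the worked example.
group_1 = [38, 41, 43, 39, 37, 38, 41, 36]   # soil at pH 5.5
group_2 = [39, 45, 41, 46, 48, 39, 46, 44]   # soil at pH 7.0


def t_statistic(a, b):
    """t = (mean_a - mean_b) / sqrt(s_a^2 / n_a + s_b^2 / n_b)."""
    s_a = statistics.stdev(a)
    s_b = statistics.stdev(b)
    standard_error = math.sqrt(s_a ** 2 / len(a) + s_b ** 2 / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


t = t_statistic(group_1, group_2)
print(round(statistics.stdev(group_1), 2), round(statistics.stdev(group_2), 2))
print(round(abs(t), 2))   # the sign is ignored, as in the worked example
```

Working from unrounded means and SDs gives a value of about 2.98; the 2.99 in the worked example comes from rounding the means and SDs before dividing.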
Compare your value of t with the appropriate
critical value:
If your value is lower than the critical value:
- there is no significant difference between data sets
- accept null hypothesis
- >5% probability the difference in results is due to chance.
If your value is higher than the critical value:
- there is a significant difference between data sets
- reject null hypothesis
- ≤5% probability the difference in results is due to chance.
Compare our calculated value of t with the relevant critical
value in the stats table of critical values
Our value of t = 2.99
Degrees of freedom = n1 + n2 – 2 = 14
D.F.    Critical Value (P = 0.05)
14      2.15
15      2.13
16      2.12
17      2.11
18      2.10
Our value for t exceeds the
critical value, so we can
reject the null hypothesis.
We can conclude that there is a
significant difference between the
two means, so pH does affect the
germination rate for this plant.
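The comparison step can be sketched as a small lookup. The dictionary holds only the critical values quoted in this worked example; the names `CRITICAL_T` and `interpret_t` are illustrative, not part of any library.

```python
# Critical values of t at P = 0.05, copied from the worked example
# (degrees of freedom -> critical value).
CRITICAL_T = {14: 2.15, 15: 2.13, 16: 2.12, 17: 2.11, 18: 2.10}


def interpret_t(t, n1, n2, table=CRITICAL_T):
    """Return (degrees of freedom, critical value, reject null hypothesis?)."""
    df = n1 + n2 - 2
    critical = table[df]
    reject = abs(t) > critical   # the sign of t is ignored
    return df, critical, reject


df, critical, reject = interpret_t(-2.99, n1=8, n2=8)
print(df, critical, "reject H0" if reject else "accept H0")
```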
Now try the examples on the sheet.
Remember:
1. State your null hypothesis.
2. Calculate the mean and standard deviation.
3. Calculate the value of t.
4. Compare the value of t with the critical value.
5. Write a conclusion, stating:
   a. Whether or not the confidence limits overlap
   b. Whether you accept or reject the null hypothesis
   c. What the probability is that the differences between
      means occurred by chance.
Why use Spearman’s rank?
Spearman’s Rank
What can this test tell you?
If there is a statistically significant
correlation between two sets of
measurements from the same sample.
What is the null hypothesis?
There is no correlation between ………….
Critical values. We use the 0.05 significance (probability) level
– this is all you will be given in the EMPA.
Calculating Spearman’s Rank
1. State null hypothesis
2. Rank the data.
3. Calculate the correlation coefficient.
4. Compare rs with table of critical values.
Worked example
The table shows the
mass of nitrogen in
fertiliser added to
fields and the mean
concentration of
nitrates in nearby
streams.
Is there a significant
correlation between
the two variables?
Mass of N on fields (kg ha⁻¹)    Concentration of nitrates in stream (mg dm⁻³)
41                               1.2
41                               1.3
51                               1.5
56                               1.8
63                               1.6
69                               1.9
72                               2.0
What is the null hypothesis?
There is no statistically significant correlation
between the mass of nitrogen in fertiliser
added to fields and the mean concentration of
nitrates in nearby streams.
Step 2: Ranking the data
Mass of N (kg ha⁻¹)    Rank    Conc. of nitrates (mg dm⁻³)    Rank    Difference in rank (D)    D²
41                     1.5     1.2                            1       0.5                       0.25
41                     1.5     1.3                            2       -0.5                      0.25
51                     3       1.5                            3       0                         0
56                     4       1.8                            5       -1                        1
63                     5       1.6                            4       1                         1
69                     6       1.9                            6       0                         0
72                     7       2.0                            7       0                         0
Make sure you rank both sets of data in the same direction (i.e.
lowest to highest OR highest to lowest, NOT one each way).
Always keep the data in its original pairs.
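The ranking step, including the shared rank of 1.5 for the two tied values of 41, can be sketched like this. The helper `average_ranks` is a made-up name; it ranks from lowest to highest and gives tied values the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank from lowest to highest; tied values share the mean of their ranks."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1     # rank of the first occurrence of v
        count = ordered.count(v)         # how many values are tied with v
        ranks.append(first + (count - 1) / 2)
    return ranks


# Data pairs from the worked example, kept in their original order.
mass_n = [41, 41, 51, 56, 63, 69, 72]            # kg ha^-1
nitrate = [1.2, 1.3, 1.5, 1.8, 1.6, 1.9, 2.0]    # mg dm^-3

rank_mass = average_ranks(mass_n)
rank_nitrate = average_ranks(nitrate)
d = [a - b for a, b in zip(rank_mass, rank_nitrate)]   # difference in rank (D)
d_squared = [x ** 2 for x in d]

print(rank_mass)
print(rank_nitrate)
print(sum(d_squared))
```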
Step 3: Calculate rs
• Add up the D² values to give ΣD²:
ΣD² = 0.25 + 0.25 + 1 + 1 = 2.5
• Calculate rs:
$$r_s = 1 - \frac{6 \sum D^2}{n^3 - n} = 1 - \frac{6 \times 2.5}{7^3 - 7} = 1 - 0.045 = 0.955$$
• rs is always between -1 and 1.
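A short sketch of the calculation, using the formula exactly as given in this worked example. The function name `spearman_rs` is illustrative. Note that this shortcut formula is strictly exact only when no ranks are tied, so a statistics package may report a slightly different value for these data.

```python
def spearman_rs(sum_d_squared, n):
    """rs = 1 - (6 * sum of D^2) / (n^3 - n), for n pairs of measurements."""
    return 1 - (6 * sum_d_squared) / (n ** 3 - n)


rs = spearman_rs(sum_d_squared=2.5, n=7)
print(round(rs, 3))
```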
Step 4: Compare your value of rs with the
appropriate critical value:
Find the critical value for 7 pairs of measurements.
Is your answer higher or lower than the critical value?
0.955 > 0.79, therefore we reject the null hypothesis.
There is ≤5% probability that the correlation in results is due to chance.
There is a statistically significant positive correlation between
the mass of nitrogen and the concentration of nitrates.
Step 4: Compare your value of rs with the
appropriate critical value:
If your value is lower than the critical value:
- there is no significant correlation
- accept null hypothesis
- >5% probability the correlation in results is due to chance.
If your value is higher than the critical value:
- there is significant correlation
- reject null hypothesis
- ≤5% probability the correlation in results is due to chance.
rs values below 0 → negative correlation.
rs values above 0 → positive correlation.
Calculating Spearman’s Rank
1. State null hypothesis
2. Rank the data.
3. Calculate the correlation coefficient.
4. Compare rs with table of critical values.
5. Write your conclusion, referring to the
critical value, the null hypothesis,
probability and chance.
Why use χ² (chi squared)?
χ²
What can this test tell you?
If there is a statistically significant
difference between observed and
expected results.
What is the null hypothesis?
There is no difference between observed
and expected results.
Critical values. We use the 0.05 significance (probability) level
– this is all you will be given in the EMPA.
Calculating χ²
1. State null hypothesis
2. Calculate χ²
3. Compare χ² with table of critical values.
Worked example
• The table shows the number of people living on each side of
a river who died from cancer in one year.
• Is there a significant difference in the death rates?
Side of river                   North    South
Death rate (per 100 people)     26       12
What is the null hypothesis?
There is no statistically significant difference
between the death rates on each side of the
river.
Step 2: Calculate χ²
Deaths from cancer in people living    North of river    South of river
Observed results (O)                   26                12
Expected results (E)                   19                19
O – E                                  7                 -7
(O – E)²                               49                49
(O – E)²/E                             2.6               2.6
Calculate expected results according to your null hypothesis.
For this data, that means take the mean!
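The table can be rebuilt in a few lines of Python. This sketch follows the worked example in taking the expected result for each side as the mean of the observed results; the variable names are illustrative.

```python
# Observed results from the worked example.
observed = {"North of river": 26, "South of river": 12}

# Under the null hypothesis both sides are expected to be equal,
# so the expected result for each side is the mean of the observed results.
expected = sum(observed.values()) / len(observed)

rows = {}
for side, o in observed.items():
    diff = o - expected                                   # O - E
    rows[side] = (diff, diff ** 2, diff ** 2 / expected)  # O - E, (O - E)^2, (O - E)^2 / E

chi_squared = sum(row[2] for row in rows.values())

for side, (diff, sq, contribution) in rows.items():
    print(side, diff, sq, round(contribution, 1))
print(round(chi_squared, 1))
```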
Step 2: Calculate χ²
Add up the values for (O – E)²/E:
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.6 + 2.6 = 5.2$$
Step 3: Compare your value of χ² with the
appropriate critical value:
Number of degrees of freedom = n - 1, where n is the number of
categories (here 2 - 1 = 1).
Is your answer higher or lower than the critical value?
5.2 > 3.84, therefore we reject the null hypothesis.
There is ≤5% probability that the difference in results is due to chance.
There is a statistically significant difference between
the death rates on each side of the river.
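A minimal sketch of this comparison. Only the one critical value quoted in the worked example (3.84 for 1 degree of freedom at P = 0.05) is included; `CRITICAL_CHI2` and `interpret_chi_squared` are hypothetical names.

```python
# Critical value of chi-squared at P = 0.05, as quoted in the worked example
# (degrees of freedom -> critical value). Other rows of the table are not included.
CRITICAL_CHI2 = {1: 3.84}


def interpret_chi_squared(chi_squared, n_categories, table=CRITICAL_CHI2):
    """Return (degrees of freedom, critical value, reject null hypothesis?)."""
    df = n_categories - 1
    critical = table[df]
    return df, critical, chi_squared > critical


df, critical, reject = interpret_chi_squared(5.2, n_categories=2)
print(df, critical, "reject H0" if reject else "accept H0")
```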
Step 3: Compare your value of χ² with the
appropriate critical value:
If your value is lower than the critical value:
- there is no significant difference between results
- accept null hypothesis
- >5% probability the difference in results is due to chance.
If your value is higher than the critical value:
- there is a significant difference between results
- reject null hypothesis
- ≤5% probability the difference in results is due to chance.
Calculating χ²
1. State null hypothesis
2. Calculate χ²
3. Compare χ² with table of critical values.
4. Write your conclusion, referring to the
critical value, the null hypothesis,
probability and chance.
Assessment
• Solve the six problems on the sheet.
• Each problem is marked as follows:
– Null hypothesis – 1 mark
– Give your choice of test – 1 mark
– Say why you’ll use that test – 1 mark
– Calculate the test statistic – 1 mark
– Interpret the test statistic in relation to your
null hypothesis. Use the words probability
and chance in your answer. – 2 marks