Statistics for EES
2. Standard error
Dirk Metzler
http://evol.bio.lme.de/_statgen
May 9, 2011
Contents
1 Histograms: Densities or Numbers?
2 Computing σ with n or n − 1?
3 Mean values are usually nice but sometimes mean
   example: picky wagtails
   example: spider men & spider women
   example: copper-tolerant browntop bent
4 The standard error SE
   example: drought stress in sorghum
   general consideration
Histograms: Densities or Numbers?
[Figure: "Number vs. Density". Histograms of the same data, drawn once with the vertical axis "Number" and once with the vertical axis "Density".]
Histograms with unequal intervals should show densities, not numbers!
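The difference between numbers and densities can be sketched in a few lines of Python. This is only an illustration: the data and the interval edges are made up, and the density is assumed to be normalised so that the total area of all bars is 1 (fraction of the data in an interval divided by the interval width).

```python
def histogram_counts(data, edges):
    """Count how many values fall into each interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for x in data:
        for i in range(len(counts)):
            in_bin = edges[i] <= x < edges[i + 1]
            # the right-most interval also includes its upper edge
            if in_bin or (i == last and x == edges[-1]):
                counts[i] += 1
                break
    return counts


def histogram_densities(data, edges):
    """Density of a bar = fraction of the data in the interval / interval width."""
    counts = histogram_counts(data, edges)
    n = len(data)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]


# Hypothetical, evenly spread data on [0, 8)
data = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]
edges = [0, 1, 2, 4, 8]  # unequal interval widths 1, 1, 2, 4

counts = histogram_counts(data, edges)        # bars grow with the interval width
densities = histogram_densities(data, edges)  # bars stay flat, like the data
print(counts)
print(densities)
```

With these evenly spread values the counts rise from left to right only because the intervals get wider, while the densities are the same in every interval.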
Computing σ with n or n − 1?
Simulated population (N=10000 adults)
Mean: 25.13
Standard deviation: 1.36
[Figure: histogram of the simulated population; horizontal axis: Length [cm], 20 to 30]
Sample from the population (n=10)
M: 24.43
SD with (n−1): 1.15
SD with n: 1.03
[Figure: dot plot of the sample; horizontal axis: Length [cm]]
Another sample from the population (n=10)
M: 24.92
SD with (n−1): 1.61
SD with n: 1.45
[Figure: dot plot of the second sample; horizontal axis: Length [cm]]
1000 samples, each of size n=10
[Figure: two histograms of the sample standard deviations, one for "SD computed with n−1" and one for "SD computed with n"; horizontal axis: 0.5 to 2.5; vertical axis: Density]
The standard deviation σ of a random variable with n equally
probable outcomes x_1, ..., x_n (e.g. rolling a die) is clearly
defined by
$$\sqrt{\frac{1}{n}\sum_{i=1}^{n}(\bar{x} - x_i)^2},$$
where x̄ is the mean of x_1, ..., x_n.
If x_1, ..., x_n is a sample (the usual case in statistics) you should
rather use the formula
$$\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(\bar{x} - x_i)^2}.$$
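The two formulas and the repeated-sampling experiment can be tried out with a short Python sketch. The function names are our own, the population is assumed to be normally distributed with the mean and standard deviation quoted for the simulated population (25.13 and 1.36), and the seed is arbitrary; this is not the code behind the slides.

```python
import math
import random


def sd_n(xs):
    """Standard deviation with divisor n."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((mean - x) ** 2 for x in xs) / len(xs))


def sd_n_minus_1(xs):
    """Standard deviation with divisor n - 1 (for samples)."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((mean - x) ** 2 for x in xs) / (len(xs) - 1))


rng = random.Random(1)  # fixed seed, an arbitrary choice
# assumed population: normal with mean 25.13 and standard deviation 1.36
population = [rng.gauss(25.13, 1.36) for _ in range(10000)]
true_sd = sd_n(population)

# 1000 samples of size 10, drawn without replacement
samples = [rng.sample(population, 10) for _ in range(1000)]
mean_sd_n = sum(sd_n(s) for s in samples) / len(samples)
mean_sd_n_minus_1 = sum(sd_n_minus_1(s) for s in samples) / len(samples)

print(f"population SD:         {true_sd:.2f}")
print(f"average SD with n:     {mean_sd_n:.2f}")
print(f"average SD with n - 1: {mean_sd_n_minus_1:.2f}")
```

For any single sample the value with divisor n equals the value with divisor n − 1 multiplied by sqrt((n − 1)/n), so it is always the smaller of the two.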
Mean values are usually nice but sometimes mean
Mean and SD ...
characterize data well if the distribution is bell-shaped
and must be interpreted with caution in other cases
We will exemplify this with textbook examples from ecology, see
e.g.
M. Begon, C. R. Townsend, and J. L. Harper.
Ecology: From Individuals to Ecosystems.
Blackwell Publishing, 4th edition, 2008.
When original data were not available, we generated similar data
sets by computer simulation. So do not believe all data points.
example: picky wagtails
Wagtails eat dung flies
Predator: White Wagtail (Motacilla alba alba)
Prey: Dung Fly (Scatophaga stercoraria)
image (c) by Artur Mikołajewski
image (c) by Viatour Luc
Conjecture
The size of the flies varies.
Efficiency for the wagtail = energy gain / time to capture and eat.
Lab experiments show that the efficiency is maximal when the flies
are 7 mm long.
N.B. Davies.
Prey selection and social behaviour in wagtails (Aves:
Motacillidae).
J. Anim. Ecol., 46:37–57, 1977.
[Figure: histogram "available dung flies"; horizontal axis: length [mm], 4 to 11; vertical axis: number; annotations: mean= 7.99, sd= 0.96]
[Figure: histogram "captured dung flies"; horizontal axis: length [mm], 4 to 11; vertical axis: number; annotations: mean= 6.79, sd= 0.69]
[Figure: "dung flies: available, captured"; both size distributions in one plot; horizontal axis: length [mm], 4 to 11; vertical axis: fraction per mm]
numerical comparison of size distributions
        captured       available
mean    6.29      <    7.99
sd      0.69      <    0.96
Interpretation
The birds prefer dung-flies from a relatively narrow range around
the predicted optimum of 7mm.
The distributions in this example were bell-shaped, and the 4
numbers (means and standard deviations) were appropriate to
summarize the data.
example: spider men & spider women
Nephila madagascariensis
image (c) by Bernard Gagnon
Simulated Data:
70 sampled spiders
mean size: 21.05 mm
sd of size: 12.94 mm
[Figure: histogram "Nephila madagascariensis (n=70)"; horizontal axis: size [mm], 0 to 50; vertical axis: Frequency; annotation: mean= 21.06; labels in the final version: males, females]
Conclusion from spider example
If data come from different groups, it may be reasonable to
compute mean and sd separately for each group.
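A minimal Python sketch of this conclusion follows. The sizes are invented for illustration (they are not the simulated spider data from the slides), and the group labels are assumptions.

```python
import statistics


def summarize(values):
    """Return (mean, sd), with sd computed with divisor n - 1."""
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical body sizes in mm for two clearly separated groups
sizes = {
    "males": [4.0, 5.0, 6.0, 5.0],
    "females": [33.0, 35.0, 37.0, 35.0],
}

pooled = [x for group in sizes.values() for x in group]
pooled_mean, pooled_sd = summarize(pooled)
per_group = {sex: summarize(values) for sex, values in sizes.items()}

print("pooled:", pooled_mean, round(pooled_sd, 2))
for sex, (m, s) in per_group.items():
    print(sex, m, round(s, 2))
```

The pooled mean lies between the two groups, where no animal was observed, and the pooled sd mostly reflects the distance between the groups rather than the variation within them.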
example: copper-tolerant browntop bent
Copper Tolerance in Browntop Bent
Browntop Bent (Agrostis tenuis)
Copper (Cuprum)
image (c) Kristian Peters
Hendrick met de Bles
A. D. Bradshaw.
Population differentiation in Agrostis tenuis Sibth. III.
Populations in varied environments.
New Phytologist, 59(1):92–103, 1960.
T. McNeilly and A. D. Bradshaw.
Evolutionary processes in populations of copper tolerant
Agrostis tenuis Sibth.
Evolution, 22:108–118, 1968.
Again, we have no access to the original data and use simulated
data.
Adaptation to copper?
root length indicates copper tolerance
measure root lengths of plants near copper mine
take seeds from clean meadow and sow near copper mine
measure root length of these “meadow plants” in copper
environment
[Figure: histogram "Browntop Bent (n=50)", Copper Mine Grass; horizontal axis: root length (cm), 0 to 200; vertical axis: density per cm]
[Figure: histogram "Browntop Bent (n=50)", Grass seeds from a meadow; same axes; annotation: copper tolerant ?]
[Figure: "Browntop Bent (n=50)", meadow plants and copper mine plants in one plot; same axes]
[Figure: copper mine plants with m−s, m and m+s marked]
[Figure: meadow plants with m−s, m and m+s marked]
2/3 of the data within [m−sd, m+sd]? No!
[Figure: boxplots "Browntop Bent n=50+50", copper mine plants and meadow plants; horizontal axis: root length (cm), 0 to 200]
quartiles of root length [cm]
                 min     Q1      median   Q3       max
copper adapted   12.9    80.1    100.8    120.9    188.9
from meadow      1.1     13.2    16.0     19.6     218.9
Conclusion from browntop bent example
Sometimes the two numbers
m and sd
do not give enough information.
In this example the five values
min, Q1, median, Q3, max
that are shown in the boxplot are more
appropriate.
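The following Python sketch shows both points on made-up, strongly skewed data (not the browntop bent data): the five values of the boxplot, and how far the share of data within [m − sd, m + sd] can be from 2/3. The helper names are our own, and quartile conventions differ between programs; the "inclusive" method of `statistics.quantiles` used here is just one of them.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def fraction_within_one_sd(values):
    """Share of the values lying in [m - sd, m + sd]."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    inside = [x for x in values if m - s <= x <= m + s]
    return len(inside) / len(values)


# Hypothetical root lengths in cm: nine short roots and one very long one
root_lengths = [1, 1, 1, 1, 1, 1, 1, 1, 1, 100]

print(five_number_summary(root_lengths))
print(fraction_within_one_sd(root_lengths))
```

Here the single long root inflates both m and sd, so the interval [m − sd, m + sd] says little about where most of the data lie, while the five values show it directly.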
Conclusions from this section
Always visually inspect the data!
Never rely on summarising values alone!
The standard error SE
The Standard Error
$$SE = \frac{sd}{\sqrt{n}}$$
describes the variability of the sample mean.
n: sample size
sd: sample standard deviation
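As a small illustration, the definition can be written as a Python function. The function name and the example values are our own; the sample standard deviation is computed with divisor n − 1, as `statistics.stdev` does.

```python
import math
import statistics


def standard_error(values):
    """SE = sd / sqrt(n), with sd the sample standard deviation (divisor n - 1)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical measurements (not data from the lecture)
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error(values), 3))
```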
example: drought stress in sorghum
drought stress in sorghum
V. Beyel and W. Brüggemann.
Differential inhibition of photosynthesis during pre-flowering
drought stress in Sorghum bicolor genotypes with different
senescence traits.
Physiologia Plantarum, 124:249–259, 2005.
14 sorghum plants were not watered for 7 days.
in the last 3 days: transpiration was measured for each
plant (mean over 3 days)
the area of the leaves of each plant was determined
transpiration rate = (amount of water per day) / (area of leaves), measured in ml / (cm² · day)
Aim: Determine the mean transpiration rate µ under
these conditions.
If we had many plants, we could determine
µ quite precisely.
Problem: How accurate is the estimation of
µ with such a small sample (n = 14)?
[Figure: histogram "drought stressed sorghum (variety B, n = 14)"; horizontal axis: Transpiration (ml/(day*cm^2)), 0.08 to 0.16; vertical axis: frequency; annotations: mean=0.117, Standard Deviation=0.026]
transpiration data: x_1, x_2, ..., x_14
$$\bar{x} = (x_1 + x_2 + \cdots + x_{14})/14 = \frac{1}{14}\sum_{i=1}^{14} x_i$$
$$\bar{x} = 0.117$$
our estimation:
µ ≈ 0.117
how accurate is this estimation?
How much does x̄ (our estimation)
deviate from µ (the true mean value)?
general consideration
Assume we had made the experiment
not just 14 times,
but repeated it 100 times,
1000 times,
1000000 times
We consider our 14 plants as
a random sample
from a very large population
of possible values.
[Diagram: population (all rates of transpiration), n = ∞, with mean µ; sample, n = 14, with mean x̄]
We estimate
the population mean
µ
by the sample mean
x̄.
Each new sample gives a new value of x̄.
x̄ depends on randomness:
it is a random variable
Problem: How variable is x̄?
More precisely: What is the typical deviation
of x̄ from µ?
$$\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$$
What does the variability of x̄ depend on?
1. On the variability of the single
observations x_1, x_2, ..., x_n
[Figure: two dot plots on the same axis (0.05 to 0.25), both with mean = 0.117. First plot: x varies a lot ⇒ x̄ varies a lot. Second plot: x varies little ⇒ x̄ varies little.]
2. On the sample size n
The larger n,
the smaller is
the variability of
x̄.
To explore this dependency we perform a
computer experiment.
Experiment: Take a population,
draw samples
and examine how x̄ varies.
We assume the distribution of possible
transpiration rates looks like this:
[Figure: "hypothetical distribution of transpiration rates"; horizontal axis: Transpiration (ml/(day*cm^2)), 0.06 to 0.20; vertical axis: density; annotations: mean=0.117, SD=0.026]
At first with small sample sizes:
n = 4
[Figures: a first, second and third sample of size 4 drawn from the hypothetical distribution; horizontal axis: Transpiration (ml/(day*cm^2)), 0.06 to 0.20; vertical axis: density]
[Figures: 10 samples and 50 samples, one row per sample; same horizontal axis]
How variable are
the sample means?
[Figures: 10 samples of size 4 and the corresponding sample means; 50 samples of size 4 and the corresponding sample means]
[Figure: "distribution of sample means (sample size n = 4)"; horizontal axis: Transpiration (ml/(day*cm^2)); vertical axis: Density; Population: Mean=0.117, SD=0.026; Sample mean: Mean=0.117, SD=0.013]
population:
standard deviation = 0.026
sample means (n = 4):
standard deviation = 0.013 = 0.026/√4
Increase the sample size from 4 to 16
[Figures: 10 samples of size 16 and the corresponding sample means; 50 samples of size 16 and the corresponding sample means; horizontal axis: Transpiration (ml/(day*cm^2)), 0.06 to 0.20]
[Figure: "distribution of sample means (sample size n = 16)"; vertical axis: Density; Population: Mean=0.117, SD=0.026; Sample mean: Mean=0.117, SD=0.0064]
population:
standard deviation = 0.026
sample mean (n = 16):
standard deviation = 0.0065 = 0.026/√16
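The computer experiment can be repeated with a few lines of Python. This is a sketch under assumptions: the shape of the hypothetical distribution is not specified beyond its mean (0.117) and SD (0.026), so a normal distribution is assumed here, and the number of repetitions and the seed are arbitrary.

```python
import math
import random
import statistics

MU, SIGMA = 0.117, 0.026  # mean and SD of the hypothetical population


def sd_of_sample_means(n, repetitions, rng):
    """Draw `repetitions` samples of size n and return the SD of their means."""
    means = [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return statistics.pstdev(means)


rng = random.Random(2011)  # fixed seed, an arbitrary choice
for n in (4, 16):
    observed = sd_of_sample_means(n, 10000, rng)
    expected = SIGMA / math.sqrt(n)
    print(f"n = {n:2d}: SD of sample means = {observed:.4f}, sigma/sqrt(n) = {expected:.4f}")
```

Up to simulation noise, the observed standard deviation of the sample means should be close to 0.026/√n for both sample sizes.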
General Rule
Let x̄ be the mean of a sample of size n from a distribution (e.g.
all values in a population) with standard deviation σ. Since x̄
depends on the random sample, it is a random variable. Its
standard deviation σ_x̄ fulfills
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$
Problem: σ is unknown
Idea: Estimate σ by sample standard deviation s:
σ ≈ s
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}} =: \mathrm{SEM}$$
SEM stands for Standard Error of the Mean, or Standard Error
for short.
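As a worked instance, the values reported for the sorghum example (s = 0.026, n = 14) can be inserted into this formula. The result is our own arithmetic and does not appear on the slides; the unit is ml/(cm² · day).

$$\mathrm{SEM} = \frac{s}{\sqrt{n}} = \frac{0.026}{\sqrt{14}} \approx \frac{0.026}{3.742} \approx 0.0069$$

So the estimate x̄ = 0.117 typically deviates from µ by roughly 0.007.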
The distribution of x̄
Observation
Even if the distribution of x is asymmetric and has multiple
peaks, the distribution of x̄ will be bell-shaped
(at least for larger sample sizes n).
[Figure: axis with µ − σ, µ and µ + σ marked, and the narrower interval from µ − σ/√n to µ + σ/√n]
$$\Pr\left(\bar{x} \in \left[\mu - \sigma/\sqrt{n},\ \mu + \sigma/\sqrt{n}\right]\right) \approx \frac{2}{3}$$
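Both the observation about the bell shape and the probability of about 2/3 can be checked by simulation. The sketch below assumes an arbitrary asymmetric population with two peaks (a mixture of two normal distributions chosen only for illustration), a sample size of n = 30 and an arbitrary seed; none of these choices come from the slides.

```python
import math
import random
import statistics


def draw(rng):
    """One value from an assumed asymmetric population with two peaks."""
    if rng.random() < 0.7:
        return rng.gauss(1.0, 0.3)
    return rng.gauss(4.0, 0.8)


rng = random.Random(7)  # fixed seed, an arbitrary choice
population = [draw(rng) for _ in range(100000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 30
repetitions = 5000
half_width = sigma / math.sqrt(n)
hits = 0
for _ in range(repetitions):
    # sample with replacement from the population and take the mean
    x_bar = statistics.fmean(rng.choices(population, k=n))
    if mu - half_width <= x_bar <= mu + half_width:
        hits += 1

fraction = hits / repetitions
print(f"fraction of sample means within mu ± sigma/sqrt(n): {fraction:.3f}")
```

Although single values from this population are far from bell-shaped, the fraction of sample means falling into the interval should come out near 2/3.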
[Figure: axis with x̄ − s/√n, x̄ and x̄ + s/√n marked, together with µ]
The distribution of x̄
is approximately
of a certain shape:
the normal distribution.
[Figure: "Density of the normal distribution"; vertical axis: normal density, 0.0 to 0.4; horizontal axis: −1 to 5; µ and µ+σ marked]
The normal distribution is also called Gauß distribution
(after Carl Friedrich Gauß, 1777-1855)