Sociology 6Z03
Topic 11: Sampling Distributions
John Fox
McMaster University
Fall 2016
Outline: Sampling Distributions
Introduction
Sampling Variation
Bias and Variability
The Sampling Distribution of Sample Means: Tossing Dice
The Sampling Distribution of Sample Means: Theory
The Sampling Distribution of Sample Means: Simulation (time permitting)
Introduction
Statistical Inference
Statistical inference is the process of drawing conclusions about a population based on a
smaller sample drawn at random from the population.
Statistical inference also applies to the results of randomized comparative experiments, but
because the reasoning underlying inference is simpler in the context of sampling from a
population, I’ll concentrate on that setting.
The central concept in classical statistical inference is the notion of a sampling
distribution, which describes how sample results behave if we draw repeated samples of a
particular size n from a larger population.
In any real application of statistical inference, we draw only one sample of size n, but the
possibility of sampling repeatedly — if only in principle — provides the conceptual
foundation for deciding how informative our sample is about the population.
Introduction
Parameters and Statistics
A parameter is a number that describes some aspect of the population.
For example, the average household income µ in the population of Ontario is a parameter.
Suppose, for argument’s sake, that this number is µ = $72,734. (Actually, this is the median
family income in Ontario from the 2006 Census.)
In a real application, parameters are generally not known.
A statistic is a number that can be calculated from the sample data, without any
knowledge of population parameters.
Suppose that we draw a random sample of n = 1000 Ontario families, and that the average
income in these families is $73,422.
The sample mean, x̄ = $73,422, is a statistic.
Introduction
Estimation and Hypothesis Testing
We are usually interested in statistics not for themselves, but because they can tell us
something about the population.
For example, we might want to use the sample mean family income to estimate the
(unknown) population mean income.
Alternatively, we might want to use the sample mean to determine whether average family
income in the population of Ontario has changed since the 2006 Census.
These two sorts of applications lead to the two classical modes of statistical inference:
estimation and hypothesis testing.
Sampling Variation
A key fact about sample statistics is that they are random variables that vary from sample
to sample.
For example, if we were to select another random sample of 1000 Ontario families, it is
highly unlikely that we would again get a sample mean income of exactly $73,422.
The variation of a sample statistic from one sample to the next is called sampling
variation or sampling variability.
When the sampling variation of a statistic is very large, the sample contains little information
about the value of a population parameter.
But when sampling variation is small, the sample statistic is informative about the
parameter, even though it is very unlikely that the statistic will be exactly equal to the
parameter in a given sample.
I’ll make these ideas more precise presently.
Sampling Variation
Thought Question
Would you expect the sampling variability of a statistic to be larger in a small sample
or a large sample?
A Sampling variability is larger in a small sample.
B Sampling variability is larger in a large sample.
C Sampling variability is the same in small and large samples.
D I don’t know.
Sampling Variation
The Law of Large Numbers
Definition
The Law of Large Numbers: Suppose that we draw observations at random from a population
with mean µ, and recalculate the sample mean x̄ every time an observation is added to our
sample. As the sample size grows, x̄ tends to get closer and closer to µ.
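The definition can be watched in action with a short simulation. The sketch below is only an illustration: it assumes the population is the roll of a fair die (so µ = 3.5), and the function name and the seed are arbitrary choices of mine, not part of the lecture.

```python
import random

def running_means(n_rolls, seed=2016):
    """Roll a fair die n_rolls times; return the running sample mean after each roll."""
    rng = random.Random(seed)          # seeded so that the run is reproducible
    total = 0
    means = []
    for k in range(1, n_rolls + 1):
        total += rng.randint(1, 6)     # add one more observation to the sample
        means.append(total / k)        # sample mean of the first k observations
    return means

means = running_means(100_000)
for k in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {k:>6}: running mean = {means[k - 1]:.3f}")
```

The early running means can wander noticeably, while the later ones should settle near 3.5.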
Sampling Variation
I’ll illustrate sampling variability in three ways:
1. By drawing 50 repeated samples of individuals in the class, each of size n = 4;
ascertaining the height (in inches) of each individual in the sample; and calculating the
average height of the people in the sample.
2. By enumerating all possible samples in a setting where the number of distinct samples is
relatively small. I will calculate the sample mean x̄ for each sample, and generate the
probability distribution for x̄.
3. By using the computer to “simulate” drawing repeated samples from a very large
population, calculating the mean for each sample, and then examining the sampling
distribution of the sample means (time permitting).
Sampling Variation
Sampling From the Population of the Class
After calculating the average height in each of 50 samples, we will look at the distribution
of the 50 averages.
We will also examine the population distribution of heights in the class, and calculate the
average height in the class.
In this case, the mean height µ in the class is the parameter, while the mean height x̄ in
each sample is the statistic.
Sampling Variation
Sampling From the Population of the Class
Notice that there are three distinct distributions here:
1. The population distribution: The distribution of heights in the population.
The mean of this distribution is µ.
2. The distribution in the sample: The distribution of heights in a particular sample of size
n = 4.
The observations are x1, x2, x3, and x4, and the mean of the sample distribution is x̄.
3. The sampling distribution of the sample means: The distribution of x̄ across the 50
repeated samples.
Sampling Variation
Sampling From the Population of the Class
The important distribution for statistical inference is the sampling distribution.
Definition
The sampling distribution of a statistic is the distribution of the statistic in all possible samples
of the same size drawn from a population.
When we select 50 samples, the resulting distribution of the statistic only approximates
the true sampling distribution.
Keeping these three distributions separate (i.e., the population distribution, the
distribution in the sample, and the sampling distribution) is one of the keys to
understanding statistical inference.
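To make the difference between the true sampling distribution and a 50-sample approximation concrete, here is a minimal sketch. The eight heights are made up for illustration (they are not the class data), and the seed is arbitrary. With only eight people, all possible samples of size 4 can be listed.

```python
import random
from itertools import combinations
from statistics import mean

# Hypothetical heights (in inches) for a made-up class of 8 people; not lecture data.
heights = [61, 63, 64, 66, 67, 69, 70, 72]
n = 4

# True sampling distribution: the mean of every possible sample of size n.
all_means = [mean(s) for s in combinations(heights, n)]

# Approximation: the means of 50 randomly drawn samples of size n.
rng = random.Random(1)                 # arbitrary seed, for reproducibility
fifty_means = [mean(rng.sample(heights, n)) for _ in range(50)]

print("population mean:", mean(heights))
print("number of possible samples:", len(all_means))
print("mean of all possible sample means:", mean(all_means))
print("mean of the 50 sampled means:", mean(fifty_means))
```

The mean over all possible samples matches the population mean, while the mean over 50 random samples should only come close to it.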
Bias and Variability
If we use a statistic, such as the sample mean x̄, to estimate a parameter, such as the
population mean µ, we want the estimate to be as close as possible to the parameter.
There are two factors that tend to make the statistic different from the parameter: bias
and variability.
Ideally, therefore, we want both bias and variability to be small.
Bias and Variability
Bias
Definition
The bias of a statistic is the difference between its average value (from its sampling
distribution) and the value of the parameter.
A statistic used to estimate a parameter is unbiased if the mean of its sampling
distribution is equal to the parameter being estimated.
An unbiased statistic will sometimes give an estimate that’s too large, and sometimes too
small; but on average, over many samples, it yields just the right value.
Bias and Variability
Variability
Definition
Variability: The average value of the statistic doesn’t tell the whole story, however, because
the values of the statistic will vary around this average.
If the variability of the statistic is small, then our estimate will be stable.
In the case of an unbiased statistic, small variability means that we will likely get an
estimate that is very close to the parameter.
Bias and Variability
Statistical estimation is like shooting at a target, thinking of the centre of the target as
the parameter and the individual shots as estimates:
Bias and Variability
Thought Question
Which of the following patterns of shots represents an unbiased gun (or guns)?
A A.
B B.
C C.
D A and B.
E A and C.
[Figure: three targets showing patterns of shots, labelled A, B, and C.]
Bias and Variability
Thought Question
Which of the guns has (have) relatively low variance?
A A.
B B.
C C.
D A and B.
E A and C.
[Figure: three targets showing patterns of shots, labelled A, B, and C.]
Bias and Variability
Mean Squared Error
Definition
The mean-squared error (MSE) of a statistic — literally, the average squared mistake in
estimation — is the sum of squared bias and variance:
MSE = Bias² + Variance
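The decomposition can be checked exactly in a small case. The sketch below assumes samples of two fair dice (population mean 3.5) and compares two statistics used to estimate that mean: the sample mean, and the sample maximum, which I have picked purely as an example of a biased statistic. The helper names are mine.

```python
from fractions import Fraction
from itertools import product

MU = Fraction(7, 2)   # population mean of a fair die, 3.5

def bias_variance_mse(statistic, n=2):
    """Exact bias, variance, and MSE of `statistic` over all equally likely samples of n dice."""
    samples = list(product(range(1, 7), repeat=n))
    p = Fraction(1, len(samples))                        # each sample is equally likely
    values = [Fraction(statistic(s)) for s in samples]
    expected = sum(v * p for v in values)                # mean of the sampling distribution
    bias = expected - MU
    variance = sum((v - expected) ** 2 * p for v in values)
    mse = sum((v - MU) ** 2 * p for v in values)         # average squared estimation error
    return bias, variance, mse

def sample_mean(s):
    return Fraction(sum(s), len(s))

for name, stat in [("sample mean", sample_mean), ("sample maximum", max)]:
    bias, variance, mse = bias_variance_mse(stat)
    print(f"{name}: bias = {bias}, variance = {variance}, MSE = {mse}")
    assert mse == bias ** 2 + variance                   # the decomposition holds exactly
```

Exact fractions are used rather than floating-point numbers so that the equality can be asserted without rounding error.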
Bias and Variability
Thought Question
Which of the guns has (have) the smallest mean-squared error?
A A.
B B.
C C.
D A and B.
E A and C.
[Figure: three targets showing patterns of shots, labelled A, B, and C.]
The Sampling Distribution of Sample Means: Tossing Dice
We toss n fair dice; observe the number of dots (called “pips”), xi, showing for each die;
and calculate the mean number of dots, x̄ = ∑ xi / n (summing over i = 1, …, n).
If, for instance, n = 2 and the sample is x1 = 2, x2 = 3, then x̄ = (2 + 3)/2 = 5/2 = 2.5.
Recall that there are three distributions that we must keep separate in our minds:
1. The distribution of X in the population.
2. The distribution of x in a particular sample.
3. The sampling distribution of sample means, X̄, across all possible samples.
The Sampling Distribution of Sample Means: Tossing Dice
The Population Distribution
In this case, the population is infinite, and X has the following probability distribution:
xi     pi
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6
sum    1
The Sampling Distribution of Sample Means: Tossing Dice
The Population Distribution
The mean of this distribution is
µ = ∑ xi pi = 3.5
The standard deviation is
σ = √[∑ (xi − µ)² pi] = 1.708
We can also think of this as the sampling distribution of means from samples of size
n = 1.
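The two population values for a fair die can be reproduced in a few lines. This is just the arithmetic of the formulas for µ and σ written out; the variable names are mine.

```python
from math import sqrt

values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6              # fair die: each face has probability 1/6

# mu is the probability-weighted average of the faces
mu = sum(x * p for x, p in zip(values, probs))

# sigma is the square root of the probability-weighted average squared deviation
sigma = sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```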
The Sampling Distribution of Sample Means: Tossing Dice
The Distribution of a Particular Sample
For example, suppose that we roll the values x1 = 2, x2 = 3.
Thought Question
What is the value of the sample mean x̄ for this sample?
A 2.
B 3.
C 2.5.
D I don’t know.
The Sampling Distribution of Sample Means: Tossing Dice
The Sampling Distribution of the Mean Across All Possible Samples
When n = 2, there are only 6 × 6 = 36 possible samples, so we can enumerate them all,
along with their sample means:
sample  x1  x2  x̄        sample  x1  x2  x̄
1       1   1   1.0      10      4   1   2.5
2       1   2   1.5      11      1   5   3.0
3       2   1   1.5      12      2   4   3.0
4       1   3   2.0      13      3   3   3.0
5       2   2   2.0      14      4   2   3.0
6       3   1   2.0      15      5   1   3.0
7       1   4   2.5      16      1   6   3.5
8       2   3   2.5      17      2   5   3.5
9       3   2   2.5      18      3   4   3.5
The Sampling Distribution of Sample Means: Tossing Dice
The Sampling Distribution of the Mean Across All Possible Samples
sample  x1  x2  x̄        sample  x1  x2  x̄
19      4   3   3.5      28      4   5   4.5
20      5   2   3.5      29      5   4   4.5
21      6   1   3.5      30      6   3   4.5
22      2   6   4.0      31      4   6   5.0
23      3   5   4.0      32      5   5   5.0
24      4   4   4.0      33      6   4   5.0
25      5   3   4.0      34      5   6   5.5
26      6   2   4.0      35      6   5   5.5
27      3   6   4.5      36      6   6   6.0
The Sampling Distribution of Sample Means: Tossing Dice
The Sampling Distribution of the Mean Across All Possible Samples
Each of these 36 samples occurs with equal probability, 1/36, producing the following
sampling distribution for x̄:
x̄i     pi
1.0    1/36
1.5    2/36
2.0    3/36
2.5    4/36
3.0    5/36
3.5    6/36
4.0    5/36
4.5    4/36
5.0    3/36
5.5    2/36
6.0    1/36
sum    36/36 = 1
The Sampling Distribution of Sample Means: Tossing Dice
The Sampling Distribution of the Mean Across All Possible Samples
The sampling distribution has mean 3.5 (which is precisely equal to the population mean
µ) and standard deviation 1.208.
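The enumeration is easy to hand to a computer. The following sketch lists all 36 equally likely samples of two fair dice, tabulates the distribution of their means, and computes the mean and standard deviation of that distribution. The function name is my own, and the code is one possible way of doing the bookkeeping.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import sqrt

def sampling_distribution(n):
    """Exact sampling distribution of the mean of n fair dice, as {mean: probability}."""
    samples = list(product(range(1, 7), repeat=n))      # all 6**n equally likely samples
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

dist = sampling_distribution(2)
for m, p in dist.items():
    # print each probability over the common denominator 36
    print(f"{float(m):.1f}  {p.numerator * 36 // p.denominator}/36")

mean = sum(m * p for m, p in dist.items())
sd = sqrt(sum((m - mean) ** 2 * p for m, p in dist.items()))
print(f"mean = {float(mean)}, sd = {sd:.3f}")
```

Calling `sampling_distribution(3)` enumerates the 216 samples of three dice in the same way.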
Thought Question
How does the standard deviation of the sample means compare to the standard
deviation σ of the population?
A The standard deviation of sample means is the same as σ.
B The standard deviation of sample means is smaller than σ.
C The standard deviation of sample means is larger than σ.
D I don’t know.
The Sampling Distribution of Sample Means: Tossing Dice
Increasing the Sample Size to n = 3
When n = 3, there are 6 × 6 × 6 = 216 different samples, each of which is chosen with
probability 1/216:
sample  x1  x2  x3  x̄
1       1   1   1   1.00000
2       1   1   2   1.33333
3       1   2   1   1.33333
4       2   1   1   1.33333
5       1   1   3   1.66667
6       1   2   2   1.66667
7       1   3   1   1.66667
8       2   1   2   1.66667
9       2   2   1   1.66667
10      3   1   1   1.66667
·       ·   ·   ·   ·
·       ·   ·   ·   ·
·       ·   ·   ·   ·
211     6   5   5   5.33333
212     6   6   4   5.33333
213     5   6   6   5.66667
214     6   5   6   5.66667
215     6   6   5   5.66667
216     6   6   6   6.00000
The Sampling Distribution of Sample Means: Tossing Dice
Increasing the Sample Size to n = 3
The sampling distribution of the sample means has mean 3.5 and standard deviation
0.986.
The mean of x̄ is still µ, but its standard deviation has gotten smaller.
The sampling distributions of X̄ for n = 1, 2, and 3:
[Figure: three bar charts of probability against the sample mean (values 1 to 6), with panels “Population Distribution (n = 1)”, “n = 2”, and “n = 3”.]
The Sampling Distribution of Sample Means: Tossing Dice
Thought Question
TRUE or FALSE: As the sample size grows, the sampling distribution of sample means
looks more and more like a normal distribution.
A TRUE.
B FALSE.
C I don’t know.
The Sampling Distribution of Sample Means: Theory
General Properties
The following general results flow from these examples.
Suppose that x̄ is the mean of a simple random sample drawn from a large population
with mean µ and standard deviation σ:
The mean of the sampling distribution of x̄ is the population mean µ, so x̄ is an unbiased
estimator of µ.
The standard deviation of the sampling distribution of x̄ is approximately σ/√n.
The standard deviation of x̄, therefore, declines as the sample size grows.
Because √n (not n) is in the denominator of the standard deviation of x̄, to double the
precision of x̄, we need to multiply the sample size by a factor of 4.
The approximation is good enough if the population is at least 10 times the size of the sample.
The result is exact for an independent random sample (as opposed to an SRS).
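Two of these points lend themselves to a quick calculation. The sketch below first shows that quadrupling n halves σ/√n. It then shows how little an SRS from a finite population differs from independent sampling once the population is at least 10 times the sample size. The second function uses the standard finite-population correction, √[(N − n)/(N − 1)], which is not stated in the slides. The value σ = 15 and the population sizes are illustrative assumptions, and σ is taken to be the population standard deviation computed with divisor N.

```python
from math import sqrt

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean for independent draws: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def sd_of_mean_srs(sigma, n, N):
    """Same, for an SRS drawn without replacement from a population of size N.

    Applies the standard finite-population correction sqrt((N - n) / (N - 1)).
    """
    return sd_of_mean(sigma, n) * sqrt((N - n) / (N - 1))

sigma = 15                                              # illustrative value
print(sd_of_mean(sigma, 25), sd_of_mean(sigma, 100))    # quadrupling n halves the SD

for N in (200, 1_000, 100_000):                         # sample size n = 100 throughout
    ratio = sd_of_mean_srs(sigma, 100, N) / sd_of_mean(sigma, 100)
    print(f"N = {N:>7}: SRS SD / (sigma/sqrt(n)) = {ratio:.3f}")
```

When N is only twice n the ratio is well below 1, but by N = 10n it is already close to 1, which is the sense in which the approximation is “good enough.”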
The Sampling Distribution of Sample Means: Theory
Checking Against the Examples
Let us verify that the result holds for the dice-tossing examples (where the samples are
independent draws from an infinite population):
For n = 2: σ/√n = 1.708/√2 = 1.208
For n = 3: σ/√n = 1.708/√3 = 0.986
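The same check can be run by brute force: compute the standard deviation of the sample mean directly over all 6^n equally likely samples and set it beside σ/√n. This is a sketch with a function name of my choosing. It relies on the dice being independent draws, as in the examples.

```python
from itertools import product
from math import sqrt

def sd_of_dice_means(n):
    """SD of the sample mean over all 6**n equally likely samples of n fair dice."""
    means = [sum(s) / n for s in product(range(1, 7), repeat=n)]
    centre = sum(means) / len(means)                    # mean of the sampling distribution
    return sqrt(sum((m - centre) ** 2 for m in means) / len(means))

sigma = sd_of_dice_means(1)                             # population SD, about 1.708
for n in (1, 2, 3):
    print(f"n = {n}: enumerated SD = {sd_of_dice_means(n):.3f}, "
          f"sigma/sqrt(n) = {sigma / sqrt(n):.3f}")
```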
The Sampling Distribution of Sample Means: Theory
The Law of Large Numbers
Because the mean of the sampling distribution of X̄ is µ and the standard deviation of X̄
goes to 0 as the sample size n grows, the sample mean x̄ tends to get closer and closer to
the population mean µ.
In a very large sample, x̄ almost certainly will be very close to µ.
Recall that this is called the law of large numbers: as n → ∞, x̄ → µ.
The Sampling Distribution of Sample Means: Theory
The Central Limit Theorem
Regardless of the shape of the population distribution of X, the sampling distribution of x̄
is approximately normal, so
x̄ → N(µ, σ/√n)
with the approximation improving as the sample size grows.
This result is called the central limit theorem.
How large n needs to be for the approximation to be good enough depends upon how far
from normal the population distribution is, but n ≥ 100 almost always suffices.
If the population distribution of X is itself normal, then, regardless of n, the sampling
distribution of x̄ is exactly normal.
The Sampling Distribution of Sample Means: Simulation
Sampling From Normal Populations
I used a computer program to sample repeatedly from a normal population with mean
µ = 100 and standard deviation σ = 15.
Imagine that this is a population of individuals’ IQ scores.
I selected 1000 samples each of size n = 1, n = 2, and n = 10, calculating the mean x̄ for
each sample, and then making a histogram of the distribution of the 1000 sample means.
Notice that the individual observations (n = 1), and the means for samples of size n = 2
and n = 10 are all normally distributed.
The theoretical normal densities, N(µ, σ/√n), are superimposed on the histograms.
The Sampling Distribution of Sample Means: Simulation
Sampling From Normal Populations
[Figure: histograms of the 1000 sample means (density against x̄), with panels “n = 1”, “n = 2”, “n = 10”, and “n = 10, rescaled”.]
The Sampling Distribution of Sample Means: Simulation
Sampling From Normal Populations
Here’s a summary of the mean and standard deviation of each of these distributions:
n     mean of 1000 x̄’s    µ        standard deviation of 1000 x̄’s    σ/√n
1     99.0                100.0    15.2                              15.0
2     99.8                100.0    10.8                              10.6
10    100.3               100.0    4.6                               4.7
At all sample sizes, the sample mean x̄ is an unbiased estimator of the population mean µ.
As the sample size grows, the sample means get less variable.
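A simulation of this kind takes only a few lines. The sketch below is not the program used for the slides. It is a re-creation using Python's standard library, with an arbitrary seed, so its numbers will differ slightly from those in the table while showing the same pattern.

```python
import random
from math import sqrt
from statistics import mean, stdev

MU, SIGMA, N_SAMPLES = 100, 15, 1000

def simulate_means(n, rng):
    """Means of N_SAMPLES samples, each of n independent draws from N(MU, SIGMA)."""
    return [mean(rng.gauss(MU, SIGMA) for _ in range(n)) for _ in range(N_SAMPLES)]

rng = random.Random(6203)                               # arbitrary seed
for n in (1, 2, 10):
    xbars = simulate_means(n, rng)
    print(f"n = {n:>2}: mean = {mean(xbars):6.1f}, sd = {stdev(xbars):5.1f}, "
          f"sigma/sqrt(n) = {SIGMA / sqrt(n):5.1f}")
```

The mean of the simulated x̄’s should stay near 100 at every sample size, while their standard deviation should track σ/√n.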
The Sampling Distribution of Sample Means: Simulation
Sampling From Exponential Populations
I sampled from an exponential distribution with “rate” parameter 1, so that the population mean is µ = 1.
In the exponential distribution, the mean and standard deviation are the same, so σ = 1.
The exponential distribution is used, for example, in survival analysis to model individuals’
survival time when the risk of dying (the “hazard”) is constant.
Unlike the normal distribution, which is symmetric, the exponential distribution is strongly
positively skewed: When there is a constant hazard, many individuals expire relatively early, and a
few survive for relatively long times (because the group at risk is shrinking).
I selected 1000 samples each of size n = 1, n = 2, n = 5, and n = 25, calculating the
mean x̄ for each sample, and making a histogram of the distribution of the 1000 sample
means.
The Sampling Distribution of Sample Means: Simulation
Sampling From Exponential Populations
[Figure: histograms of the 1000 sample means (density against x̄), with panels “n = 1”, “n = 2”, “n = 2, rescaled”, “n = 5”, “n = 5, rescaled”, “n = 25”, and “n = 25, rescaled”.]
The Sampling Distribution of Sample Means: Simulation
Sampling From Exponential Populations
In each graph, the solid curve is for the true sampling distribution of x̄ from an
exponential population; the broken curve is from the normal approximation N(µ, σ/√n).
As the sample size grows, the sampling distribution of the x̄’s gets more and more normal.
The Sampling Distribution of Sample Means: Simulation
Sampling From Exponential Populations
Here are the means and standard deviations of these distributions:
n     mean of 1000 x̄’s    µ       standard deviation of 1000 x̄’s    σ/√n
1     1.00                1.00    0.96                              1.00
2     0.98                1.00    0.68                              0.71
5     0.99                1.00    0.43                              0.45
25    1.00                1.00    0.20                              0.20
As before, for all sample sizes, the sample mean x̄ is an unbiased estimator of the
population mean µ.
Also as before, as the sample size grows, the sample means get less variable: The
standard deviation of x̄ is σ/√n.
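The exponential simulation can be sketched in the same way. Again, this is a re-creation with Python's standard library and an arbitrary seed, not the program behind the slides, so the numbers will not match the table exactly. As a rough numerical stand-in for “looking more normal,” I also report the skewness of the simulated means, which is 0 for a symmetric distribution. That measure is my addition.

```python
import random
from statistics import mean, pstdev

def skewness(xs):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs)

def exponential_means(n, n_samples=1000, seed=42):
    """Means of n_samples samples of size n from an exponential population with rate 1."""
    rng = random.Random(seed)                           # arbitrary seed
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(n_samples)]

for n in (1, 2, 5, 25):
    xbars = exponential_means(n)
    print(f"n = {n:>2}: mean = {mean(xbars):.2f}, sd = {pstdev(xbars):.2f}, "
          f"skewness = {skewness(xbars):.2f}")
```

The means should stay near 1, the standard deviations should fall roughly as 1/√n, and the skewness should shrink toward 0 as n grows.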