SAMPLING DISTRIBUTIONS
Parameter
Population characteristic,
e.g. μ, σ, P, median, percentiles etc.
Sample statistic
Any quantity computed from values in a sample,
e.g. x̄, s, sample proportion etc.
The value of a population characteristic is fixed.
The value of a statistic varies from one sample to another.
Hence, it is a random variable and its probability distribution
is known as its sampling distribution.
[Diagram: a single Population with arrows leading to Sample 1, Sample 2, Sample 3 and Sample 4]
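The contrast between a fixed parameter and a varying statistic can be seen in a small simulation. The sketch below is only an illustration: the exponential population, its size, the sample size of 25 and the seed are all assumptions chosen for the demo, not values from these notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a skewed (exponential)
# distribution with mean 50. Any population would do for this demo.
population = [random.expovariate(1 / 50) for _ in range(10_000)]

# The parameter is fixed: the population has exactly one mean.
mu = statistics.fmean(population)

# The statistic varies: each sample of size 25 gives its own sample mean.
sample_means = [
    statistics.fmean(random.sample(population, 25)) for _ in range(4)
]

print(f"population mean (parameter): {mu:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean (statistic): {m:.2f}")
```

Each of the four sample means lands somewhere around the population mean, but no two need agree. That spread is what the sampling distribution describes.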
PROPERTIES OF SAMPLING DISTRIBUTIONS
A point estimator is a formula that uses sample data to
calculate a single number (a sample statistic) that can be used
as an estimate of a population parameter.
e.g. x̄ and s² are point estimators of μ and σ², respectively.
If the sampling distribution of a statistic has a mean equal to
the parameter being estimated, the statistic is an unbiased
estimator of that parameter, otherwise it is biased.
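Bias can be checked empirically by averaging an estimator over many repeated samples. The sketch below assumes a normal population with μ = 10 and σ = 2, and assumes that s² uses the n − 1 divisor (the notes do not state the divisor). The sample size and number of repetitions are arbitrary.

```python
import random
import statistics

random.seed(2)

# Assumed population for illustration: normal, mu = 10, sigma = 2 (sigma^2 = 4).
MU, SIGMA = 10.0, 2.0
N = 5          # a small sample size makes any bias easy to see
REPS = 10_000  # number of repeated samples

means, s2_unbiased, s2_biased = [], [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    s2_unbiased.append(statistics.variance(sample))  # divides by n - 1
    s2_biased.append(statistics.pvariance(sample))   # divides by n

# The mean of each estimator's sampling distribution, estimated by simulation.
avg_mean = statistics.fmean(means)
avg_s2 = statistics.fmean(s2_unbiased)
avg_s2_biased = statistics.fmean(s2_biased)

print(f"average of sample means:        {avg_mean:.3f}  (mu = {MU})")
print(f"average of s^2 (n - 1 divisor): {avg_s2:.3f}  (sigma^2 = {SIGMA**2})")
print(f"average of variance (n divisor): {avg_s2_biased:.3f}")
```

The first two averages should sit close to μ and σ², as expected of unbiased estimators. The n-divisor version averages to roughly (n − 1)/n times σ², so it underestimates the population variance.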
We have a population of x values whose histogram is the
probability distribution of x.
Select a sample of size n from this population and calculate a
sample statistic, e.g. x̄.
This procedure can be repeated indefinitely and generates a
population of values for the sample statistic. The histogram of
these values is the sampling distribution of the sample statistic.
SAMPLING DISTRIBUTION OF THE MEAN
(THE CENTRAL LIMIT THEOREM)
Frequently we are interested in μ and estimate it using x̄, so we
need to know about the sampling distribution of x̄.
Theory says that for random samples of size n from any
population
$\mu_{\bar{x}} = E(\bar{x}) = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (Std. Error of the Mean)
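The standard error formula, σ divided by the square root of n, can be compared against simulation. This is a rough check under assumed values: a normal population with μ = 100 and σ = 15, and three arbitrary sample sizes. None of these numbers come from the notes.

```python
import math
import random
import statistics

random.seed(3)

MU, SIGMA = 100.0, 15.0  # assumed population mean and standard deviation
REPS = 5_000             # repeated samples per sample size


def simulated_standard_error(n):
    """Standard deviation of REPS sample means, each from a sample of size n."""
    means = [
        statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(REPS)
    ]
    return statistics.stdev(means)


for n in (4, 16, 64):
    theory = SIGMA / math.sqrt(n)
    simulated = simulated_standard_error(n)
    print(f"n = {n:2d}: sigma/sqrt(n) = {theory:.3f}, simulated = {simulated:.3f}")
```

Quadrupling n halves the standard error, so the sample mean becomes a more precise estimate of μ only at the rate of the square root of the sample size.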
Central Limit Theorem
If n is sufficiently large (n ≥ 30 is the usual rule of thumb), the
sampling distribution of x̄ will be approximately normal, even when
the population itself is not normal.
Example of the Central Limit Theorem in Practice:
Roll 30 dice and calculate the average (sample mean) of the
numbers that you get on each die. Now repeat this experiment
1000 times, each time rolling 30 dice and computing a new
sample mean. Plot a histogram of the 1000 sample means that
you have obtained. This plot will look approximately normal.
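The dice experiment is easy to run in code. The sketch below follows the description above (30 dice, 1000 repetitions). The text histogram, the bin width of 0.1 and the seed are choices made for this illustration.

```python
import random
import statistics
from collections import Counter

random.seed(4)

DICE_PER_EXPERIMENT = 30
EXPERIMENTS = 1000


def mean_of_dice(k):
    """Roll k fair six-sided dice and return the average face value."""
    return statistics.fmean([random.randint(1, 6) for _ in range(k)])


sample_means = [mean_of_dice(DICE_PER_EXPERIMENT) for _ in range(EXPERIMENTS)]

# Crude text histogram: round each mean to the nearest 0.1 and count.
bins = Counter(round(m, 1) for m in sample_means)
for value in sorted(bins):
    print(f"{value:.1f} {'#' * (bins[value] // 2)}")  # one '#' per 2 means

print(f"mean of sample means: {statistics.fmean(sample_means):.3f}")
print(f"std dev of sample means: {statistics.stdev(sample_means):.3f}")
```

A single fair die has mean 3.5 and standard deviation √(35/12) ≈ 1.71, so the sample means should centre near 3.5 with a spread near 1.71/√30 ≈ 0.31. The histogram should be roughly bell-shaped even though the outcome of one die is uniform.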
Example
A manufacturer claims the life of a battery type has a mean of
54 months and a std dev of 6 months. A consumer group
purchases a sample of 50 of these batteries and tests them.
They find an average life of 52 months. What should they
conclude?
If the manufacturer’s claim is true, $\mu_{\bar{x}} = 54$ and $\sigma_{\bar{x}} = 6/\sqrt{50} \approx 0.85$.
n > 30, so use the central limit theorem.
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
$P(\bar{x} \le 52) = P\left(z \le \frac{52 - 54}{0.85}\right) = P(z \le -2.35) = .0094$
If the manufacturer’s claim is true then what the consumer
group observed is very unlikely - a more plausible explanation
is that the true value of μ is less than 54.
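The battery calculation can be reproduced without normal tables. The sketch below uses the figures from the example. The helper `normal_cdf` is a name introduced here, built on the standard library's `math.erf`.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu = 54.0     # claimed mean life (months)
sigma = 6.0   # claimed standard deviation (months)
n = 50        # batteries tested
x_bar = 52.0  # observed average life (months)

standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)
z = (x_bar - mu) / standard_error      # standardised sample mean
p = normal_cdf(z)                      # P(x_bar <= 52) if the claim is true

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.3f}")
print(f"P(x_bar <= 52) = {p:.4f}")
```

Without rounding the standard error to 0.85, z comes out at about −2.36 and the probability at about 0.0092, slightly below the table value of .0094. The conclusion is unchanged: an average of 52 months or less would be very unlikely if the claim were true.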