Sampling Distributions
Chapter 7
The Concept of a
Sampling Distribution



Repeated samples of the same size are
selected from the same population.
The same sample statistic is calculated
from the data in EACH sample.
The distribution of the sample statistics is
the SAMPLING DISTRIBUTION of that
sample statistic.
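The three steps above can be sketched as a small simulation. This is only an illustration and is not part of the original slides: the population (10,000 roughly normal values with mean 50 and standard deviation 10), the sample size of 10, the 2,000 repeats, and the helper name sample_means are all assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values, roughly normal, mean 50, SD 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def sample_means(population, n, repeats, rng):
    """Draw `repeats` samples of size n and return the mean of each sample."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]

# The same statistic (the mean) is calculated from EACH sample.
means = sample_means(population, n=10, repeats=2_000, rng=rng)

print("population mean:     ", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means), 2))
```

A dot plot or histogram of the list means would be the sampling distribution of the sample mean for this made-up population.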
The Sampling Process
[Diagram: a SAMPLE x1, x2, ..., xn is drawn from a POPULATION with mean μ; the sample gives x̄ and s.]
The Sampling Distribution
[Diagram: Repeated Sampling from a POPULATION x1, x2, x3, ..., xN with mean μ produces the Sampling Distribution.]
What is Standard Error?
Standard Error has been identified as a
quantity that is not understood.
Is it a Standard Deviation?
Standard Error of what?
What does it tell us?
The purpose of this presentation is to
make the concept of Standard Error
clearer and more understandable.
The Sampling Process
[Figure: histogram of the POPULATION (x from 20 to 80, Count up to 600); population Mean = 50.]
SAMPLE: 30, 42, 48, 49, 61, 54, 41, 38, 59, 57
Calculate Mean = 47.9
This sample mean is an ESTIMATE of the population mean.
We should not be surprised that the estimate does not equal the true mean for the population!
The Sampling Process
[Figure: histogram of the POPULATION (x from 20 to 80, Count up to 600; Mean = 50) beside a dot plot titled "Means of the Separate Samples" (mean axis from 40 to 60).]
SAMPLE: 30, 42, 48, 49, 61, 54, 41, 38, 59, 57
Calculate Mean = 47.9
Plot the Sample Mean
The Sampling Distribution
[Figure: histogram of the POPULATION (x from 20 to 80, Count up to 600) beside a dot plot titled "Means of the Separate Samples" (mean axis from 40 to 60).]
Repeated Sampling
Calculate means for each sample: m1, m2, …
Plot All Sample Means
Sampling Distribution of the Sample Means
What about this sampling distribution?
[Figure: dot plot titled "Means of the Separate Samples" (mean axis from 40 to 60).]
Each dot represents a mean from one of the samples.
Each sample mean is an ESTIMATE of the population mean.
Notice that the center of this graph is around 50 and the spread ranges from 45 to 55.
What about this sampling distribution?
[Figure: dot plot titled "Means of the Separate Samples" (mean axis from 40 to 60).]
The mean of the sampling distribution (that is, the mean of the sample means) is the MEAN of the population!
AND…
We call the standard deviation of the distribution of sample means the STANDARD ERROR OF THE ESTIMATE OF THE POPULATION MEAN.
In Summary


STANDARD DEVIATION is a measure of
the spread of data in a population or in a
sample.
STANDARD ERROR is a measure of the
spread of the ESTIMATES of a measure of
a population calculated from repeated
sampling.
In short…
STANDARD DEVIATION: Variation in DATA
STANDARD ERROR: Variation in ESTIMATES FROM SAMPLES
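The contrast between the two quantities can be made concrete with a short simulation. The sketch below is an illustration under assumed values (a made-up population with mean 50 and standard deviation 10, samples of size 10, 2,000 repeats); none of these numbers come from the slides.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values, roughly normal, mean 50, SD 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]

n = 10
means = [statistics.mean(rng.sample(population, n)) for _ in range(2_000)]

sd_of_data = statistics.pstdev(population)  # variation in DATA
se_of_mean = statistics.stdev(means)        # variation in ESTIMATES from samples

print(f"standard deviation of the population values: {sd_of_data:.2f}")
print(f"standard error (SD of the sample means):     {se_of_mean:.2f}")
```

The sample means are far less spread out than the individual data values, which is the distinction the summary draws.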
Point Estimators


When inferences are made from the
sample to the population, the sample
mean is viewed as an estimator of the
mean of the population from which the
sample was selected.
Similarly, the proportion of successes in a
sample is an estimator of the proportion of
successes in the population.
Properties of Point Estimators


The summary statistic should be UNBIASED;
that is, the mean of the sampling distribution is
equal to the value you would get if you
computed the summary statistic using the entire
population. More formally, an estimator is
unbiased if its expected value equals the
parameter being estimated.
The summary statistic should have as little
variability as possible (be more precise than
other estimates) and should have a standard
error that decreases as the sample size
increases.
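Both properties can be checked by simulation for the sample mean. The sketch below is an illustration only: the uniform population on 0 to 100, the sample sizes 5, 20 and 80, and the 1,000 repeats per size are assumptions, not values from the slides.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical population: 20,000 values spread uniformly between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(20_000)]
mu = statistics.mean(population)  # the parameter being estimated

results = {}
for n in (5, 20, 80):
    means = [statistics.mean(rng.sample(population, n)) for _ in range(1_000)]
    # Store (average of the estimates, spread of the estimates) for this n.
    results[n] = (statistics.mean(means), statistics.stdev(means))
    print(f"n={n:3d}  average estimate - mu = {results[n][0] - mu:+.2f}  "
          f"standard error = {results[n][1]:.2f}")
```

In a run like this the average estimate stays close to the population mean for every n (unbiased), while the standard error shrinks as n grows.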
                      Population (Parameter)    Sample (Statistic)
Mean                  µ                         x̄
Standard Deviation    σ                         s
Size                  N                         n

Sampling Distribution of x̄: mean µx̄, standard deviation σx̄
Properties of the Sampling
Distribution of the Sample Mean


If a random sample of size n is selected from a
population with mean µ and standard deviation σ, then
The mean, µx̄, of the sampling distribution of x̄ equals
the mean of the population, µ:
µx̄ = µ
The standard deviation, σx̄, of the sampling
distribution of x̄, sometimes called the standard error
of the mean, equals the standard deviation of the
population, σ, divided by the square root of the sample
size n:
σx̄ = σ/√n
*Use this formula only when the population is more than 10 times the sample size (N > 10n).
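The formula and its condition can be written as a small helper. This is a minimal sketch: the function name standard_error_of_mean, the choice to raise an error when the condition fails, and the example numbers (σ = 10, n = 25, N = 5,000) are all assumptions made for illustration.

```python
import math

def standard_error_of_mean(sigma, n, population_size=None):
    """Return sigma / sqrt(n), the standard error of the sample mean.

    If a population size N is supplied, enforce the N > 10n guideline.
    """
    if population_size is not None and population_size <= 10 * n:
        raise ValueError("guideline N > 10n is not met for this sample size")
    return sigma / math.sqrt(n)

# Hypothetical numbers: sigma = 10, n = 25, N = 5,000.
se = standard_error_of_mean(sigma=10, n=25, population_size=5_000)
print(se)  # 2.0
```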
The shape of the sampling distribution will be
approximately normal if the population is approximately
normal; for other populations, the sampling distribution
becomes more normal as n increases
This property is called the CENTRAL LIMIT THEOREM
(CLT)
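The effect of sample size on shape can be seen in a simulation with a clearly non-normal population. The sketch below is an illustration under assumptions: the population is exponential (strongly right-skewed), skewness is used as a rough measure of departure from a bell shape, and the sample sizes 2 and 50 are arbitrary choices.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable

def skewness(values):
    """Skewness: about 0 for a symmetric, bell-shaped set of values."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def means_of_exponential_samples(n, repeats=5_000):
    """Means of `repeats` samples of size n from a right-skewed population."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(repeats)
    ]

skew_small_n = skewness(means_of_exponential_samples(2))
skew_large_n = skewness(means_of_exponential_samples(50))
print("skewness of sample means, n = 2: ", round(skew_small_n, 2))
print("skewness of sample means, n = 50:", round(skew_large_n, 2))
```

With the larger sample size the skewness of the sample means moves much closer to 0, which is consistent with the sampling distribution becoming more normal as n increases.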
Reasonably Likely Averages
Mean ± 1.96(SE)

1.96 is the z-score and comes from the
cut off point of the middle 95% of a
normal distribution
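The origin of 1.96 and the resulting range can be checked with Python's standard library. This is a small sketch: the helper name reasonably_likely_range and the example numbers (mean 50, σ = 10, n = 25) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Cut-off that leaves 2.5% in each tail, so 95% lies in the middle.
z = NormalDist().inv_cdf(0.975)

def reasonably_likely_range(mean, se):
    """Return (low, high) = mean ± z * SE for the middle 95%."""
    return mean - z * se, mean + z * se

# Hypothetical numbers: population mean 50, sigma 10, sample size 25, so SE = 2.
low, high = reasonably_likely_range(50, 10 / sqrt(25))
print(round(z, 2), round(low, 2), round(high, 2))  # 1.96 46.08 53.92
```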
If the Sampling Distribution is
known…
Probability questions about sample statistics
can be answered. For example,
A simple random sample of 50 is selected
from a normal population with a mean of 50
and a standard deviation of 10. What is the
probability that the sample mean will be
greater than 53?
The Answer…

Population: Normal(µ, σ)  →  Sampling distribution of x̄: Normal(µ, σ/√n)

Population: Normal(50, 10)  →  Sampling distribution of x̄: Normal(50, 10/√50)

z = (x̄ − µ) / (σ/√n) = (53 − 50) / (10/√50) ≈ 2.12,  so p ≈ .017
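The same calculation can be reproduced numerically. The sketch below uses only the values given in the example (µ = 50, σ = 10, n = 50, cut-off 53); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 10, 50            # values given in the example
se = sigma / sqrt(n)                 # standard error of the sample mean
z = (53 - mu) / se                   # z-score of a sample mean of 53
p = 1 - NormalDist(mu, se).cdf(53)   # P(sample mean > 53)

print(round(z, 2), round(p, 3))      # 2.12 0.017
```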
Properties of the Sampling
Distribution of the sum of a Sample


If a random sample of size n is selected from a
distribution with mean µ and standard deviation σ, then
The mean of the sampling distribution of the sum is
µsum = nµ

The standard error of the sampling distribution of the
sum is
σsum=√n · σ

CLT applies
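The two formulas for the sum can be checked against a simulation. This is a sketch under assumed values: µ = 4, σ = 2 and n = 25 are made-up numbers, and the draws are taken from a normal distribution purely for convenience.

```python
import math
import random
import statistics

def sum_distribution(mu, sigma, n):
    """Mean and standard error of the sum of n independent draws."""
    return n * mu, math.sqrt(n) * sigma

# Hypothetical numbers: mu = 4, sigma = 2, n = 25.
mean_sum, se_sum = sum_distribution(4, 2, 25)
print(mean_sum, se_sum)  # 100 10.0

# Simulation check with a fixed seed: sums of 25 draws from Normal(4, 2).
rng = random.Random(4)
sums = [sum(rng.gauss(4, 2) for _ in range(25)) for _ in range(4_000)]
print(round(statistics.mean(sums), 1), round(statistics.stdev(sums), 1))
```

The simulated mean and standard deviation of the sums should land close to the values given by the formulas.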
Sampling Distribution of the
Sample Proportion

We will now move from studying the
behavior of the sample mean to studying
the behavior of the sample proportions
(the proportion of “successes” in the
sample)
Properties of the Sampling Distribution of
the Number of Successes




If a random sample of size n is selected from a
population with proportion of successes, p, then the
sampling distribution of the number of successes X
Has mean µx = np
Has standard error σx = √(np(1-p))
Will be approximately normal as long as n is large
enough
As a guideline, both np and n(1-p) should be at least 10:
np ≥ 10 and n(1-p) ≥ 10
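These three properties fit naturally into one helper. The sketch below is illustrative: the function name successes_distribution and the two example inputs are assumptions, not values from the slides.

```python
import math

def successes_distribution(n, p):
    """Mean, standard error, and normal-approximation check for the count X."""
    mean = n * p
    se = math.sqrt(n * p * (1 - p))
    normal_ok = n * p >= 10 and n * (1 - p) >= 10  # the np and n(1-p) guideline
    return mean, se, normal_ok

# Hypothetical inputs: one case that meets the guideline, one that does not.
print(successes_distribution(100, 0.3))
print(successes_distribution(20, 0.1))
```

For n = 100 and p = 0.3 the guideline is met (np = 30, n(1-p) = 70); for n = 20 and p = 0.1 it is not (np = 2), so a normal approximation would not be recommended there.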
Example

The use of seat belts continues to rise in
the United States, with overall seat belt
usage of 82%. Mississippi lags behind the
rest of the nation—only about 60% wear
seat belts. Suppose you take a random
sample of 40 Mississippians. How many
do you expect will wear seat belts? What
is the probability that 30 or more of the
people in your sample wear seat belts?
Solution
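One possible way to work this example is sketched below, using the mean np, the standard error √(np(1-p)) and the normal approximation. It is an illustration rather than the original solution: no continuity correction is applied, and the exact binomial sum is included only as a cross-check.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 40, 0.60                  # values given in the example
expected = n * p                 # expected number wearing seat belts
se = sqrt(n * p * (1 - p))       # standard error of the count
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# Normal approximation to P(X >= 30), without a continuity correction.
p_normal = 1 - NormalDist(expected, se).cdf(30)

# Exact binomial probability, as a cross-check on the approximation.
p_exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(30, n + 1))

print(round(expected, 1), round(se, 2), normal_ok)
print(round(p_normal, 3), round(p_exact, 3))
```

Under these assumptions the expected count is 24, the standard error is about 3.10, and z = (30 - 24)/3.10 ≈ 1.94, so the normal approximation gives a probability of roughly 0.026. The exact binomial value printed alongside it will differ somewhat because the count is discrete.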
Sampling Distributions of the
Sample Proportion


True proportion of successes is
represented by “p”
Sample proportion of successes is
represented by “p hat”
p hat = (# of successes)/(sample size)
Sampling Distribution of p-hat


How does p-hat behave? To study the
behavior, imagine taking many random
samples of size n, and computing a p-hat
for each of the samples.
Then we plot this set of p-hats with a
histogram.
Properties of p-hat



When sample sizes are fairly large, the shape of
the p-hat distribution will be approximately normal.
The mean of the distribution is the value of the
population parameter p.
The standard deviation of this distribution is the
square root of p(1-p)/n:
sd(p-hat) = √(p(1-p)/n)
As a guideline use np ≥ 10 and n(1-p) ≥ 10
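The mean and standard deviation of p-hat can be checked by simulating many samples. This is a sketch under assumed values: p = 0.3, n = 100 and 3,000 repeats are made-up numbers chosen so that the guideline is met.

```python
import math
import random
import statistics

rng = random.Random(5)  # fixed seed so the run is repeatable
p, n = 0.3, 100         # hypothetical true proportion and sample size

# Each p-hat is (# of successes) / (sample size) for one simulated sample.
p_hats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(3_000)]

sd_formula = math.sqrt(p * (1 - p) / n)
print("mean of p-hats:", round(statistics.mean(p_hats), 3), " p:", p)
print("sd of p-hats:  ", round(statistics.stdev(p_hats), 4),
      " formula:", round(sd_formula, 4))
```

A histogram of the list p_hats would show the roughly normal shape described above, centered near p.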
Example

About 60% of Mississippians use seat
belts. Suppose your class conducts a
survey of 40 randomly selected
Mississippians.
A. What is the chance that 75% or more of
those selected wear seat belts?
B. Would it be quite unusual to find that
fewer than 25% of Mississippians
selected wear seat belts?
Solution
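One possible way to work parts A and B is sketched below, treating p-hat as approximately Normal(p, √(p(1-p)/n)). It is an illustration rather than the original solution. The guideline is met here, since np = 24 and n(1-p) = 16.

```python
from math import sqrt
from statistics import NormalDist

n, p = 40, 0.60                      # values given in the example
sd_p_hat = sqrt(p * (1 - p) / n)     # standard deviation of p-hat
dist = NormalDist(p, sd_p_hat)       # approximate sampling distribution

p_a = 1 - dist.cdf(0.75)             # A: P(p-hat >= 0.75)
z_b = (0.25 - p) / sd_p_hat          # B: z-score of a sample proportion of 0.25
p_b = dist.cdf(0.25)                 # B: P(p-hat < 0.25)

print(round(sd_p_hat, 4), round(p_a, 3))
print(round(z_b, 2), p_b)
```

Under this approximation the standard deviation of p-hat is about 0.077, so the chance in part A is roughly 0.026. For part B, 25% lies about 4.5 standard deviations below 60%, so such a result would be very unusual.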