Sampling Distribution
of a Sample Proportion
Lecture 26
Sections 8.1 – 8.2
Mon, Nov 1, 2004
Parameters and Statistics

The purpose of a statistic is to estimate a
population parameter.
A sample mean is used to estimate the population
mean.
 A sample proportion is used to estimate the
population proportion.

Example

Example 8.1, p. 464.
The Census Bureau surveys 3000 employees and
asks them, “Have the job skills demanded by your
job increased over the past few years?”
 57% replied, “Yes.”
 That is a sample proportion.
 What is the population proportion?

Some Questions





What if the survey were repeated?
Would the survey results again be 57%?
Would the sample proportion be close to 57%?
Might it be 99%?
Might it be 1%?
Some Questions



We hope that the sample proportion is close to
the population proportion.
How close can we expect it to be?
Would it be worth it to collect a larger sample?
If the sample were larger, would we expect the
sample proportion (probably) to be closer to the
population proportion?
 How much closer?

The Sampling Distribution of a
Statistic

Sampling Distribution of a Statistic – The
distribution of the values of the statistic over all
possible samples of size n from a given population.
The Sample Proportion





Let p be the population proportion.
Then p is a fixed value (for a given population).
Let p^ (“p-hat”) be the sample proportion.
Then p^ is a random variable; it takes on a new
value every time a sample is collected.
The sampling distribution of p^ is the
probability distribution of all the possible values
of p^.
Example



Suppose that this class is 1/3 freshmen.
Suppose that we take a sample of 2 students,
selected with replacement.
Find the sampling distribution of p^.
Example
Tree diagram: on each draw, F has probability 1/3 and N has probability 2/3.
P(FF) = 1/9
P(FN) = 2/9
P(NF) = 2/9
P(NN) = 4/9
Example


Let X be the number of freshmen in the sample.
The probability distribution of X is
x     P(X = x)
0     4/9
1     4/9
2     1/9
Example


Let p^ be the proportion of freshmen in the
sample.
The sampling distribution of p^ is
x     P(p^ = x)
0     4/9
1/2   4/9
1     1/9
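The table above can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original lecture; the helper name `sampling_distribution` is made up for illustration. It lists every F/N sequence of length n, multiplies the branch probabilities as in the tree diagram, and groups the results by the value of p^.

```python
from fractions import Fraction
from itertools import product

def sampling_distribution(p, n):
    """Exact distribution of p^ for n draws with replacement (hypothetical helper)."""
    dist = {}
    for outcome in product("FN", repeat=n):
        # Multiply the branch probabilities along this path of the tree.
        prob = Fraction(1)
        for draw in outcome:
            prob *= p if draw == "F" else 1 - p
        # Group paths by the sample proportion of F's.
        p_hat = Fraction(outcome.count("F"), n)
        dist[p_hat] = dist.get(p_hat, Fraction(0)) + prob
    return dist

dist = sampling_distribution(Fraction(1, 3), 2)
for p_hat in sorted(dist):
    print(p_hat, dist[p_hat])
```

Exact fractions are used so the output can be compared directly with 4/9, 4/9, and 1/9. Full enumeration is only practical for small n, since the number of sequences doubles with each extra draw.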
Simulating Sampling with the TI-83


Use the TI-83 to simulate sampling 2 people
(with replacement) from a population in which
1/3 are freshmen.
Use the function randBin(n, p).
n = sample size (n = 2).
 p = proportion of freshmen (p = 1/3).


The function will report the number of
freshmen in the sample.
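For readers without a TI-83, the same idea can be sketched in Python. This is an illustrative stand-in, not the calculator's implementation: the name `rand_bin` and the seed are assumptions, and a Python seed will not reproduce the TI-83's random numbers.

```python
import random

def rand_bin(n, p, rng):
    """Number of successes in n independent trials, each with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # arbitrary seed; does not match any TI-83 seed
x = rand_bin(2, 1 / 3, rng)
print(x, "freshmen in a sample of 2")
```

Each trial compares a uniform random number with p, which is one simple way to draw a binomial count.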
Example






Now do it for a sample of size n = 30.
Use a seed of 63.
We find that randBin(30, 1/3) = 9.
This represents a sample proportion of 9 out of
30, or 9/30 = 0.30.
If we press ENTER several more times, we get
11, 9, 14, 6, and 16.
These represent sample proportions of 11/30,
9/30, 14/30, 6/30, and 16/30.
Example


The expression
randBin(n, p, k)
will compute randBin(n, p) k times and put the
results in a list.
With a seed of 94, randBin(30, 1/3, 100)
produces the list
{11, 14, 8, 10, 10, 5, 13, 9, 9, …}.
Example

If we divide each value by 30, we get the sample
proportions
{11/30, 14/30, 8/30, 10/30, 10/30, …}.
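The step of repeating the simulation k times, converting counts to proportions, and tallying them can be sketched in Python as follows. The helper name `rand_bin_list` and the seed are assumptions for illustration; the simulated values will differ from the TI-83 list above.

```python
import random
from collections import Counter

def rand_bin_list(n, p, k, rng):
    """k simulated counts, each the number of successes in n trials (hypothetical helper)."""
    return [sum(rng.random() < p for _ in range(n)) for _ in range(k)]

rng = random.Random(0)  # arbitrary seed, unrelated to the TI-83 seed 94
counts = rand_bin_list(30, 1 / 3, 100, rng)
proportions = [x / 30 for x in counts]

# With n = 30, a bin of width 0.1 in p^ covers 3 consecutive counts.
bins = Counter(x // 3 for x in counts)
for b in sorted(bins):
    print(f"{b / 10:.1f}-{(b + 1) / 10:.1f} | {'*' * bins[b]}")
```

The printed rows form a rough text histogram of the 100 sample proportions.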
The Histogram
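[Figure: histogram of the sample proportions for n = 30. Horizontal axis: p^, ticks from 0.1 to 0.6. Vertical axis: ticks at 5, 10, 15.]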
Larger Sample Size
Now we will select samples of size 120 instead
of size 30.
 Set the seed to 216.
 randBin(120, 1/3, 100) produces
{44, 33, 43, 41, 38, 44, 46, 43, …}
 The sample proportions are
{44/120, 33/120, 43/120, 41/120, 38/120, …}

The Histogram
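[Figure: histogram of the sample proportions for n = 120. Horizontal axis: p^, ticks from 0.1 to 0.6. Vertical axis: ticks at 5, 10, 15, 20, 25.]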
Observations and Conclusions


Observation #1: The values of p^ are clustered
around p.
Conclusion #1: p^ is probably close to p.
Observations and Conclusions



Observation #2: As the sample size increases,
the clustering is tighter.
Conclusion #2a: Larger samples give more
reliable estimates.
Conclusion #2b: For large sample sizes, we can
make very good estimates of the value of p.
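Observation #2 can be checked numerically by comparing the spread of simulated p^ values for n = 30 and n = 120. The sketch below is a Python illustration under stated assumptions: the helper name, the seed, and the choice of 1000 repetitions (rather than the 100 used in the lecture) are all made up for this example.

```python
import random
import statistics

def simulate_proportions(n, p, k, rng):
    """k simulated values of p^, each from a sample of size n (hypothetical helper)."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(k)]

rng = random.Random(0)  # arbitrary seed
p = 1 / 3
spread = {}
for n in (30, 120):
    props = simulate_proportions(n, p, 1000, rng)
    spread[n] = statistics.stdev(props)
    print(f"n = {n}: mean of p^ = {statistics.mean(props):.3f}, "
          f"sd of p^ = {spread[n]:.3f}")
```

Both means should sit near 1/3, while the standard deviation for n = 120 should be noticeably smaller than for n = 30.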
More Observations and Conclusions

Observation #3: The distribution of p^ appears
to be approximately normal.
The Histogram
[Figure: histogram of the sample proportions. Horizontal axis: p^, ticks from 0.1 to 0.6. Vertical axis: ticks at 5, 10, 15.]
One More Conclusion


Conclusion #3: We can use the normal
distribution to calculate just how close to p we
can expect p^ to be.
However, we must know $\mu$ and $\sigma$ for the
distribution of p^.
The Sampling Distribution of p^

It turns out that the sampling distribution of p^
is approximately normal with the following
parameters.
Mean of p^: $\mu_{\hat{p}} = p$
Variance of p^: $\sigma_{\hat{p}}^2 = \dfrac{p(1 - p)}{n}$
Standard deviation of p^: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$
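As a quick worked example, the formulas can be evaluated for the two sample sizes used in the simulations, with p = 1/3. This is a small Python sketch; the function name `p_hat_parameters` is made up for illustration.

```python
from math import sqrt

def p_hat_parameters(p, n):
    """Mean, variance, and standard deviation of p^ for population proportion p."""
    mean = p
    variance = p * (1 - p) / n
    return mean, variance, sqrt(variance)

for n in (30, 120):
    mean, var, sd = p_hat_parameters(1 / 3, n)
    print(f"n = {n}: mean = {mean:.4f}, variance = {var:.6f}, sd = {sd:.4f}")
```

Because n sits under a square root, quadrupling the sample size from 30 to 120 halves the standard deviation (about 0.086 versus about 0.043).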
The Sampling Distribution of p^

The approximation to the normal distribution is
excellent if
$np \ge 5$ and $n(1 - p) \ge 5$.
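The rule of thumb is easy to apply to the sample sizes from the earlier examples. A minimal Python sketch, with the function name `normal_approx_ok` chosen only for illustration:

```python
def normal_approx_ok(n, p, threshold=5):
    """True when both np and n(1 - p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

for n in (2, 30, 120):
    print(n, normal_approx_ok(n, 1 / 3))
```

With p = 1/3, the sample of size 2 fails the check (np is below 1), while n = 30 and n = 120 both pass.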
Example


Suppose 51% of the population plan to vote for
candidate X, i.e., p = 0.51.
What is the probability that an exit survey of
1000 people would show candidate X with less
than 45% support, i.e., p^ < 0.45?
Example


First, describe the sampling distribution of p^ if
the sample size is n = 1000.
p^ is approximately normal.



Check: $np = 510 \ge 5$ and $n(1 - p) = 490 \ge 5$.
$\mu_{\hat{p}} = 0.51$.
$\sigma_{\hat{p}} = \sqrt{(0.51)(0.49)/1000} = 0.01581$.
Example




The z-score of 0.45 is z = (0.45 – 0.51)/.01581
= -3.795.
P(p^ < 0.45) = P(Z < -3.795)
= 0.00007385 (not likely!)
That is why surveys work (within the margin of
error) and that is why people are saying that the
exit polls failed yesterday.
We have computed the p-value of 0.45 under the
null hypothesis that p = 0.51!
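The calculation in this example can be reproduced without a normal table. The Python sketch below is an illustration, not the lecture's method (which presumably used the TI-83): it builds the standard normal CDF from `math.erfc`, and the helper name `normal_cdf` is an assumption.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal P(Z < z), computed from the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))

p, n, cutoff = 0.51, 1000, 0.45
sd = sqrt(p * (1 - p) / n)   # standard deviation of p^
z = (cutoff - p) / sd        # z-score of the cutoff
prob = normal_cdf(z)
print(f"sd = {sd:.5f}, z = {z:.3f}, P(p^ < {cutoff}) = {prob:.3g}")
```

The result should be on the order of 7 × 10^-5, in line with the value on the slide; small differences in the later digits come from rounding the standard deviation and the z-score.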
Let’s Do It!



Let’s do it! 8.5, p. 484 – Probabilities about the
Proportion of People with Type B Blood.
Let’s do it! 8.6, p. 485 – Estimating the
Proportion of Patients with Side Effects.
Let’s do it! 8.7, p. 487 – Testing hypotheses
about Smoking Habits.

See Example 8.5, p. 486 – Testing Hypotheses about
the Proportion of Cracked Bottles.