Transcript
```
Sampling Distributions of Proportions
• Remember the Skittles example. We calculated the proportion of orange Skittles and marked it on the dot plots on the board. What shape did the n = 40 dot plot have?
• The dot plot was a partial graph of the sampling distribution of all sample proportions for samples of size 40. If we found all the possible sample proportions, this distribution would be approximately normal!
Sampling Distribution
• Choose an SRS of size n from a large population with population proportion p having some characteristic of interest. Let p-hat be the proportion of the sample having that characteristic.
• We need to come up with some formulas for the mean and standard deviation of p-hat.
Suppose we have a population of six people:
Melissa, Jake, Charles, Kelly, Mike, &
Brian
What is the proportion of females? 1/3
What is the parameter of interest in
this population?
Proportion of females
Draw samples of two from this population.
How many different samples are possible?
6C2 = 15
Find the 15 different samples that are possible & find the sample proportion of females in each sample.

Sample               p-hat
Melissa & Jake       0.5
Melissa & Charles    0.5
Melissa & Kelly      1
Melissa & Mike       0.5
Melissa & Brian      0.5
Jake & Charles       0
Jake & Kelly         0.5
Jake & Mike          0
Jake & Brian         0
Charles & Kelly      0.5
Charles & Mike       0
Charles & Brian      0
Kelly & Mike         0.5
Kelly & Brian        0.5
Mike & Brian         0

Find the mean & standard deviation of all p-hats.
$\mu_{\hat{p}} = 1/3$ & $\sigma_{\hat{p}} = 0.29814$
How does the mean of the sampling distribution ($\mu_{\hat{p}}$) compare to the population parameter (p)?
$\mu_{\hat{p}} = p$
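A worked check of these two values (a sketch added for illustration, not from the original slides; it uses the population form of the standard deviation, dividing by 15). Of the 15 samples, one has p-hat = 1, eight have p-hat = 0.5, and six have p-hat = 0:
$\mu_{\hat{p}} = (1(1) + 8(0.5) + 6(0))/15 = 5/15 = 1/3$
$\sigma_{\hat{p}}^2 = (1(1)^2 + 8(0.5)^2 + 6(0)^2)/15 - (1/3)^2 = 1/5 - 1/9 = 4/45$
$\sigma_{\hat{p}} = \sqrt{4/45} \approx 0.29814$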
Formulas:
The sample proportion:
$\hat{p} = X/n$
The mean of the sampling distribution:
$\mu_{\hat{p}} = p$
The standard deviation of the sampling distribution:
$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$
Does the standard deviation of the sampling distribution equal the value given by the formula?
NO:
$\sqrt{(1/3)(2/3)/2} = 1/3 \ne 0.29814$
WHY?
We are sampling more than 10% of our population!
So, in order to use this formula for the standard deviation of the sampling distribution, we MUST be sure that our sample size is less than 10% of the population!
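One way to account for the gap (an added note, not from the original slides; N here is an added symbol for the population size): when sampling without replacement from a small population, the formula picks up a finite population correction factor, $\sqrt{(N - n)/(N - 1)}$. With N = 6, n = 2, and p = 1/3:
$\sqrt{p(1 - p)/n} \cdot \sqrt{(N - n)/(N - 1)} = (1/3)\sqrt{4/5} \approx 0.29814$
When the population is at least 10 times as large as the sample, this factor is at least about 0.95, so leaving it out changes the standard deviation very little.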
Assumptions (Rules of Thumb)
• Use this formula for standard
deviation when the population is
sufficiently large, at least 10 times as
large as the sample.
• Sample size must be large enough to ensure a normal approximation can be used. We can use the normal approximation when
np > 10 & n(1 - p) > 10
Why does the second assumption ensure an approximate normal distribution?
Remember back to binomial distributions.
Suppose n = 10 & p = 0.1 (probability of a success). A histogram of this distribution is strongly skewed right!
Now use n = 100 & p = 0.1 (now np = 10, right at the cutoff). While the histogram is still skewed right, look what happens to the tail!
np > 10 & n(1 - p) > 10 ensures that the sample size is large enough to have a normal approximation!
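A rough way to quantify the change (an added illustration, not from the original slides): the skewness of a binomial distribution is $(1 - 2p)/\sqrt{np(1 - p)}$.
n = 10, p = 0.1: $0.8/\sqrt{0.9} \approx 0.84$
n = 100, p = 0.1: $0.8/\sqrt{9} \approx 0.27$
The larger sample cuts the skewness to about a third, which is why the histogram looks much closer to a normal curve.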
Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank recently approved 200 loans.
What are the mean and standard deviation of the proportion of clients in this group who may not make payments on time?
$\mu_{\hat{p}} = .07$
$\sigma_{\hat{p}} = \sqrt{.07(.93)/200} = .01804$
Are assumptions met?
Yes: np = 200(.07) = 14 and n(1 - p) = 200(.93) = 186
What is the probability that over 10% of these clients will not make payments on time?
Ncdf(.10, 1E99, .07, .01804) = .0482
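To see where the calculator value comes from (a worked check added for illustration, with Z denoting a standard normal variable), standardize the 10% cutoff:
$z = (.10 - .07)/.01804 \approx 1.66$
$P(Z > 1.66) \approx .048$
This agrees with the Ncdf result of .0482.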
Example #1
A polling organization asks an SRS of 1500 first-year college
students whether they applied for admission to any other college.
In fact, 35% of all first-year students applied to colleges besides
the one they are attending. What is the probability that the
random sample of 1500 students will give a result within 2
percentage points of the true value?
STATE
PLAN
DO
CONCLUDE
STATE:
We want to know the probability that a random sample yields a result within 2 percentage points of the true proportion.
We want to determine
$P(.33 \le \hat{p} \le .37)$
PLAN:
We have drawn an SRS of size 1500 from the population of interest.
The mean of the sampling distribution of p-hat is 0.35:
$\mu_{\hat{p}} = 0.35$
We can assume that the population of first-year college students is over 15,000, so we are safe to use the standard deviation formula:
$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n} = \sqrt{(0.35)(0.65)/1500} = 0.0123$
In order to use a normal approximation for the sampling distribution, the expected number of successes and failures must be sufficiently large:
$np \ge 10$ and $n(1 - p) \ge 10$
$1500(.35) = 525 \ge 10$ and $1500(.65) = 975 \ge 10$
Therefore, $\hat{p}$ is approximately $N(0.35, 0.0123)$.
DO: Perform a normal distribution calculation to find the desired probability:
$P(.33 \le \hat{p} \le .37) = .8961$
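As a worked check (added for illustration, with Z denoting a standard normal variable), standardize both endpoints using the mean 0.35 and standard deviation 0.0123:
$z = (.37 - .35)/.0123 \approx 1.63$ and $z = (.33 - .35)/.0123 \approx -1.63$
$P(-1.63 \le Z \le 1.63) \approx .90$
This is consistent with the value .8961.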
CONCLUDE: About 90% of all SRSs of size 1500 will give a result within 2 percentage points of the true proportion.
Example #2
Suppose one student tossed a coin 200 times and found only 42% heads. Do you believe that this is likely to happen?
np = 200(.5) = 100 & n(1 - p) = 200(.5) = 100
Since both > 10, I can use a normal curve!
Find $\mu$ & $\sigma$ using the formulas.
$\mu_{\hat{p}} = .5$ & $\sigma_{\hat{p}} = \sqrt{.5(.5)/200} \approx .03536$
ncdf(-∞, .42, .5, .03536) = .0118
No. Since there is approximately a 1% chance of this happening, I do not believe the student did this.
Example #3
Assume that 30% of the students at
HH wear contacts. In a sample of 100
students, what is the probability that
more than 35% of them wear
contacts?
Check assumptions!
np = 100(.3) = 30 & n(1 - p) = 100(.7) = 70
$\mu_{\hat{p}} = .3$ & $\sigma_{\hat{p}} = .045826$
Ncdf(.35, 1E99, .3, .045826) = .1376