AP Statistics Section 10.1 C
Determining Necessary Sample Size
Consider the confidence interval $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$ for the
mean of a population where $\sigma$ is known. The
user chooses the confidence level and the
margin of error automatically follows from this
choice.

Ideally, we would like both high confidence
and a small margin of error. High
confidence says that our method almost
always gives correct answers. A small
margin of error says that we have pinned
down the parameter quite nicely.
An equivalent expression for the
margin of error is $\frac{z^* \sigma}{\sqrt{n}}$. Since
the expression has $z^*$ and $\sigma$ in
the numerator and $\sqrt{n}$ in the
denominator, the margin of
error gets smaller when:

$z^*$ gets smaller. This happens when
C gets smaller.
So there is a trade-off between the
confidence level and the margin of
error. To obtain a
smaller margin of error from the
same data, you must be willing to
accept lower confidence.
$\sigma$ gets smaller. Remember, $\sigma$ is
a fixed value in the population and
can’t be changed.
n gets larger. Now this is
something that we can control.
For example, in order to cut the
margin of error in half, we need to
take 4 times as many
observations.
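The three effects listed above can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `margin_of_error` and the values $\sigma = 5$, $n = 25$ and $n = 100$ are assumptions chosen only for illustration, and 1.645 and 1.96 are the usual critical values for 90% and 95% confidence.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a z interval with known sigma."""
    return z_star * sigma / math.sqrt(n)

# Illustrative values only (sigma = 5 and the sample sizes are assumptions).
base = margin_of_error(1.96, 5, 25)           # 95% confidence, n = 25
lower_conf = margin_of_error(1.645, 5, 25)    # 90% confidence, same n
four_times_n = margin_of_error(1.96, 5, 100)  # 95% confidence, n = 100

print(base, lower_conf, four_times_n)
```

Lowering the confidence level shrinks the margin, and quadrupling n cuts it exactly in half because n enters through its square root.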
A wise user of statistics never plans data
collection without planning the inference
at the same time. To determine the
sample size n that will yield a confidence
interval for a population mean with a
specified margin of error, m, set the
expression for the margin of error to be
less than or equal to m and solve for n.
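Written out, the "set and solve" step looks roughly like this. It is a sketch of the algebra using the same symbols as above ($z^*$, $\sigma$, n, m), and it assumes all four quantities are positive:

```latex
z^* \frac{\sigma}{\sqrt{n}} \le m
\quad\Longrightarrow\quad
\sqrt{n} \ge \frac{z^* \sigma}{m}
\quad\Longrightarrow\quad
n \ge \left( \frac{z^* \sigma}{m} \right)^2
```

Because n must be a whole number, the smallest acceptable sample size is the right-hand side rounded up.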
Example: Researchers would like to estimate the mean
cholesterol level $\mu$ of a particular variety of monkey that is
often used in laboratory experiments. They would like their
estimate to be within 1 mg/dl of blood of the true value
of $\mu$ at a 95% confidence level. A previous study involving this
variety of monkey suggests that the standard deviation
of cholesterol level is about $\sigma = 5$ mg/dl.
What is the
minimum number of monkeys needed to generate a
satisfactory estimate?
For 95% confidence, $z^* = 1.96$. Set the margin of error to be at most 1 and solve for n:
$z^* \frac{\sigma}{\sqrt{n}} \le 1$
$\frac{(1.96)(5)}{\sqrt{n}} \le 1$
$9.8 \le \sqrt{n}$
$96.04 \le n$
$n = 97$
Always round up to the next
whole number when finding n.
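The monkey calculation can be reproduced in a few lines of Python. This is a minimal sketch, assuming $\sigma = 5$ and a margin of 1 as in the worked calculation; the helper names `z_critical` and `required_sample_size` are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` rather than a table, so it is 1.95996... instead of the rounded 1.96.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Upper critical value z* for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest whole n with z* * sigma / sqrt(n) <= margin (always rounds up)."""
    z_star = z_critical(confidence)
    return math.ceil((z_star * sigma / margin) ** 2)

n = required_sample_size(sigma=5, margin=1, confidence=0.95)
print(n)  # 97
```

Using `math.ceil` rather than ordinary rounding matters: 96 monkeys would give a margin of error slightly larger than 1.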
It is the size of the sample that
determines the margin of error.
The size of the population does not
influence the sample size we need
- as long as the population is at
least 10 times as large as the
sample.
CAUTION! CAUTION!
The data must be an SRS from the
population.
Nonresponse and other practical
problems can frustrate choosing an
SRS.
The margin of error in a confidence
interval covers only random
sampling errors. The margin of
error indicates how much error can
be expected because of chance
variation in randomized data
production.
There is no correct method for
inference from data that were
collected haphazardly or with bias.
Different methods are needed for
different designs. The CI formula
isn’t correct for probability
samples more complex than
an SRS. There are correct methods
for other designs.
Outliers can distort results.
Outliers can strongly influence $\bar{x}$,
which can have a large effect on
the confidence interval.
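To see how a single outlier moves the interval, here is a small sketch with made-up data. The ten values, the assumed known $\sigma = 5$, and the helper name `z_interval` are all hypothetical and serve only as an illustration.

```python
import math
from statistics import mean

def z_interval(data, sigma, z_star=1.96):
    """z confidence interval for a mean when sigma is assumed known."""
    x_bar = mean(data)
    margin = z_star * sigma / math.sqrt(len(data))
    return (x_bar - margin, x_bar + margin)

# Hypothetical data: ten measurements centered at 100.
clean = [98, 101, 100, 99, 102, 100, 97, 103, 100, 100]
# Same data with the last value replaced by one extreme observation.
with_outlier = clean[:-1] + [160]

print(z_interval(clean, sigma=5))
print(z_interval(with_outlier, sigma=5))
```

The width of the interval is unchanged because it depends only on $z^*$, $\sigma$ and n, but the center moves from 100 to 106, so the second interval no longer contains most of the values that the first one did.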
The shape of the population
distribution matters. Examine your
data carefully for skewness and
other signs of non-Normality.
You must know the standard
deviation, $\sigma$, of the population.
Finally, you must understand what statistical
confidence does not say. Recall our confidence
interval of (107.8, 116.2) for the mean IQ score
for all BCU freshmen. We are 95% confident
that the mean IQ score for all BCU freshmen lies
between 107.8 and 116.2. That is, these
numbers were calculated by a method that gives
correct results in 95% of all possible samples.
We cannot say that the probability is 95%
that the true mean falls between 107.8 and
116.2. No randomness remains after we
draw one particular sample and get from it
one particular interval. The true mean
either is or is not between 107.8
and 116.2.
The probability calculations of
standard statistical inference
describe how often the
process gives correct answers.
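A short simulation can make "how often the process gives correct answers" concrete. This is a sketch under stated assumptions: the population is taken to be Normal with a true mean of 112 and $\sigma = 15$, samples have size 50, and the function name `coverage_rate` is hypothetical. None of these values are given in the slides.

```python
import math
import random

def coverage_rate(mu, sigma, n, reps, z_star=1.96, seed=0):
    """Fraction of simulated 95% z intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one SRS of size n from a Normal(mu, sigma) population.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

# Assumed population values, for illustration only.
rate = coverage_rate(mu=112, sigma=15, n=50, reps=10_000)
print(f"{rate:.3f}")
```

Each individual interval either captures 112 or misses it; the 95% describes the long-run fraction of intervals that capture it, which is what the printed rate approximates.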