SECTION 10.2
Estimating a Population Mean
What’s the difference between
what we did in Section 10.1 and
what we are beginning in
Section 10.2?
In reality, the standard deviation σ of the
population is unknown, so the procedures
from last section are not useful. However,
the understanding of the logic of the
procedures will continue to be of use.
To be more realistic, we estimate σ from the
sample data using the sample standard deviation s
Conditions for Inference about a
Population Mean
1. The data is an SRS from the population
2. Observations from the population have a
normal distribution with an unknown mean
(μ) and unknown standard deviation (σ)
3. Independence is assumed for the individual
observations when calculating a confidence
interval. When we are sampling without
replacement from a finite population, it is
sufficient to verify that the population is at
least 10 times the sample size.
CAUTION
Be sure to check that the conditions
for constructing a confidence
interval for the population mean are
satisfied before you perform any
calculations.
ROBUSTNESS
ROBUST: A procedure is robust if its confidence
level does not change very much when certain
assumptions are violated
Fortunately for us, the t-procedures are
robust in certain situations.
Therefore . . .
This is when we use the t-procedures:
It’s more important for the data to be
an SRS from the population than for the population to have a
normal distribution
If n is less than 15, the data must appear close to normal to use t-procedures
If n is at least 15, the t-procedures can be used
except if there are outliers or strong skewness
If n≥30, t-procedures can be used even in the
presence of strong skewness, but outliers must still
be examined
Essentially, as long as there are no significant
departures from Normality (especially outliers) then
the t procedures still work quite well.
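The sample-size guidelines above can be sketched as a small decision function. This is only an illustration: the function name, its arguments, and the returned strings are made up for this sketch, and judging "looks normal", "outliers", and "strong skewness" still requires looking at a plot of the data.

```python
def t_procedure_advice(n, looks_normal, has_outliers, strongly_skewed):
    """Apply the sample-size guidelines from the slide to one sample.

    All four arguments are judgments the analyst supplies after
    plotting the data; this function does not inspect any data itself.
    """
    if n < 15:
        # Small samples: the data must appear close to normal.
        if looks_normal and not has_outliers:
            return "use t"
        return "avoid t"
    if n < 30:
        # Moderate samples: fine unless outliers or strong skewness.
        if has_outliers or strongly_skewed:
            return "avoid t"
        return "use t"
    # Large samples: skewness is tolerated, outliers still need a look.
    if has_outliers:
        return "examine outliers first"
    return "use t"


print(t_procedure_advice(10, True, False, False))
print(t_procedure_advice(40, False, False, True))
```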
Standard Error
In this setting, the sample mean x̄ has a
sampling distribution that is a normal
distribution with a mean equal to the
population’s mean μ
Since we do not know σ, we will replace
the standard deviation formula σ/√n
with this formula: s/√n
This is called the standard error of
the sample mean x̄
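As a quick check of the standard error formula s/√n, here is a minimal Python sketch. The function name and the four data values are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(len(sample))


# Hypothetical data, for illustration only.
illustrative_sample = [2.0, 4.0, 4.0, 6.0]
print(round(standard_error(illustrative_sample), 4))
```

For these values s = √(8/3) ≈ 1.633 and n = 4, so the standard error is about 0.8165.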
Degrees of Freedom
Commonly listed as df
Equal to n-1
When a t-distribution has k degrees of
freedom, we will write this as t(k)
When the actual df does not appear in
Table C, use the greatest df available that
is less than your desired df
– This guarantees a wider confidence interval
than needed to justify a given confidence level
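The "round down" rule for table lookups can be sketched as follows. The list of df rows is an assumption about how a typical printed t table is laid out (1 through 30, then 40, 50, 60, 80, 100, 1000); check it against the actual Table C before relying on it.

```python
import bisect

# Assumed df rows of a typical printed t table (not copied from Table C).
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def table_df(n):
    """Return the greatest tabled df that does not exceed n - 1."""
    df = n - 1
    if df < 1:
        raise ValueError("need at least 2 observations")
    # bisect_right gives the position just past the last row <= df.
    return TABLE_DF[bisect.bisect_right(TABLE_DF, df) - 1]


print(table_df(10))  # df = 9 appears in the table
print(table_df(46))  # df = 45 is not listed, so drop down to 40
```

Rounding down picks a slightly larger t*, which is why the resulting interval is a little wider than strictly necessary.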
Density Curves for
t Distributions
Bell-shaped and symmetric
Greater spread than a normal curve
As degrees of freedom (or sample size)
increases, the t density curves appear
more like a normal curve
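The shape comparison above can be checked numerically. The sketch below writes out the standard t and standard normal density formulas using only Python's math module; the function names are illustrative. It is limited to moderate df because math.gamma overflows for very large arguments.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Compare the height at the center and in the tail as df grows.
for df in (2, 9, 30, 100):
    print(df, round(t_pdf(0, df), 4), round(t_pdf(3, df), 4))
print("normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```

The t curves are lower at the center and higher in the tails than the normal curve, and the gap shrinks as df increases.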
Confidence Intervals
x̄ ± t* · s/√n
– t* is the upper (1-C)/2 critical value for the t(n-1)
distribution
– We find t* using the table or our calculator
t*=invT(area to left of t*, df)
– We interpret these the same way we did in the last
chapter.
– This interval is exactly correct when the population
distribution is Normal and is approximately correct for
large n in other cases.
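A common slip with invT is entering the wrong area. Since t* cuts off an upper-tail area of (1-C)/2, the area to its left is 1 - (1-C)/2. A minimal sketch of that conversion (the function name is made up for illustration):

```python
def left_area_for_t_star(confidence_level):
    """Area to the left of t* for a given confidence level C."""
    upper_tail = (1 - confidence_level) / 2  # area to the right of t*
    return 1 - upper_tail


# For a 90% interval, the area to the left of t* is about 0.95.
print(round(left_area_for_t_star(0.90), 4))
```

So for a 90% interval with 9 degrees of freedom, the calculator entry would be invT(0.95, 9).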
INFERENCE TOOLBOX (p 631)
DO YOU REMEMBER WHAT THE STEPS ARE???
Steps for constructing a CONFIDENCE INTERVAL:
1—PARAMETER—Identify the population of interest
and the parameter you want to draw a conclusion
about.
2—CONDITIONS—Choose the appropriate inference
procedure. VERIFY conditions (SRS, Normality,
Independence) before using it.
3—CALCULATIONS—If the conditions are met, carry
out the inference procedure.
4—INTERPRETATION—Interpret your results in the
context of the problem. CONCLUSION,
CONNECTION, CONTEXT (meaning that our
conclusion about the parameter connects to our work
in part 3 and includes appropriate context)
Example: GOT MILK?
A milk processor monitors the number of bacteria per milliliter in raw milk
received for processing. A random sample of 10 one-milliliter specimens
from milk supplied by one producer give the following data:
5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870
Construct a 90% confidence interval.
--We want to estimate μ = the mean number of bacteria per
milliliter in all of the milk from this supplier
--Since we don’t know σ, we should construct a one-sample t
interval for μ.
– We must be confident that the data are an SRS from the producer’s milk.
We must learn how the sample was chosen to see if it can be regarded
as an SRS (we are only told that it is a “random sample”).
– A boxplot and a Normal probability plot of the data show no outliers and
no strong skewness. This gives us little reason to doubt the Normality of
the population from which this sample was drawn. In practice, we would
probably rely on the fact that past measurements of this type have been
roughly Normal.
– Since these measurements came from a random sample of specimens,
they should be independent (assuming that there were many, at least
100, one-milliliter specimens available at the milk processing facility).
Example: GOT MILK? Cont.
--Entering these data into a calculator gives
x̄ = 4950 and s = 268.45. So a 90% confidence
interval for the mean bacteria count per milliliter in this
producer’s milk is
df = 10-1 = 9, so t* = 1.833
x̄ ± t* · s/√n = 4950 ± 1.833 · 268.45/√10 = 4950 ± 155.6
(4794.4, 5105.6)
--We can say that we are 90% confident that the
actual mean number of bacteria per milliliter of milk
from this supplier is between 4794.4 and 5105.6
because we used a method that yields intervals such
that 90% of all these intervals will capture the true
mean desired.
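The calculation in this example can be reproduced with a short Python sketch. The data and the critical value t* = 1.833 come from the example itself; the variable names are illustrative, and t* is typed in by hand rather than computed.

```python
import math
import statistics

# Bacteria counts per milliliter from the example.
counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]

n = len(counts)
x_bar = statistics.mean(counts)
s = statistics.stdev(counts)  # sample standard deviation (divides by n - 1)
t_star = 1.833                # 90% critical value for t(9), taken from the example

margin = t_star * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar}, s = {s:.2f}, df = {n - 1}")
print(f"margin of error = {margin:.1f}")
print(f"90% CI: ({lower:.1f}, {upper:.1f})")
```

The printed interval agrees with the hand calculation, (4794.4, 5105.6).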
Paired t Procedures
Recall, matched pairs studies are a form of block
design in which just two treatments are being
compared
Also, experiments are rarely done on randomly
selected subjects. Random selection allows us to
generalize results to a larger population, but random
assignment of treatments to subjects allows us to
compare treatments.
Be careful to distinguish a matched pairs setting from
a two-sample setting.
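The real key is independence: in a matched pairs setting
the two observations in each pair are not independent of
each other, while in a two-sample setting the two samples
are independent.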
TREAT THE DIFFERENCES from a matched pairs
study as a single sample.
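Treating the differences as a single sample might look like this in Python. The before and after values are hypothetical, invented only to show the arithmetic, and the variable names are illustrative.

```python
import math
import statistics

# Hypothetical matched pairs: one "before" and one "after" value per subject.
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 17, 14]

# Reduce each pair to one number, then work with the differences only.
diffs = [a - b for b, a in zip(before, after)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
s_diff = statistics.stdev(diffs)
se_diff = s_diff / math.sqrt(n)

print(f"differences = {diffs}")
print(f"mean = {mean_diff}, s = {s_diff}, SE = {se_diff:.4f}, df = {n - 1}")
```

From here the one-sample t interval is built from the mean, standard error, and df of the differences, with n equal to the number of pairs, not the number of individual measurements.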
TECHNOLOGY
As always, you will be allowed unrestricted use of your
calculator on quizzes and tests (as well as the actual
AP Exam). For this reason, ALWAYS be certain to
write down the values of key numbers that are being
used (means, standard deviations, degrees of
freedom, significance levels, etc.) along with results of
the calculator procedures in order to receive full credit.
The calculator information is available in your book on
pages 661-662.
We are now using the T Interval instead of the Z
Interval
Plug in exactly what you are asked for