Chapter 7: Interval Estimation
DEFINITION:
A parameter is a numerical value that would be calculated using all of the values of the units in the
population.
A statistic is a numerical value that is calculated using all of the values of the units in a sample.
Tip: One way to remember the distinction: the letter p is for population and parameter,
while the letter s is for statistic and sample.
[Figure: a population of N = 16 units, with a sample of n = 4 units drawn from it.]
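The parameter/statistic distinction can be sketched in a few lines of Python. The 16 population values below are made up purely for illustration (they are not from the course); only the sizes N = 16 and n = 4 match the figure.

```python
import random
import statistics

random.seed(1)  # fixed seed so the same sample is drawn on every run

# Hypothetical population of N = 16 units (illustrative values only).
population = [12, 15, 9, 20, 17, 11, 14, 18, 10, 16, 13, 19, 8, 21, 15, 14]

# Draw a simple random sample of n = 4 units without replacement.
sample = random.sample(population, 4)

# Parameter: computed from ALL units in the population.
parameter = statistics.mean(population)

# Statistic: computed from the units in the sample only.
statistic = statistics.mean(sample)

print("population mean (parameter):", parameter)
print("sample mean (statistic):    ", statistic)
```

The parameter is a single fixed number, while the statistic changes from sample to sample; changing the seed gives a different sample mean but the same population mean.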
ESTIMATION:
• What is the mean number of hours of delay for Northwest Airlines flights to
Chicago?
• What is the mean weight of actresses living in Hollywood?
• What is the mean number of times per day a person in the U.S. uses a pain reliever?
Each of these questions is asking, “What is the value of the parameter?”
A confidence interval estimate for the population mean is an interval of values,
computed from the sample data, that we can be quite confident contains
the population mean.
The confidence level is the probability that the estimation method will give an
interval that contains the parameter (in this case μ). The confidence level is denoted
by (1 − α), where α has common values of 0.10, 0.05, and 0.01, for 90%, 95%, and
99% confidence levels respectively.
Let’s Think About It!
Recall: the CLT states that for sufficiently large
samples, X̄ ~ N(μ, σ/√n).
[Figure: sampling distribution of x̄, a normal curve N(μ, σ/√n) with the central 95% region marked.]
Use the Empirical Rule to answer the following questions.
a. 68% of the x̄'s fall within 1 standard deviation (σ/√n) of the mean μ. This is
equivalent to saying that the mean μ is within 1 standard
deviation of the average of a sample 68% of the time.
Based on this fact, can you construct the interval that has a 68% chance of
containing the mean?
b. 95% of the x̄'s fall within almost 2 standard deviations of the mean. This is
equivalent to saying that the mean μ is within almost 2 standard deviations of
the average of a sample 95% of the time.
Based on this fact, can you construct the interval that has a 95% chance of
containing the mean?
c. 99.7% of the x̄'s fall within almost 3 standard deviations of the mean. This
is equivalent to saying that the mean μ is within almost 3 standard deviations
of the average of a sample 99.7% of the time.
Based on this fact, can you construct the interval that has a 99.7% chance of
containing the mean?
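The Empirical Rule statements above can be checked with a small simulation. This is only a sketch: the population values μ = 50 and σ = 10, the sample size n = 36, and the number of repetitions are assumptions chosen for illustration, and the population is taken to be normal.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Assumed (hypothetical) population and sampling setup.
MU, SIGMA = 50.0, 10.0   # population mean and standard deviation
N, REPS = 36, 5000       # sample size and number of repeated samples

# Standard deviation of the sample mean: sigma / sqrt(n).
std_error = SIGMA / N ** 0.5

# Draw REPS samples of size N and record each sample mean.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPS)
]

def fraction_within(k, means):
    """Fraction of sample means within k standard errors of MU."""
    return sum(abs(m - MU) <= k * std_error for m in means) / len(means)

coverage = {k: fraction_within(k, sample_means) for k in (1, 2, 3)}
for k, frac in coverage.items():
    print(f"within {k} standard deviation(s) of the mean: {frac:.3f}")
```

The three fractions should land close to 0.68, 0.95, and 0.997, with small differences from run to run if the seed is changed.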
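Confidence Interval to Estimate Population Mean µ Using a Large Sample:
Formula for the (1 − α)100% Confidence Interval of the Mean when Sample Size
n is Large:
$$\bar{X} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$$
where z_{α/2} = C.V. for a two-tailed Z test of the mean
= |invNorm(α/2, 0, 1)|, or equivalently invNorm(1 − α/2, 0, 1), since invNorm takes the area to the left.
When the population standard deviation σ is known, use σ in place of s.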
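As a rough sketch, the large-sample formula can be written as a short Python function. The function name `z_interval` and the example numbers (x̄ = 100, standard deviation 16, n = 64) are made up for illustration; `NormalDist().inv_cdf` from the standard library plays the role of the calculator's invNorm.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sd, n, confidence=0.95):
    """Large-sample confidence interval for the mean: x_bar +/- z * sd / sqrt(n).

    `sd` is the sample standard deviation s (or sigma, if it is known).
    """
    alpha = 1 - confidence
    # Positive critical value: area 1 - alpha/2 lies to its left.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical example: n = 64, sample mean 100, standard deviation 16.
lower, upper = z_interval(x_bar=100, sd=16, n=64, confidence=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # prints "95% CI: (96.08, 103.92)"
```

Here z is about 1.96 and the margin of error is 1.96 × 16/8 = 3.92, which gives the interval printed above.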
Let's Do It! 1
What is the 95% confidence interval for μ using a sample having the following
statistics: n = 30, x̄ = 22, σ = 10?
What is the 90% confidence interval for μ using a sample having the following
statistics: n = 38, x̄ = 1.82, σ = 5.1?
Let's Do It! 2
The height of a random sample of 50 college students showed a mean of 174.5 cm.
Construct a 99% confidence interval for the mean height of college students if the
population of college students has a standard deviation of 6.9 cm.
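Confidence Interval to Estimate Population Mean µ Using a Small Sample:
Formula for the (1 − α)100% Confidence Interval of the Mean when Sample Size n is
Small:
$$\bar{X} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
where t_{α/2} = C.V. for a two-tailed t-test of the mean, with df = n − 1
= |invT(α/2, df)|, or equivalently invT(1 − α/2, df), since invT takes the area to the left.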
This interval gives potential values for the population mean μ based on just one
sample mean x̄. This interval is based on the assumption that the data are a
random sample from a normal population with unknown population standard
deviation σ. If the sample size is large, the assumption of normality is not so crucial.
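The small-sample interval can be sketched in Python as well. The standard library has no inverse t function, so this sketch approximates the t critical value numerically (Simpson's rule for the area under the t density, then bisection); a calculator's invT or a t table does the same job. The helper names and the example numbers (n = 16, x̄ = 50, s = 8) are made up for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Positive critical value with area alpha/2 in the upper tail (bisection).

    The search is limited to [0, 100], which covers the usual textbook cases.
    """
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(x_bar, s, n, confidence=0.95):
    """Small-sample confidence interval: x_bar +/- t * s / sqrt(n), df = n - 1."""
    t = t_critical(1 - confidence, n - 1)
    margin = t * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical example: n = 16, sample mean 50, sample standard deviation 8.
lower, upper = t_interval(x_bar=50, s=8, n=16, confidence=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

For this example df = 15 and the critical value is about 2.131, so the margin of error is roughly 2.131 × 8/4 ≈ 4.26 and the interval is roughly (45.74, 54.26). The interval is wider than the one a z critical value of 1.96 would give, which reflects the extra uncertainty from estimating σ with s in a small sample.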
Let's Do It! 3
A random sample of 25 bottles of buffered Aspirin contained on
average 325.05 mg of aspirin with a standard deviation of 0.5
mg. Assuming normality,
• What is the distribution to be used for interval estimation of
the mean Aspirin content?
• Construct a 90% confidence interval for the mean content of
Aspirin.
Let's Do It! 4 Skin Cancer
A dermatologist is investigating a certain skin cancer. Twenty-five
rats have this cancer and are treated with a new drug. The
dermatologist is interested in the number of hours until the cancer
is gone. He found that the sample produced an average of 322
hours and a standard deviation of 101 hours. Assuming normality,
a. Compute a 90% confidence interval for the mean number
of hours.
b. Interpret the confidence interval constructed above.
Let's Do It! 5 Jogging and Pulse Rate
A random sample of 21 US adult males who jog at least 15 miles
per week is taken and their pulse rates are measured. The sample had
an average pulse rate of 52.6 beats/minute with a standard
deviation of 3.22 beats/minute.
a. Assuming pulse rate is normally distributed, find a 95% confidence
interval for the mean pulse rate of all US adult males who jog at least
15 miles per week.
b. Interpret the interval obtained above.
c. The mean pulse rate of all US adult males is approximately
72 beats/minute. Does it appear that jogging at least 15 miles
per week reduces the mean pulse rate? Explain.
Homework Posted. Prepare for Quiz over this part of the course.