6.5 Interval Estimation
Recall,
Sampling Distribution of $\bar{X}$, the Sample Mean
If a simple random sample of size n is taken from a
population having population mean $\mu$ and population
standard deviation $\sigma$, and if the original population is
normally distributed, then
$\bar{X}$ is $N(\mu, \sigma/\sqrt{n})$.

If the original population is not necessarily normal, but the
sample size n is large enough ($n \ge 30$), then
$\bar{X}$ is approx. $N(\mu, \sigma/\sqrt{n})$ (central limit theorem).

DEFINITION:
A parameter is a numerical value that would be calculated using all of the
values of the units in the population.
A statistic is a numerical value that is calculated using all of the values of the
units in a sample.
Tip: One way to remember this distinction is this: The letter p is for population
and parameter, while the letter s is for statistic and sample.
[Figure: a population of N = 16 units, with a sample of n = 4 units drawn from it.]
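The parameter/statistic distinction can be sketched in a few lines of Python. The 16 population values below are made up for illustration; only the sizes N = 16 and n = 4 come from the figure.

```python
import random
from statistics import mean

# Hypothetical population of N = 16 unit values (made up for illustration).
population = [12, 15, 9, 20, 14, 11, 18, 16, 13, 10, 17, 19, 8, 21, 15, 14]

# Parameter: computed from ALL of the units in the population.
mu = mean(population)

# Statistic: computed from the units in one simple random sample of size n = 4.
rng = random.Random(1)              # fixed seed so the sketch is repeatable
sample = rng.sample(population, 4)  # sampling without replacement
x_bar = mean(sample)

print(f"parameter (population mean) = {mu}")
print(f"statistic (sample mean)     = {x_bar}")
```

The parameter stays fixed, while the statistic changes from sample to sample (try a different seed).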
ESTIMATION:
• What is the mean number of delay hours of Northwest Airlines flights to
Chicago?
• What is the mean weight of actresses living in Hollywood?
• What is the mean number of times per day a person in the U.S. uses a pain
reliever?
Each of these questions is asking, “What is the value of the parameter?”
A confidence interval estimate for the population mean is an
interval of values, computed from the sample data, for which we
can be quite confident that it contains the population mean.
The confidence level is the probability that the estimation method
will give an interval that contains the parameter (in this case $\mu$).
The confidence level is denoted by $(1-\alpha)$, where $\alpha$ has common
values of 0.10, 0.05, and 0.01, for 90%, 95%, and 99% confidence
levels respectively.
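One way to see what "the probability that the method gives an interval containing the parameter" means is to repeat the method many times on simulated data. The sketch below assumes a normal population with made-up values $\mu = 100$ and $\sigma = 15$, and uses the large-sample interval $\bar{x} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$ developed later in this section; the function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage(mu, sigma, n, conf_level, trials, seed=0):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the assumed normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# Made-up population values; only the 95% level comes from the notes.
rate = coverage(mu=100, sigma=15, n=30, conf_level=0.95, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The share should come out close to 0.95: about 95% of the intervals produced by the method capture $\mu$, even though any single interval either contains it or does not.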
In class, you were asked to construct the distribution of the averages of all
possible samples of size 2 from a small population of size 4. At the end of the
activity we reached the following conclusions:
- The distribution of the $\bar{x}$'s is bell-shaped (normal).
- The mean of the $\bar{x}$'s is the same as the population mean.
- The standard deviation of the $\bar{x}$'s is $\sigma/\sqrt{n}$.
This leads us to the conclusion that $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$ for large samples.
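The activity can be replayed in code. This is only a sketch: the four population values are made up, and it assumes the samples of size 2 are drawn with replacement (the notes do not say which way the class sampled).

```python
from itertools import product
from math import sqrt, isclose
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # hypothetical values, population size 4
n = 2                       # sample size used in the activity

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation (divisor N)

# All 4 * 4 = 16 ordered samples of size 2, drawn with replacement (assumption).
sample_means = [mean(s) for s in product(population, repeat=n)]

print("mean of the sample means:", mean(sample_means), "| mu:", mu)
print("sd of the sample means:  ", pstdev(sample_means),
      "| sigma/sqrt(n):", sigma / sqrt(n))
```

Under this assumption the two comparisons agree exactly. If the samples are drawn without replacement from such a small population, the mean still matches, but the standard deviation of the sample means is smaller than $\sigma/\sqrt{n}$.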




Let’s Think About It!
Recall: the CLT states that for sufficiently large
samples, $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$.
Use the Empirical Rule to answer the following
questions.
[Figure: sampling distribution of $\bar{x}$ centered at $\mu$, with the central 95% region marked.]
a. 68% of the $\bar{x}$'s fall within 1 standard deviation of the mean $\mu$ (here the
standard deviation is that of $\bar{x}$, namely $\sigma/\sqrt{n}$). This is
equivalent to saying that the mean $\mu$ is within 1 standard
deviation of the average of a sample 68% of the time.
Based on this fact, can you construct the interval that has a 68% chance
of containing the mean?
b. 95% of the $\bar{x}$'s fall within almost 2 standard deviations of the mean $\mu$.
This is equivalent to saying that the mean $\mu$ is within almost 2 standard
deviations of the average of a sample 95% of the time.
Based on this fact, can you construct the interval that has a 95% chance
of containing the mean?
c. 99.7% of the $\bar{x}$'s fall within almost 3 standard deviations of the mean $\mu$.
This is equivalent to saying that the mean $\mu$ is within almost 3 standard
deviations of the average of a sample 99.7% of the time.
Based on this, can you construct the interval that has a 99.7% chance of
containing the mean?
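The Empirical Rule percentages, and the reason for the word "almost", can be checked against the standard normal distribution. A minimal sketch using Python's standard library:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution, N(0, 1)

# Probability of landing within k standard deviations of the mean.
for k in (1, 2, 3):
    prob = Z.cdf(k) - Z.cdf(-k)
    print(f"P(within {k} standard deviation(s)) = {prob:.4f}")

# Multiplier that captures exactly the central 95%.
z_95 = Z.inv_cdf(0.975)
print(f"multiplier for exactly 95%: {z_95:.2f}")
```

Two standard deviations capture slightly more than 95% (about 95.45%), so the multiplier for exactly 95% is a little under 2.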
Formula for the $(1-\alpha)100\%$ Confidence Interval of the
Mean when Sample Size n is Large:
$\bar{x} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$
For a 90% confidence interval, $z_{1-\alpha/2}$ = 1.65; for a 95%
confidence interval, $z_{1-\alpha/2}$ = 1.96; and for a 99% confidence
interval, $z_{1-\alpha/2}$ = 2.58.
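The formula translates directly into code. The sketch below is one possible way to write it; the function name `z_interval` and the example numbers are hypothetical and are not taken from the exercises.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf_level):
    """Large-sample confidence interval for the mean when sigma is known."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    margin = z * sigma / sqrt(n)             # z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical inputs: x_bar = 50, sigma = 12, n = 36, 95% confidence.
low, high = z_interval(x_bar=50.0, sigma=12.0, n=36, conf_level=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")   # prints 95% CI: (46.08, 53.92)
```

Note that `inv_cdf` returns unrounded multipliers (about 1.645 and 2.576 for 90% and 99%), so results can differ slightly in the last digit from hand calculations that use the rounded values 1.65 and 2.58.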
Let's Do It! 1
What is the 95% confidence interval for $\mu$ using a sample having the
following statistics: $n = 30$, $\bar{x} = 22$, $\sigma = 10$?
What is the 90% confidence interval for $\mu$ using a sample having the following
statistics: $n = 38$, $\bar{x} = 1.82$, $\sigma = 5.1$?
Let's Do It! 2
The height of a random sample of 50 college students showed a mean of
174.5 cm. Construct a 99% confidence interval for the mean height of college
students if the population of college students has a standard deviation of 6.9
cm.
Population Standard Deviation σ Is Unknown
When $\sigma$ is unknown, we use the sample standard deviation s
instead. Replacing $\sigma$ by s changes the distribution of the
standardized sample mean.
When $\sigma$ is unknown, the standardized sample mean
$(\bar{X} - \mu)/(s/\sqrt{n})$ no longer follows the N(0,1) distribution.
Instead, when the population is normal, it follows another
distribution called the Student’s t-distribution.
The Student’s t-Distribution with (n-1) degrees of freedom
DEFINITION:
When data is used to estimate the standard deviation of a statistic,
the result is called the standard error of the statistic.
The standard error of the mean (SEM) is the estimated standard
deviation of the sample mean, SEM = $s/\sqrt{n}$.
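As a quick illustration, the SEM can be computed from raw data as follows. The five data values and the function name are made up for this sketch.

```python
from math import sqrt
from statistics import stdev

def standard_error_of_mean(data):
    """SEM = s / sqrt(n), where s is the sample standard deviation."""
    n = len(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    return s / sqrt(n)

sample = [4, 8, 6, 5, 7]  # hypothetical observations
sem = standard_error_of_mean(sample)
print(f"SEM = {sem:.4f}")
```

Here s is about 1.58 and n = 5, so the SEM is about 0.71.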
Properties of the Student’s t-distribution
• The t-distribution has a symmetric bell-shaped density centered at 0,
similar to the N(0,1) distribution.
• The t-distribution is “flatter” and has “heavier tails” than the N(0,1)
distribution.
• As the sample size increases, the t-distribution approaches the N(0,1)
distribution.
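These properties can be seen numerically by comparing density heights at the center (x = 0) and in the tail (x = 3). The sketch below writes out the t density from its standard formula; the helper name `t_pdf` is an assumption of this example.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

normal = NormalDist()  # N(0, 1)
for x in (0.0, 3.0):
    row = ", ".join(f"df={df}: {t_pdf(x, df):.4f}" for df in (2, 10, 100))
    print(f"x = {x}: {row}, N(0,1): {normal.pdf(x):.4f}")
```

At x = 0 the t density is lower than the normal density (flatter), at x = 3 it is higher (heavier tails), and both gaps shrink as the degrees of freedom grow.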
Confidence Interval for a Population Mean $\mu$
$\bar{X} \pm t_{(n-1,\,u)}\, s/\sqrt{n}$
where $t_{(n-1,\,u)}$ is an appropriate u-percentile of the t-distribution.
This interval gives potential values for the population mean $\mu$ based on just
one sample mean $\bar{x}$. This interval is based on the assumption that the data
are a random sample from a normal population with unknown population
standard deviation $\sigma$. If the sample size is large, the assumption of
normality is not so crucial.
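Once the t value has been read from a table, the interval is simple arithmetic. The following sketch assumes the critical value is supplied by the user; the function name and the example numbers are hypothetical.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    sem = s / sqrt(n)      # standard error of the mean
    margin = t_crit * sem
    return x_bar - margin, x_bar + margin

# Hypothetical sample: n = 16, x_bar = 10.0, s = 2.0, 95% confidence.
# t_crit = 2.131 is the t-table value for 15 degrees of freedom and u = 0.975.
low, high = t_interval(x_bar=10.0, s=2.0, n=16, t_crit=2.131)
print(f"95% CI: ({low:.4f}, {high:.4f})")
```

Passing the critical value in as an argument mirrors the by-hand workflow: look up t, then plug it into the formula.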
How to use Table F to find $t_{(n-1,\,u)}$:
Example: Find the $t_{(n-1,\,u)}$ value for a 95% confidence interval when the sample
size is 22.
Solution
The d.f. = (22 - 1) = 21, and u = 1 - 0.05/2 = 0.975.
Find 21 in the far left column and 0.975 in the row labeled u. The intersection
where the two meet gives the value for $t_{(21,\,0.975)}$, which is 2.080.
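The table value can be double-checked numerically. The sketch below is not how Table F was produced; it is one rough way to approximate a t percentile using only the standard library, by integrating the t density with Simpson's rule and then bisecting. All function names are assumptions of this example, and it assumes 0.5 <= u < 1 with an answer below 100.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using symmetry plus Simpson's rule on [0, |x|]."""
    if x < 0:
        return 1 - t_cdf(-x, df, steps)
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(u, df, lo=0.0, hi=100.0):
    """Bisection search for the u-percentile (assumes 0.5 <= u < 1)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"{t_quantile(0.975, 21):.3f}")   # prints 2.080
```

The result agrees with the value read from the table for 21 degrees of freedom and u = 0.975.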
Let's Do It! 3
A random sample of 25 bottles of buffered Aspirin contained on
average 325.05 mg of aspirin with a standard deviation of 0.5
mg. Assume that the aspirin content is normally distributed.
• What is the distribution to be used for interval estimation of
the mean Aspirin content? Why?
• Construct a 90% confidence interval for the mean content of
Aspirin.
Let's Do It! 4 Skin Cancer
A dermatologist is investigating a certain skin cancer. Twenty-five
rats have this cancer and are treated with a new drug. The
dermatologist is interested in the number of hours until the cancer
is gone. He found that the sample produced an average of 322
hours and a standard deviation of 101 hours. Assuming normality,
a. Compute a 90% confidence interval for the mean number
of hours.
b. Interpret the confidence interval constructed above.
Let's Do It! 5 Jogging and Pulse Rate
A random sample of 21 US adult males who jog at least 15 miles
per week is taken and their pulse rates are measured. The sample had
an average pulse rate of 52.6 beats/minute with a standard
deviation of 3.22 beats/minute.
a. Find a 95% confidence interval for the mean pulse rate of all
US males who jog at least 15 miles per week, assuming pulse
rate is normally distributed.
b. Interpret the interval obtained above.
c. The mean pulse rate of all US adult males is approximately
72 beats/minute. Does it appear that jogging at least 15 miles
per week reduces the mean pulse rate? Explain.
Using the TI
Construct a 99% confidence interval estimate for the mean if we
have $\bar{x} = 12.6$ and $s = 4.4$ based on $n = 61$ observations.
For the TI, a confidence interval for a population mean when the
population standard deviation $\sigma$ is unknown is called the one-sample
t-interval and abbreviated TInterval. This is option 8 under
the TESTS menu obtained from the STAT button. You can have the
sample data entered into a list, say L1, or just enter the Stats (the
sample mean, sample standard deviation, and sample size). The
steps are summarized below and the corresponding input and
output screens are shown. Note that the Calculate option produces
an output screen that provides the confidence interval (11.101,
14.099), the sample mean $\bar{x}$ = 12.6, the sample
standard deviation s = 4.4, and the sample size n = 61.
Statistic Humor:
Did you hear about the statistician who was thrown in jail? He now has zero degrees of freedom.
Homework page 217: 13, 14, 15, 16, 29, 30, 33, 41.