Chapter 6
Estimates and Sample Sizes
Lesson 6-1/6-2, Part 1
Estimating a Population Proportion
This chapter begins our study of
inferential statistics.
The two major applications of inferential
statistics involve the use of sample data to:
1. Estimate the value of a population
parameter (proportions, means and
variances).
2. Test some claim (or hypothesis) about a
population.
Overview


Introduce methods for estimating values
of these important population parameters:
proportions, means, and variances.
Present methods for determining the sample
sizes necessary to estimate those
parameters.
Assumptions


Randomization condition – Were the data
sampled at random or generated from a
properly randomized experiment?
10% Condition (N ≥ 10n) – Samples are
almost always drawn without replacement.
If the sample exceeds 10% of the
population, the probability of a success
changes so much during the sampling that
our Normal model may no longer be
appropriate.
Assumptions

Normal Approximation – The model that we use
for inference is based on the Central Limit
Theorem. The sample must be large enough to
make the sampling model for the sample
proportions approximately Normal:
$n\hat{p} \ge 5$ and $n\hat{q} \ge 5$.
Notations for Proportions
$p$ = population proportion
$\hat{p} = \dfrac{x}{n}$ = sample proportion (p hat) of x
successes in a sample of size n
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a
sample of size n
Point Estimate


A point estimate is a single value (or
point) used to approximate a population
parameter.
The sample proportion  p̂  is the best point
estimate of the population proportion  p  .
Confidence Interval or
Interval Estimate


A confidence interval (or interval estimate) is a
range (or an interval) of values used to
estimate the true value of a population
parameter.
A confidence interval is sometimes abbreviated
as CI.
Confidence Interval



The confidence level is the probability 1 – α (often
expressed as the equivalent percentage value). It is
the proportion of times that the confidence interval
actually does contain the population parameter,
assuming that the estimation process is repeated a
large number of times.
This is usually 90% (α = 10%), 95% (α = 5%) or 99%
(α = 1%)
The confidence level is also called the degree of
confidence, or the confidence coefficient.
Interpreting the Confidence Level


The statement “95% confident” means in
repeated sampling, 95 percent of the intervals
produced using this method will contain the
proportion of adult Minnesotans who would
respond no to the question “photo cop
legislation.”
If 1000 samples of size 829 were taken, about
1000(0.95) = 950 of the intervals would contain
the parameter p and about 50 would not.
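The repeated-sampling idea above can be checked with a small simulation. This is only a sketch: the true proportion `p_true = 0.51` is an assumed value chosen for illustration (the real population proportion is unknown), and `coverage_rate` is a hypothetical helper name.

```python
import random
from statistics import NormalDist

def coverage_rate(p_true, n, num_samples, confidence=0.95, seed=0):
    """Fraction of simulated intervals p_hat +/- E that contain p_true."""
    rng = random.Random(seed)
    # Critical value z_{alpha/2} for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(num_samples):
        # Count successes in one simulated sample of size n.
        x = sum(rng.random() < p_true for _ in range(n))
        p_hat = x / n
        margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin < p_true < p_hat + margin:
            hits += 1
    return hits / num_samples

# 1000 samples of size 829, as in the slide; p_true is an assumption.
rate = coverage_rate(p_true=0.51, n=829, num_samples=1000)
print(f"{rate:.3f} of the simulated intervals contain p")
```

The printed fraction should land near 0.95, although it varies with the seed.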
What can we really say about p?

“51% of all Minnesotans are opposed to
photo-cop legislation.”


It would be nice to be able to make absolute statements
about population values with certainty, but we just don’t
have enough information to do that.
There’s no way to be sure that the population proportion is
the same as the sample proportion; in fact, it almost
certainly isn’t. Observations vary. Another sample would
yield a different sample proportion.
What can we really say about p?

“It is probably true that 51% of all
Minnesotans are opposed to photo-cop
legislation.”

No. In fact we can be pretty sure that whatever the true
proportion is, it’s not exactly 51%. So the statement is not
true.
What can we really say about p?

“We don’t know exactly what proportion of
Minnesotans are opposed to photo-cop
legislation, but we know that it’s within
the interval from 48% to 54%.”

This is getting closer, but we still can’t be certain.
We can’t know for sure that the true proportion
is in this range – or any particular range.
What can we really say about p?

“We don’t know exactly what proportion of
Minnesotans are opposed to photo-cop
legislation, but the interval from 48% to 54%
probably contains the true value.”

We’ve now fudged twice – first by giving an
interval and second by admitting that we only
think the interval “probably” contains the true
value. This statement is true.
What can we really say about p?

The last statement may be true, but it’s a
bit wishy-washy. We can tighten it up a bit by
quantifying what we mean by “probably”.

We are 95% confident that between 48% and
54% of Minnesotans oppose photo-cop
legislation.
Critical Value

A critical value is the number on the
borderline separating sample proportions
that are likely to occur from those that are
unlikely to occur.
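[Figure: standard normal curve centered at z = 0, with the confidence level as the central area and an area of α/2 in each tail beyond the critical value $z_{\alpha/2}$.]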
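Example – Page 312, #2
Find the critical value that corresponds to the given
confidence level of 90%.
$\alpha = 1 - 0.90 = 0.10$
$z_{\alpha/2} = z_{0.10/2} = z_{0.05} = 1.645$
invNorm(1 – 0.05,0,1) = 1.645
[Figure: standard normal curve with central area 90%, an area of 0.05 in each tail, and critical values at z = −1.645 and z = 1.645.]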
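The invNorm lookup can be reproduced outside the calculator. A minimal sketch using Python's standard library, where `z_critical` is a hypothetical helper name and `NormalDist().inv_cdf` plays the role of invNorm(area, 0, 1):

```python
from statistics import NormalDist

def z_critical(confidence):
    """Critical value z_{alpha/2} for a two-sided confidence level."""
    alpha = 1 - confidence
    # Area to the left of the critical value is 1 - alpha/2.
    return NormalDist(mu=0, sigma=1).inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

For 99% the inverse normal gives about 2.576; the value 2.575 that also appears in these notes comes from reading a printed table.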
The most common critical values are:

Confidence Level    α       Critical Value $z_{\alpha/2}$
90%                 0.10    1.645
95%                 0.05    1.96
99%                 0.01    2.575
Margin of Error

When data from a simple random
sample are used to estimate a
population proportion p, the margin of
error, denoted by E, is the maximum
likely (with probability 1 – α) difference
between the observed proportion p̂
and the true value of the population
proportion p.
 
Margin of Error
$E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Page 312, #14
Assume that a sample is used to estimate the population proportion p.
Find the margin of error E that corresponds to n = 1200, x = 400,
99% confidence.
$\alpha = 1 - 0.99 = 0.01$
$z_{\alpha/2} = z_{0.01/2} = z_{0.005} = 2.576$
invNorm(1-0.005,0,1) = 2.576
$\hat{p} = \dfrac{x}{n} = \dfrac{400}{1200} = 0.33$
$E = z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 2.576 \sqrt{\dfrac{(0.33)(0.67)}{1200}} = 2.576(0.01357) = 0.0350$
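The same arithmetic can be sketched in a few lines of Python. The function name `margin_of_error` is an illustrative choice, and the inputs are the rounded values used on the slide:

```python
from math import sqrt

def margin_of_error(z, p_hat, n):
    """E = z * sqrt(p_hat * q_hat / n), with q_hat = 1 - p_hat."""
    q_hat = 1 - p_hat
    return z * sqrt(p_hat * q_hat / n)

# Page 312, #14: n = 1200, p_hat rounded to 0.33, 99% confidence.
E = margin_of_error(z=2.576, p_hat=0.33, n=1200)
print(round(E, 4))
```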
Confidence Interval for the
Population Proportion
$\hat{p} - E < p < \hat{p} + E$
$(\hat{p} - E, \hat{p} + E)$
$\hat{p} \pm E$
Find the Point Estimate and Margin of
Error From a Confidence Interval
Point Estimate: $\hat{p} = \dfrac{(UCL) + (LCL)}{2}$
Margin of Error: $E = \dfrac{(UCL) - (LCL)}{2}$
UCL – Upper Confidence Limit
LCL – Lower Confidence Limit
Example – Page 312, #6
Express the confidence interval 0.456 < p < 0.496 in the form $\hat{p} \pm E$.
$\hat{p} = \dfrac{(UCL) + (LCL)}{2} = \dfrac{0.496 + 0.456}{2} = 0.476$
$E = \dfrac{(UCL) - (LCL)}{2} = \dfrac{0.496 - 0.456}{2} = 0.020$
$p = 0.476 \pm 0.020$
Example – Page 312, #10
Interpreting Confidence Interval Limits: Use the given
confidence interval limits to find the point estimate p̂ and
the margin of error E.
$0.278 < p < 0.338$
$\hat{p} = \dfrac{(UCL) + (LCL)}{2} = \dfrac{0.338 + 0.278}{2} = 0.308$
$E = \dfrac{(UCL) - (LCL)}{2} = \dfrac{0.338 - 0.278}{2} = 0.030$
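Recovering the point estimate and margin of error from the limits is a midpoint and half-width calculation. A small sketch, with `from_limits` as a hypothetical helper name:

```python
def from_limits(lcl, ucl):
    """Return (point estimate, margin of error) from confidence limits."""
    p_hat = (ucl + lcl) / 2   # midpoint of the interval
    E = (ucl - lcl) / 2       # half the interval width
    return p_hat, E

# Page 312, #10: 0.278 < p < 0.338
p_hat, E = from_limits(0.278, 0.338)
print(f"p_hat = {p_hat:.3f}, E = {E:.3f}")
```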
Example – Page 312, #20
Use the sample data and confidence level to construct the confidence
interval estimate of the population proportion p.
n = 2001, x = 1776, 90% confidence
Check assumptions.
$\hat{p} = \dfrac{x}{n} = \dfrac{1776}{2001} = 0.8876$
$n\hat{p} \ge 5$ and $n\hat{q} \ge 5$
Example – Page 312, #20
CI = 90%, $\alpha = 1 - 0.90 = 0.10$, $\hat{p} = 0.8876$, $n = 2001$
$z_{0.05}$ = invNorm(1 − 0.05,0,1) = 1.645
[Figure: standard normal curve with central area 0.90, an area of 0.05 in each tail, and critical values at z = −1.645 and z = 1.645.]
Example – Page 312, #20
$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.8876 \pm 1.645 \sqrt{\dfrac{0.8876(0.1124)}{2001}}$
$= 0.8876 \pm 0.0116 = [0.876, 0.899]$
$0.876 < p < 0.899$
Example – Page 312, #20
Using the TI
Stat/Tests/A:1-PropZint
$0.876 < p < 0.899$
$\hat{p} = 0.8876$
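For readers without a TI calculator, the one-proportion z-interval can be sketched in Python. This is an illustrative stand-in, not the calculator's own routine; `one_prop_z_interval` is a hypothetical name, and the check inside it mirrors the $n\hat{p} \ge 5$, $n\hat{q} \ge 5$ condition from the notes.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, confidence):
    """Confidence interval for a population proportion p."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Normal-approximation condition used in these notes.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    E = z * sqrt(p_hat * q_hat / n)
    return p_hat - E, p_hat + E

# Page 312, #20: n = 2001, x = 1776, 90% confidence.
low, high = one_prop_z_interval(1776, 2001, 0.90)
print(f"{low:.3f} < p < {high:.3f}")
```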
Lesson 6-1/6-2, Part 2
Estimating a Population Proportion
Sample Size for
Estimating Proportion p
p̂ is known: $n = \dfrac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2}$
p̂ is unknown: $n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2}$
Example – Page 312, #22
Use the given data to find the minimum sample size required to
estimate a population proportion or percentage.
Margin of error: 0.038; confidence level: 95%; p̂ and q̂
unknown
$\alpha = 1 - 0.95 = 0.05$
$z_{\alpha/2} = z_{0.05/2} = z_{0.025}$ = invNorm(1 − 0.025,0,1) = 1.96
$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} = \dfrac{(1.96)^2 \cdot 0.25}{(0.038)^2} = 665.10$, rounded up to 666
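Both sample-size formulas fit in one small function. In this sketch, `sample_size_proportion` is a hypothetical name; passing no `p_hat` applies the 0.25 worst case, and `ceil` rounds up to the next whole number.

```python
from math import ceil

def sample_size_proportion(z, E, p_hat=None):
    """Minimum n to estimate p within margin E at critical value z."""
    # With no prior estimate, p_hat * q_hat is replaced by its maximum, 0.25.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(z ** 2 * pq / E ** 2)

# Page 312, #22: E = 0.038, 95% confidence, p_hat unknown.
n = sample_size_proportion(z=1.96, E=0.038)
print(n)
```

Supplying a prior estimate, for example `p_hat=0.15`, gives a smaller required sample than the worst case.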
Example – Page 313, #26
In 1920 only 35% of U.S. Households had telephones, but that rate
is now much higher. A recent survey of 4276 randomly selected
households showed that 4019 of them had telephones (based on the
data from U.S. Census Bureau). Using those survey results and 99%
confidence level, the TI-83 Plus calculator display is as shown.
A. Write a statement that interprets the
confidence level.
We are 99% confident that the interval from
93.053% to 94.926% contains the true
percentage of U.S. households having
telephones.
Example – Page 313, #26
B. Based on the preceding results, should pollsters be
concerned about results from surveys conducted by
phone?
Yes. Based on the results from part (a), about 5% to 7%
of the population does not have a telephone, so those
people are missed.
Procedure for Constructing a
Confidence Interval for p




Identify the population of interest and the
parameter you want to draw conclusions about.
Choose the appropriate inference procedure. Verify
the conditions for using the selected procedure.
If the conditions are met, carry out the inference.
Interpret your results in the context of the problem.
Example – Page 313, #28
Death Penalty Survey: In a Gallup Poll, 491 randomly selected adults
were asked whether they are in favor of the death penalty for a person
convicted of murder, and 65% of them said that they were in favor.
A. Find the point estimate of the percentage of adults
who are in favor of this death penalty.
65% is the point estimate
Example – Page 313, #28
B. Find a 95% confidence interval estimate of the
percentage of adults who are in favor of this death
penalty.
Step 1 – Identify the population of interest and the parameter you want to
draw conclusions about.
p = proportion of adults who are in favor of the death
penalty for a person convicted of murder
Example – Page 313, #28
Step 2 – Choose the appropriate inference procedure.
Verify conditions for using selected procedure.
Use a one proportion z-interval
• Random sample – stated in the question.
• Population is at least 10(491) = 4910 adults
• Sampling distribution is approximately normal
$n\hat{p} = (491)(0.65) = 320 \ge 5$
$n\hat{q} = (491)(0.35) = 172 \ge 5$
Example – Page 313, #28
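Step 3 – Carry out the inference procedure.
$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.65 \pm 1.96 \sqrt{\dfrac{0.65(0.35)}{491}} = 0.65 \pm 0.04$
[Figure: standard normal curve with central area 0.95, an area of 0.025 in each tail, and critical values at z = −1.96 and z = 1.96.]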
Example – Page 313, #28
Step 4 – Interpret your results in the context of the problem.
We are 95% confident that the proportion of adults who are in
favor of the death penalty for a person convicted of murder
is between 61% and 69%.
Example – Page 313, #28
Using the TI
$\hat{p} = \dfrac{x}{n}$, so $0.65 = \dfrac{x}{491}$
$x = 0.65 \times 491 = 319.15 \approx 320$
$0.61 < p < 0.69$
$61\% < p < 69\%$
Example – Page 313, #28
C. Can we safely conclude that the majority of adults are
in favor of this death penalty? Explain
Yes, since the interval in which we have 95%
confidence is entirely above 50%
Example – Page 314, #34
Sample size for Left-Handed Golfers. As a manufacturer of golf equipment,
the Spalding Corporation wants to estimate the proportion of golfers
who are left-handed. (The company can use this information in planning
for the number of right-handed and left-handed sets of golf clubs to make.)
How many golfers must be surveyed if we want 99% confidence that the
sample proportion has a margin of error of 0.025?
A) Assume that there is no available information that
could be used as an estimate of p̂.
$\hat{p} = 0.50$, $\hat{q} = 0.50$
$\alpha = 1 - 0.99 = 0.01$, $z_{0.005} = 2.575$
$n = \dfrac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} = \dfrac{(2.575)^2 \cdot 0.25}{(0.025)^2} = 2652.25$, rounded up to 2653
Example – Page 314, #34
B) Assume that we have an estimate of p̂ found from a
previous study that suggests that 15% of golfers are
left-handed (based on a USA Today report).
$\hat{p} = 0.15$, $\hat{q} = 0.85$
$\alpha = 1 - 0.99 = 0.01$, $z_{0.005} = 2.575$
$n = \dfrac{(z_{\alpha/2})^2 \, \hat{p}\hat{q}}{E^2} = \dfrac{(2.575)^2 (0.15)(0.85)}{(0.025)^2} = 1352.64$, rounded up to 1353
Example – Page 314, #34
C) Assume that instead of using randomly selected golfers, the
sample data are obtained by asking TV viewers of the golfing
channel to call an “800” phone number to report whether
they are left-handed or right-handed. How are the results
affected?
Self-selected samples are not valid. It is not appropriate
to assume that those who respond will be representative of
the general population.
Lesson 6-3
Estimating a Population Mean: σ Known
Assumptions



The sample is a simple random sample.
The value of the population standard deviation
σ is known.
The population is normally distributed or
n > 30.
Example – Page 327, #6
Verify the assumptions. Determine whether the given
conditions justify using the margin of error when finding
a confidence interval estimate of the population mean μ
The sample size is n = 5 and σ is not known.
No, n is not greater than 30 and the standard deviation is
not known.
Example – Page 327, #8
Verify the assumptions. Determine whether the given
conditions justify using the margin of error when finding
a confidence interval estimate of the population mean μ
The sample size is n = 9, σ is not known, and the original
population is normally distributed.
No, because σ is not known.
Definitions



Estimator is a formula or process for using
sample data to estimate a population
parameter.
Estimate is a specific value or range of values
used to approximate a population parameter.
Point Estimate is a single value (or point) used
to approximate a population parameter.

The sample mean $\bar{x}$ is the best point estimate of
the population mean μ.
Confidence Interval


As we saw in Section 6-2, a confidence
interval is a range (or an interval) of values
used to estimate the true value of the
population parameter.
The confidence level gives us the success
rate of the procedure used to construct the
confidence interval.
Level of Confidence

As described in Section 6-2, the confidence
level is often expressed as the probability
1 – α, where α is the complement of the
confidence level.


For a 0.95 (95%) confidence level, α = 0.05
For a 0.99 (99%) confidence level, α = 0.01
Margin of Error

Margin of Error is the maximum likely difference
observed between the sample mean $\bar{x}$ and the
population mean μ, and is denoted by E.
$E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
Confidence Interval Estimate
of the Population Mean μ
$\bar{x} - E < \mu < \bar{x} + E$
$(\bar{x} - E, \bar{x} + E)$
$\bar{x} \pm E$
Distribution of Sample Means
with Known σ
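[Figure: distribution of sample means centered at μ (z = 0), with a margin of E on each side of the center, an area of α/2 in each tail, and the critical value $z_{\alpha/2}$.]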
Example – Page 328, #10
Use the given confidence level and sample data to find the
margin of error and confidence interval for estimating the population
mean μ.
Ages of drivers occupying the passing lane while driving
25 mi/h with the left signal flashing: 99% confidence;
n = 50, $\bar{x} = 80.5$ years, and σ is known to be 4.6 years.
$\alpha = 1 - 0.99 = 0.01$, $n = 50$, $\bar{x} = 80.5$, $\sigma = 4.6$
Example – Page 328, #10
$\alpha = 1 - 0.99 = 0.01$, $n = 50$, $\bar{x} = 80.5$, $\sigma = 4.6$
$Z_{0.01/2} = Z_{0.005} = 2.575$
invNorm(1 − 0.005,0,1) = 2.575
Find the margin of error
$E = Z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} = 2.575 \cdot \dfrac{4.6}{\sqrt{50}} = 1.675 \approx 1.68$ years
Example – Page 328, #10
$\alpha = 1 - 0.99 = 0.01$, $n = 50$, $\bar{x} = 80.5$, $\sigma = 4.6$, $E = 1.675$
Find the confidence interval
$\bar{x} - E < \mu < \bar{x} + E$
$80.5 - 1.675 < \mu < 80.5 + 1.675$
$78.8 \text{ yr} < \mu < 82.2 \text{ yr}$
Example – Page 328, #10
Find the confidence interval using the TI
STAT/TESTS/7:ZInterval
$\alpha = 1 - 0.99 = 0.01$, $n = 50$, $\bar{x} = 80.5$, $\sigma = 4.6$, $E = 1.675$
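The z-interval for a mean with known σ can also be sketched in Python. `z_interval_mean` is a hypothetical helper name, and the critical value 2.575 is passed in as given on the slide rather than recomputed.

```python
from math import sqrt

def z_interval_mean(x_bar, sigma, n, z):
    """Margin of error and confidence interval for mu when sigma is known."""
    E = z * sigma / sqrt(n)
    return E, (x_bar - E, x_bar + E)

# Page 328, #10: n = 50, x_bar = 80.5, sigma = 4.6, z_{0.005} = 2.575.
E, (low, high) = z_interval_mean(x_bar=80.5, sigma=4.6, n=50, z=2.575)
print(f"E = {E:.3f}; {low:.1f} yr < mu < {high:.1f} yr")
```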
Sample Size for Estimating
Mean μ
$n = \left[ \dfrac{Z_{\alpha/2} \cdot \sigma}{E} \right]^2$
When finding the sample size n, if the use of the formula
does not result in a whole number, always increase the
value of n to the next larger whole number.
Example – Page 328, #16
Use the given margin of error, confidence level, and
population standard deviation σ to find the minimum sample
size required to estimate an unknown population mean μ.
Margin of Error: $500, confidence level: 94%, σ = $9877
α = 1 − 0.94 = 0.06
Z(0.06/2) = Z(0.03) = 1.88
invNorm(1 − .03,0,1) = 1.8807
n = [Z(α/2) · σ / E]² = [(1.88)(9877) / 500]² = 1379.20, rounded up to 1380
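The round-up rule is easy to get wrong by hand, so here is a short sketch of the sample-size formula for a mean. `sample_size_mean` is a hypothetical name, and `ceil` implements "always increase to the next larger whole number".

```python
from math import ceil

def sample_size_mean(z, sigma, E):
    """Minimum n to estimate mu within margin E when sigma is known."""
    return ceil((z * sigma / E) ** 2)

# Margin of error $500, 94% confidence (z = 1.88), sigma = $9877.
n = sample_size_mean(z=1.88, sigma=9877, E=500)
print(n)
```

Using the unrounded critical value 1.8807 instead of 1.88 pushes the result up by one, so the rounding of z matters near a boundary.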
Procedure for Constructing a
Confidence Interval for μ, when σ is
known




Identify the population of interest and the
parameter you want to draw conclusions about.
Choose the appropriate inference procedure. Verify
the conditions for using the selected procedure.
Carry out the inference.
Interpret your results in the context of the problem.
Example – Page 328, #22
The health of the bear population in Yellowstone National Park is
monitored by periodic measurements taken from anesthetized bears.
A sample of 54 bears has a mean weight of 182.9 lb. Assuming that σ
is known to be 121.8 lb, find a 99% confidence interval estimate of
the mean of the population of all such bear weights. What aspect of
this problem is not realistic?
It is unrealistic to know σ
Step 1 – Identify the population of interest and the
parameter you want to draw conclusions about.
µ = mean weight of bears in Yellowstone
National Park.
$n = 54$, $\bar{x} = 182.9$, $\sigma = 121.8$, CI = .99
Example – Page 328, #22
Step 2 – Choose the appropriate inference procedure. Verify
conditions for using the selected procedure.
We will use a one-sample z-interval
• We are assuming that the sample was random
• The standard deviation of the population is known σ = 121.8
• Large sample n ≥ 30 the CLT tells us that the sampling
distribution is approximately normal since n = 54
Example – Page 328, #22
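Step 3 – Carry out the inference procedure
$n = 54$, $\bar{x} = 182.9$, $\sigma = 121.8$, CI = .99
$\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}} = 182.9 \pm 2.575 \cdot \dfrac{121.8}{\sqrt{54}}$
140.2 lbs < μ < 225.6 lbs
[Figure: standard normal curve with central area 0.99, an area of 0.005 in each tail, and critical values at z = −2.575 and z = 2.575.]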
Example – Page 328, #22
Step 4 – Interpret your results in the context of the problem.
We are 99% confident that the mean weight of bears in
Yellowstone National Park is between 140.2 lbs
and 225.6 lbs.
Lesson 6-4, Part 1
Estimating a Population mean: σ Not Known
Assumptions



The sample is a simple random sample.
The value of the population standard
deviation σ is unknown.
The population is normally distributed or
n > 30.
Student t Distribution

If the distribution of a population is essentially
normal, then the distribution of
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
is essentially a Student t distribution for all
samples of size n, and is used to find critical
values denoted by $t_{\alpha/2}$.
Student t Distribution

The t-statistic is interpreted the same way as the z-score:
it represents the number of standard errors $\bar{x}$ is from the
population mean, μ.
The shape of the t-distribution depends on the sample
size, n.
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (Normally Distributed)
$z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (Not Normally Distributed)
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (Normally Distributed)
Student t distribution for
n = 3 and n = 12
The t distribution is different for different sample sizes.
Important Properties of the
Student t Distribution




The Student t distribution has the same general
symmetric bell shape as the normal distribution, but it
reflects the greater variability (with wider distributions)
that is expected with small samples.
The Student t distribution has a mean of t = 0 (just as
the standard normal distribution has a mean of z = 0).
The standard deviation of the Student t distribution
varies with the sample size and is greater than 1 (unlike
the standard normal distribution, which has a σ = 1).
As the sample size n gets larger, the Student t
distribution gets closer to the normal distribution.
Degrees of Freedom (df)

Degrees of freedom (df) corresponds to the
number of sample values that can vary
after certain restrictions have been imposed
on all data values.
$df = n - 1$
Margin of Error E for Estimate
of μ

Based on an unknown σ and a small simple
random sample from a normally distributed
population:
$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ has n – 1 degrees of freedom.
Confidence Interval Estimate of the
Population Mean μ with σ unknown
$\bar{x} - E < \mu < \bar{x} + E$
$(\bar{x} - E, \bar{x} + E)$
$\bar{x} \pm E$
Example – Page 343, #2
(A) Find the critical value z, (B) find the critical value t, or (C) state that
neither the normal nor the t-distribution applies.
95%; n = 10; σ is unknown; population appears to be normally distributed.
$\alpha = 1 - 0.95 = 0.05$
$df = n - 1 = 10 - 1 = 9$
$t_{\alpha/2} = t_{0.05/2} = t_{9,\,0.025} = 2.262$
Use Table A-3
[Figure: t distribution curve with central area 0.95 and an area of 0.025 in each tail.]
Example – Page 343, #2
Using TI
2nd Vars
Example – Page 343, #8
(A) Find the critical value z, (B) find the critical value t, or (C) state that
neither the normal nor the t-distribution applies.
98%; n = 37; σ is unknown; population appears to be normally distributed.
$\alpha = 1 - 0.98 = 0.02$
$df = n - 1 = 37 - 1 = 36$
$t_{\alpha/2} = t_{0.02/2} = t_{36,\,0.01} = 2.434$
Use Table A-3
[Figure: t distribution curve with central area 0.98 and an area of 0.01 in each tail.]
Example – Page 343, #10
Use the given confidence level and sample data to find a) the margin
of error and b) the confidence interval for the population mean μ.
Assume that the population has a normal distribution.
Elbow to fingertip length of men: 99% confidence level,
$n = 32$, $\bar{x} = 14.50$, $s = 0.70$
$\alpha = 1 - 0.99 = 0.01$
$t_{\alpha/2} = t_{31,\,0.005} = 2.744$
$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} = 2.744 \cdot \dfrac{0.70}{\sqrt{32}} = 0.34$
$\bar{x} - E < \mu < \bar{x} + E$
$14.50 - 0.34 < \mu < 14.50 + 0.34$
$14.16 < \mu < 14.84$
Example – Page 343, #10
Find the confidence interval using the TI
STAT/TESTS/8:TInterval
CL = .99, $n = 32$, $\bar{x} = 14.50$, $s = 0.70$
Lesson 6-4, Part 2
Estimating a Population mean: σ Not Known
Procedure for Constructing a
Confidence Interval for μ, when σ is
Unknown




Identify the population of interest and the
parameter you want to draw conclusions about.
Choose the appropriate inference procedure. Verify
the conditions for using the selected procedure.
Carry out the inference.
Interpret your results in the context of the problem.
Example – Page 344, #14
A study was conducted to estimate hospital costs for accident victims
who wore seat belts. Twenty randomly selected cases have a
distribution that appears to be bell-shaped with a mean of $9004 and
a standard deviation of $5629.
A) Construct the 99% confidence interval for the mean of all such
costs.
Step 1 – Identify the population of interest and the parameter
you want to draw conclusions about.
µ = mean hospital cost for accident victims who wore seat belts.
Example – Page 344, #14
Step 2 – Choose the appropriate inference procedure. Verify
conditions for using the selected procedure.
We will use a one-sample t-interval for the mean
• Random Sample – Stated in the question
• Value of σ is unknown
• Question stated that the distribution appears to be
approximately normal
Example – Page 344, #14
Step 3 – Carry out the inference procedure
n = 20, df = 19, x̄ = 9004, s = 5629, t(α/2) = 2.861
x̄ ± t(α/2) · s/√n = 9004 ± 2.861 (5629/√20)
($5403, $12,605)
[Figure: t distribution curve with an area of 0.005 in the tail beyond the critical value t = 2.861.]
Example – Page 344, #14
Step 4 – Interpret your results in the context of the problem.
We are 99% confident that the mean hospital cost for all
accident victims who wear seat belts is between
$5403 and $12,605.
Example – Page 344, #14
B). If you are a manager for an insurance company that provides lower
rates for drivers who wear seat belts, and you want a
conservative estimate for a worst scenario, what amount should you
use as the possible hospital cost for an accident victim who wears
seat belts?
$12,605 is the high end estimate for the long-run
average hospital cost of such accident victims.
Example – Page 344, #18
Listed below are measured amounts of lead (in micrograms per cubic meter)
in the air. The Environmental Protection Agency has established an air
quality standard for lead: 1.5 μg/m³. The measurements shown below were
recorded at Building 5 of the World Trade Center site on different days
immediately following the destruction caused by the terrorist attacks of
September 11, 2001. After the collapse of the two World Trade Center
Buildings, there was considerable concern about the quality of the air. Use
the given values to construct a 95% confidence interval estimate of the mean
amount of lead in the air. Is there anything about this data set suggesting
that the confidence interval might not be very good? Explain.
5.40 1.10 0.42 0.73 0.48 1.10
Example – Page 344, #18
Step 1 – Identify the population of interest and the
parameter you want to draw conclusions about.
µ = mean amount of lead in the air at the World Trade Center
Example – Page 344, #18
Step 2 – Choose the appropriate inference procedure. Verify
conditions for using the selected procedure.
Use a one-sample t-interval
• Measurements were not randomly selected, but we treat them as a
representative sample.
• The value of σ is unknown
• The data do not appear to come from an
approximately normal population, since the box plot is skewed right
with an outlier (see graph).
Example – Page 344, #18
[Figure: box plot of the lead measurements on an axis from 0 to 6, labeled Mean_Amt_of_Lead_at_the_World_Trade_Center.]
Example – Page 344, #18
Step 3 – Carry out the inference procedure.
$n = 6$, $df = 5$, $\bar{x} = 1.538$, $s = 1.914$, $t_{\alpha/2} = 2.571$
$\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}} = 1.538 \pm 2.571 \left( \dfrac{1.914}{\sqrt{6}} \right)$
-0.471 < µ < 3.547
(micrograms/cubic meter)
Example – Page 344, #18
Step 4 – Interpret your results in the context of the problem.
We are 95% confident that the mean amount of lead in the
air at the World Trade Center is between -0.4705 and
3.5472 (micrograms/cubic meter).
Yes, 5 of the 6 sample values are below x̄, which raises a question
about whether the data meet the requirement that the
underlying population distribution is normal.
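The summary statistics and interval for the lead data can be recomputed from the six listed measurements. This is a sketch using only Python's standard library; the critical value 2.571 is taken from the table lookup in the notes rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Lead measurements (micrograms per cubic meter) from the exercise.
lead = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
T_CRIT = 2.571  # t_{alpha/2} for df = 5 at 95% confidence, from Table A-3

n = len(lead)
x_bar = mean(lead)
s = stdev(lead)                  # sample standard deviation (divides by n - 1)
E = T_CRIT * s / sqrt(n)
low, high = x_bar - E, x_bar + E

# How lopsided is the sample? Count the values below the sample mean.
below_mean = sum(value < x_bar for value in lead)

print(f"x_bar = {x_bar:.3f}, s = {s:.3f}")
print(f"{low:.3f} < mu < {high:.3f}")
print(f"{below_mean} of {n} values are below the mean")
```

The small difference from the calculator limits (-0.4705 and 3.5472) comes from rounding the critical value to three decimals.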
Lesson 6-5
Estimating the Population Variance σ²
What is variance?


Variance is built from deviations: the difference
between each observation and the mean.
Since the mean represents the “center of
gravity,” the sum of all deviations about the
mean must equal zero.
Population Variance
Population variance (σ²) of a variable is the sum of the
squared deviations about the population mean divided
by the number of observations in the population (N):
$\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$
Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
Assumptions


The sample is a simple random sample.
The population must have normally
distributed values (even if the sample is
large).
Chi-Square Distribution
$\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$
n = sample size
s² = sample variance
σ² = population variance
Properties of the Distribution
of the Chi-Square Statistics


The chi-square distribution is not symmetric, unlike the
normal and Student t distribution.
As the number of degrees of freedom increases, the
distribution becomes more symmetric.
Properties of the Distribution
of the Chi-Square Statistics



The values of chi-square can be zero or positive, but they
cannot be negative.
The chi-square distribution is different for each number of
degrees of freedom, which is df = n – 1 in this section. As
the number increases, the chi-square distribution
approaches a normal distribution.
In Table A-4, each critical value of $\chi^2$ corresponds to an
area given in the top row of the table, and that area
represents the total region located to the right of the
critical value.
Chi-Square Distribution with
Critical values
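Use Table A-4
[Figure: chi-square curve showing the left critical value $\chi^2_L = \chi^2_{1-\alpha/2}$ and the right critical value $\chi^2_R = \chi^2_{\alpha/2}$.]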
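Example – Page 355, #2
Find the critical values that correspond to the given
confidence level and sample size.
95%; n = 51
$\alpha = 1 - 0.95 = 0.05$, $\dfrac{\alpha}{2} = \dfrac{0.05}{2} = 0.025$
The Area to the Right: $\chi^2_{0.025} = 71.420$
The Area to the Left: $\chi^2_{1 - 0.025} = \chi^2_{0.975} = 32.357$
[Figure: chi-square curve with central area 0.95 and an area of 0.025 in each tail, bounded by $\chi^2_{0.975}$ on the left and $\chi^2_{0.025}$ on the right.]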
Estimators of σ²
The sample variance s² is the best point
estimate of the population variance σ².
Confidence Interval for the
Population Variance σ²
$\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$
$\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}}$
Example – Page 355, #6
Find the confidence interval. Use the given confidence level and sample
data to find a confidence interval for the population standard deviation.
In each case assume that a simple random sample has been selected
from population that has a normal distribution.
Ages of drivers occupying the passing lane while driving 25 mi/h with
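the left signal flashing: 99% confidence; n = 27, $\bar{x} = 80.5$ years,
s = 4.6 years
$\alpha = 1 - 0.99 = 0.01$, $\dfrac{\alpha}{2} = \dfrac{0.01}{2} = 0.005$
$\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}}$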
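Example – Page 355, #6
$\alpha = 1 - 0.99 = 0.01$, $\dfrac{\alpha}{2} = \dfrac{0.01}{2} = 0.005$
$\chi^2_{0.005} = 48.290$
$\chi^2_{0.995} = 11.160$
$\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}}$
$\sqrt{\dfrac{(27 - 1)4.6^2}{48.290}} < \sigma < \sqrt{\dfrac{(27 - 1)4.6^2}{11.160}}$
$3.4 \text{ years} < \sigma < 7.0 \text{ years}$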
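The interval for σ can be sketched as a small function. `sigma_interval` is a hypothetical name, and both chi-square critical values are passed in from Table A-4, since this sketch does not compute them.

```python
from math import sqrt

def sigma_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for sigma from table chi-square critical values.

    chi2_right is the larger critical value (area alpha/2 to its right);
    chi2_left is the smaller one (area 1 - alpha/2 to its right).
    """
    sum_sq = (n - 1) * s ** 2
    return sqrt(sum_sq / chi2_right), sqrt(sum_sq / chi2_left)

# Page 355, #6: n = 27, s = 4.6 years, 99% confidence (df = 26).
low, high = sigma_interval(n=27, s=4.6, chi2_right=48.290, chi2_left=11.160)
print(f"{low:.1f} years < sigma < {high:.1f} years")
```

Note that the larger critical value produces the lower limit, because it sits in the denominator.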
Procedure for Constructing a
Confidence Interval for σ




Identify the population of interest and the
parameter you want to draw conclusions about.
Choose the appropriate inference procedure. Verify
the conditions for using the selected procedure.
Carry out the inference.
Interpret your results in the context of the problem.
Example – Page 356, #14
A container of car antifreeze is supposed to hold 3785 mL of the liquid.
Realizing that fluctuations are inevitable, the quality-control manager
wants to be quite sure that the standard deviation is less than 30 mL.
Otherwise, some containers would overflow while others would not have
enough of the coolant. She selects a simple random sample, with the
results given here. Use these sample results to construct the 99%
confidence interval for the true value of σ. Does this confidence
interval suggest that the fluctuations are at an acceptable level?
3761 3861 3769 3772 3675 3861
3888 3819 3788 3800 3720 3748
3753 3821 3811 3740 3740 3839
$n = 18$
$\bar{x} = 3787.0$
$s = 55.4$
Example – Page 356, #14
Step 1 – Identify the population of interest and the parameter
you want to draw conclusions about.
σ = standard deviation of the volume of antifreeze in the containers.
Step 2 – Choose the appropriate inference procedure. Verify
conditions for using the selected procedure.
Use a chi-square interval
Conditions
• Question stated SRS
• The histogram is approximately normal.
Example – Page 356, #14
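Step 3 – Carry out the inference procedure
$n = 18$, $\bar{x} = 3787.0$, $s = 55.4$, CL = 99%
$\alpha = 1 - 0.99 = 0.01$, $\dfrac{\alpha}{2} = 0.005$
$\chi^2_{0.005} = 35.718$
$\chi^2_{0.995} = 5.697$
$\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}}$
$\sqrt{\dfrac{(18 - 1)55.4^2}{35.718}} < \sigma < \sqrt{\dfrac{(18 - 1)55.4^2}{5.697}}$
$38.2 \text{ mL} < \sigma < 95.7 \text{ mL}$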
Example – Page 356, #14
Step 4 – Interpret your results in the context of the problem.
We are 99% confident that the standard deviation of the
antifreeze volume is between 38.2 mL and 95.7 mL.
No, the interval indicates 99% confidence that σ > 30 mL
(the fluctuations appear to be too high).
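The antifreeze example can be recomputed from the 18 raw measurements. This is a sketch under the same assumptions as the worked solution: the chi-square critical values for df = 17 are copied from the table lookup in the notes, and the 30 mL target is the manager's requirement. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Container volumes in mL, as listed in the exercise.
volumes = [3761, 3861, 3769, 3772, 3675, 3861,
           3888, 3819, 3788, 3800, 3720, 3748,
           3753, 3821, 3811, 3740, 3740, 3839]
CHI2_RIGHT = 35.718  # chi-square_{0.005}, df = 17, from Table A-4
CHI2_LEFT = 5.697    # chi-square_{0.995}, df = 17, from Table A-4
TARGET_SIGMA = 30    # mL, the manager's requirement

n = len(volumes)
x_bar = mean(volumes)
s = stdev(volumes)                 # sample standard deviation
sum_sq = (n - 1) * s ** 2
low, high = sqrt(sum_sq / CHI2_RIGHT), sqrt(sum_sq / CHI2_LEFT)

# Acceptable only if the whole interval lies below the target.
acceptable = high < TARGET_SIGMA

print(f"n = {n}, x_bar = {x_bar:.1f}, s = {s:.1f}")
print(f"{low:.1f} mL < sigma < {high:.1f} mL; acceptable: {acceptable}")
```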