Chapter 6
Confidence Intervals
§ 6.1
Confidence Intervals for
the Mean (Large
Samples)
Point Estimate for Population µ
A point estimate is a single value estimate for a population
parameter. The most unbiased point estimate of the population
mean, µ, is the sample mean, \(\bar{x}\).
Example:
A random sample of 32 textbook prices (rounded to the nearest
dollar) is taken from a local college bookstore. Find a point
estimate for the population mean, µ.
34   34   38   45   45   45   45   54
56   65   65   66   67   67   68   74
79   86   87   87   87   88   90   90
94   95   96   98   98  101  110  121
\(\bar{x} \approx 74.22\)
The point estimate for the population mean price of textbooks in the
bookstore is $74.22.
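The arithmetic behind this point estimate can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names (`prices`, `x_bar`, `s`) are illustrative and not part of the slides. It also computes the sample standard deviation, dividing by n - 1.

```python
import math

# Textbook prices from the example (32 values, in dollars).
prices = [
    34, 34, 38, 45, 45, 45, 45, 54,
    56, 65, 65, 66, 67, 67, 68, 74,
    79, 86, 87, 87, 87, 88, 90, 90,
    94, 95, 96, 98, 98, 101, 110, 121,
]

n = len(prices)
x_bar = sum(prices) / n  # point estimate of the population mean

# Sample standard deviation: divide the sum of squared deviations by n - 1.
s = math.sqrt(sum((x - x_bar) ** 2 for x in prices) / (n - 1))

print(f"n = {n}, x_bar = {x_bar:.2f}, s = {s:.2f}")
```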
Larson & Farber, Elementary Statistics: Picturing the World, 3e
Interval Estimate
An interval estimate is an interval, or range of values, used to
estimate a population parameter.
[Figure: number line showing the point estimate 74.22 for textbook prices as a single point inside an interval estimate.]
How confident do we want to be that the interval estimate contains
the population mean, µ?
Level of Confidence
The level of confidence c is the probability that the interval
estimate contains the population parameter.
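c is the area beneath the normal curve between the critical values, \(-z_c\) and \(z_c\).
The remaining area in the tails is 1 – c, so each tail has area \(\frac{1}{2}(1 - c)\).
Use the Standard Normal Table to find the corresponding z-scores.
[Figure: standard normal curve centered at z = 0, with area c between \(-z_c\) and \(z_c\) and area \(\frac{1}{2}(1 - c)\) in each tail.]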
Common Levels of Confidence
If the level of confidence is 90%, this means that we are 90%
confident that the interval contains the population mean, µ.
[Figure: standard normal curve with area 0.90 between \(-z_c = -1.645\) and \(z_c = 1.645\), and area 0.05 in each tail.]
The corresponding z-scores are ±1.645.
Common Levels of Confidence
If the level of confidence is 95%, this means that we are 95%
confident that the interval contains the population mean, µ.
[Figure: standard normal curve with area 0.95 between \(-z_c = -1.96\) and \(z_c = 1.96\), and area 0.025 in each tail.]
The corresponding z-scores are ±1.96.
Common Levels of Confidence
If the level of confidence is 99%, this means that we are 99%
confident that the interval contains the population mean, µ.
[Figure: standard normal curve with area 0.99 between \(-z_c = -2.575\) and \(z_c = 2.575\), and area 0.005 in each tail.]
The corresponding z-scores are ±2.575.
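The critical values for the 90%, 95%, and 99% levels can also be computed instead of read from a table. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical. Because each tail holds (1 - c)/2, the area to the left of \(z_c\) is (1 + c)/2. Note that the computed value for 99% is about 2.576, while the slides use the table value 2.575.

```python
from statistics import NormalDist

def z_critical(c):
    """Return z_c such that the area between -z_c and z_c equals c."""
    # Area to the left of z_c is c plus one tail: c + (1 - c) / 2 = (1 + c) / 2.
    return NormalDist().inv_cdf((1 + c) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}: z_c = {z_critical(c):.3f}")
```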
Margin of Error
The difference between the point estimate and the actual population
parameter value is called the sampling error.
When µ is estimated, the sampling error is the difference \(\mu - \bar{x}\).
Since µ is usually unknown, the maximum value for the error can be
calculated using the level of confidence.
Given a level of confidence, the margin of error (sometimes called
the maximum error of estimate or error tolerance) E is the greatest
possible distance between the point estimate and the value of the
parameter it is estimating.
\[ E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}} \]
When n ≥ 30, the sample standard deviation, s, can be used for σ.
Margin of Error
Example:
A random sample of 32 textbook prices is taken from a local college
bookstore. The mean of the sample is \(\bar{x} = 74.22\), and the sample
standard deviation is s = 23.44.
Use a 95% confidence level and find the margin of error for the mean
price of all textbooks in the bookstore.
Since n ≥ 30, s can be substituted for σ.
\[ E = z_c \frac{\sigma}{\sqrt{n}} \approx 1.96 \cdot \frac{23.44}{\sqrt{32}} \approx 8.12 \]
We are 95% confident that the margin of error for the population
mean (all the textbooks in the bookstore) is about $8.12.
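The margin-of-error calculation for this example can be reproduced in a few lines. This is an illustrative sketch; the function name `margin_of_error` is an assumption, and s = 23.44 stands in for σ because n ≥ 30.

```python
import math

def margin_of_error(z_c, sigma, n):
    """E = z_c * sigma / sqrt(n)."""
    return z_c * sigma / math.sqrt(n)

# Textbook example: z_c = 1.96 (95%), s = 23.44 used for sigma, n = 32.
E = margin_of_error(1.96, 23.44, 32)
print(f"E = {E:.2f}")
```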
Confidence Intervals for µ
A c-confidence interval for the population mean µ is
\[ \bar{x} - E < \mu < \bar{x} + E \]
The probability that the confidence interval contains µ is c.
Example:
A random sample of 32 textbook prices is taken from a local
college bookstore. The mean of the sample is \(\bar{x} = 74.22\), the sample
standard deviation is s = 23.44, and the margin of error is E = 8.12.
Construct a 95% confidence interval for the mean price of all
textbooks in the bookstore.
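\(\bar{x} = 74.22\), s = 23.44, E = 8.12
Left endpoint: \(\bar{x} - E = 74.22 - 8.12 = 66.10\)
Right endpoint: \(\bar{x} + E = 74.22 + 8.12 = 82.34\)
[Figure: number line with the point estimate \(\bar{x} = 74.22\) midway between the left and right endpoints.]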
With 95% confidence we can say that the mean cost of all textbooks in
the bookstore is between $66.10 and $82.34.
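Forming the interval from the point estimate and the margin of error is a one-line computation. The sketch below uses a hypothetical helper, `confidence_interval`, with the values from the textbook example.

```python
def confidence_interval(x_bar, E):
    """Return (left endpoint, right endpoint) = (x_bar - E, x_bar + E)."""
    return x_bar - E, x_bar + E

left, right = confidence_interval(74.22, 8.12)
print(f"{left:.2f} < mu < {right:.2f}")
```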
Finding Confidence Intervals for µ
Finding a Confidence Interval for a Population Mean
(n ≥ 30 or σ known with a normally distributed population)
1. Find the sample statistics n and \(\bar{x}\).
   In symbols: \(\bar{x} = \frac{\sum x}{n}\)
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ.
   In symbols: \(s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}\)
3. Find the critical value \(z_c\) that corresponds to the given level of confidence.
   In symbols: Use the Standard Normal Table.
4. Find the margin of error E.
   In symbols: \(E = z_c \frac{\sigma}{\sqrt{n}}\)
5. Find the left and right endpoints and form the confidence interval.
   In symbols: Left endpoint: \(\bar{x} - E\). Right endpoint: \(\bar{x} + E\). Interval: \(\bar{x} - E < \mu < \bar{x} + E\)
Confidence Intervals for µ (σ Known)
Example:
A random sample of 25 students had a grade point average with a
mean of 2.86. Past studies have shown that the standard deviation
is 0.15 and the population is normally distributed.
Construct a 90% confidence interval for the population mean grade
point average.
n = 25, \(\bar{x} = 2.86\), σ = 0.15, \(z_c = 1.645\)
\[ E = z_c \frac{\sigma}{\sqrt{n}} = 1.645 \cdot \frac{0.15}{\sqrt{25}} \approx 0.05 \]
\(\bar{x} \pm E = 2.86 \pm 0.05\)
\(2.81 < \mu < 2.91\)
With 90% confidence we can say that the mean grade point average for all
students in the population is between 2.81 and 2.91.
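The whole procedure for a known σ (critical value, margin of error, endpoints) can be sketched as a single function. This assumes Python's standard-library `statistics.NormalDist` for the critical value; the name `z_interval` is hypothetical and not from the slides.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, c):
    """Confidence interval for the mean when sigma is known (or n >= 30)."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # critical value for level c
    E = z_c * sigma / math.sqrt(n)           # margin of error
    return x_bar - E, x_bar + E

# Grade point average example: x_bar = 2.86, sigma = 0.15, n = 25, c = 0.90.
low, high = z_interval(2.86, 0.15, 25, 0.90)
print(f"{low:.2f} < mu < {high:.2f}")
```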
Sample Size
Given a c-confidence level and a maximum error of estimate, E,
the minimum sample size n, needed to estimate µ, the population
mean, is
\[ n = \left( \frac{z_c \sigma}{E} \right)^2 . \]
If σ is unknown, you can estimate it using s provided you have a
preliminary sample with at least 30 members.
Example:
You want to estimate the mean price of all the textbooks in the
college bookstore. How many books must be included in your
sample if you want to be 99% confident that the sample mean is
within $5 of the population mean?
\(\bar{x} = 74.22\), σ ≈ s = 23.44, \(z_c = 2.575\)
\[ n = \left( \frac{z_c \sigma}{E} \right)^2 = \left( \frac{2.575 \cdot 23.44}{5} \right)^2 \approx 145.7 \]
(Always round up.)
You should include at least 146 books in your sample.
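The sample-size formula, including the "always round up" rule, can be sketched as follows. The function name `min_sample_size_mean` is an assumption for illustration; `math.ceil` does the rounding up.

```python
import math

def min_sample_size_mean(z_c, sigma, E):
    """Smallest n with z_c * sigma / sqrt(n) <= E, i.e. ceil((z_c*sigma/E)^2)."""
    return math.ceil((z_c * sigma / E) ** 2)

# Textbook example: 99% confidence (z_c = 2.575), sigma ~ s = 23.44, E = 5.
n_required = min_sample_size_mean(2.575, 23.44, 5)
print(n_required)
```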
§ 6.2
Confidence Intervals for
the Mean (Small
Samples)
The t-Distribution
When the sample size is less than 30 and the random variable x is
approximately normally distributed, the statistic
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
follows a t-distribution.
Properties of the t-distribution
1. The t-distribution is bell shaped and symmetric about the mean.
2. The t-distribution is a family of curves, each determined by a parameter
called the degrees of freedom. The degrees of freedom are the number of
free choices left after a sample statistic such as \(\bar{x}\) is calculated. When you use
a t-distribution to estimate a population mean, the degrees of freedom are
equal to one less than the sample size.
Degrees of freedom: d.f. = n – 1
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are equal to zero.
5. As the degrees of freedom increase, the t-distribution approaches the normal
distribution. After 30 d.f., the t-distribution is very close to the standard normal
z-distribution.
The tails in the t-distribution are “thicker” than those in the standard
normal distribution.
[Figure: t-curves for d.f. = 2 and d.f. = 5 plotted against the standard normal curve, all centered at t = 0.]
Critical Values of t
Example:
Find the critical value \(t_c\) for a 95% confidence level when the sample
size is 5.
Appendix B: Table 5: t-Distribution
Level of confidence, c    0.50    0.80    0.90    0.95    0.98
One tail, α               0.25    0.10    0.05    0.025   0.01
Two tails, α              0.50    0.20    0.10    0.05    0.02
d.f. = 1                  1.000   3.078   6.314  12.706  31.821
d.f. = 2                   .816   1.886   2.920   4.303   6.965
d.f. = 3                   .765   1.638   2.353   3.182   4.541
d.f. = 4                   .741   1.533   2.132   2.776   3.747
d.f. = 5                   .727   1.476   2.015   2.571   3.365
d.f. = n – 1 = 5 – 1 = 4 and c = 0.95, so \(t_c = 2.776\).
95% of the area under the t-distribution curve with 4 degrees of
freedom lies between t = ±2.776.
[Figure: t-curve with area c = 0.95 between \(-t_c = -2.776\) and \(t_c = 2.776\).]
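The table value can be cross-checked numerically. Python's standard library has no inverse t-distribution, so the sketch below swaps in a simple numerical approach: it integrates the t density with Simpson's rule and finds \(t_c\) by bisection. All function names here are hypothetical, and this is a rough check under those assumptions rather than how the table was produced.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))

def central_area(t, df, steps=2000):
    """Area under the t-curve between -t and t (Simpson's rule, steps even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # the curve is symmetric, so double the [0, t] area

def t_critical(c, df):
    """Find t_c whose central area equals c, by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < c:  # widen until the bracket contains t_c
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"t_c = {t_critical(0.95, 4):.3f}")
```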
Confidence Intervals and t-Distributions
Constructing a Confidence Interval for the Mean: t-Distribution
1. Identify the sample statistics n, \(\bar{x}\), and s.
   In symbols: \(\bar{x} = \frac{\sum x}{n}\), \(s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}\)
2. Identify the degrees of freedom, the level of confidence c, and the critical value \(t_c\).
   In symbols: d.f. = n – 1
3. Find the margin of error E.
   In symbols: \(E = t_c \frac{s}{\sqrt{n}}\)
4. Find the left and right endpoints and form the confidence interval.
   In symbols: Left endpoint: \(\bar{x} - E\). Right endpoint: \(\bar{x} + E\). Interval: \(\bar{x} - E < \mu < \bar{x} + E\)
Constructing a Confidence Interval
Example:
In a random sample of 20 customers at a local fast food restaurant,
the mean waiting time to order is 95 seconds, and the standard
deviation is 21 seconds. Assume the wait times are normally
distributed and construct a 90% confidence interval for the mean wait
time of all customers.
n = 20, \(\bar{x} = 95\), s = 21, d.f. = 19, \(t_c = 1.729\)
\[ E = t_c \frac{s}{\sqrt{n}} = 1.729 \cdot \frac{21}{\sqrt{20}} \approx 8.1 \]
\(\bar{x} \pm E = 95 \pm 8.1\)
\(86.9 < \mu < 103.1\)
We are 90% confident that the mean wait time for all customers is
between 86.9 and 103.1 seconds.
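The wait-time interval can be reproduced with the critical value taken from the t-table. This is a minimal sketch; `t_interval` is a hypothetical helper, and \(t_c = 1.729\) is supplied by hand rather than computed.

```python
import math

def t_interval(x_bar, s, n, t_c):
    """Return (left, right, E) using E = t_c * s / sqrt(n)."""
    E = t_c * s / math.sqrt(n)
    return x_bar - E, x_bar + E, E

# Wait-time example: x_bar = 95, s = 21, n = 20, t_c = 1.729 (d.f. = 19, c = 0.90).
low, high, E = t_interval(95, 21, 20, 1.729)
print(f"E = {E:.1f}; {low:.1f} < mu < {high:.1f}")
```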
Normal or t-Distribution?
Is n ≥ 30?
  Yes: Use the normal distribution with \(E = z_c \frac{\sigma}{\sqrt{n}}\). If σ is unknown, use s instead.
  No: Is the population normally, or approximately normally, distributed?
    No: You cannot use the normal distribution or the t-distribution.
    Yes: Is σ known?
      Yes: Use the normal distribution with \(E = z_c \frac{\sigma}{\sqrt{n}}\).
      No: Use the t-distribution with \(E = t_c \frac{s}{\sqrt{n}}\) and n – 1 degrees of freedom.
Normal or t-Distribution?
Example:
Determine whether to use the normal distribution, the t-distribution,
or neither.
a.) n = 50, the distribution is skewed, s = 2.5
The normal distribution would be used because the sample size is
50.
b.) n = 25, the distribution is skewed, s = 52.9
Neither distribution would be used because n < 30 and the
distribution is skewed.
c.) n = 25, the distribution is normal, σ = 4.12
The normal distribution would be used because although n < 30,
the population standard deviation is known.
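The decision rule applied in these three cases can be written as a small function. This is a sketch of the flowchart logic only; the function name and the string labels it returns are assumptions made for illustration.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Return 'normal', 't', or 'neither' following the n >= 30 flowchart."""
    if n >= 30:
        return "normal"   # large sample: use z_c (s may stand in for sigma)
    if not population_normal:
        return "neither"  # small sample from a non-normal population
    return "normal" if sigma_known else "t"

print(choose_distribution(50, False, False))  # case a
print(choose_distribution(25, False, False))  # case b
print(choose_distribution(25, True, True))    # case c
```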
§ 6.3
Confidence Intervals for
Population Proportions
Point Estimate for Population p
The probability of success in a single trial of a binomial experiment
is p. This probability is a population proportion.
The point estimate for p, the population proportion of successes, is
given by the proportion of successes in a sample and is denoted by
\[ \hat{p} = \frac{x}{n} \]
where x is the number of successes in the sample and n is the
number in the sample. The point estimate for the proportion of
failures is \(\hat{q} = 1 - \hat{p}\). The symbols \(\hat{p}\) and \(\hat{q}\) are read as “p hat” and
“q hat.”
Point Estimate for Population p
Example:
In a survey of 1250 US adults, 450 of them said that their favorite
sport to watch is baseball. Find a point estimate for the population
proportion of US adults who say their favorite sport to watch is
baseball.
n = 1250, x = 450
\[ \hat{p} = \frac{x}{n} = \frac{450}{1250} = 0.36 \]
The point estimate for the proportion of US adults who say
baseball is their favorite sport to watch is 0.36, or 36%.
Confidence Intervals for p
A c-confidence interval for the population proportion p is
\[ \hat{p} - E < p < \hat{p} + E \]
where
\[ E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} . \]
The probability that the confidence interval contains p is c.
Example:
Construct a 90% confidence interval for the proportion of US adults
who say baseball is their favorite sport to watch.
n = 1250
x = 450
p̂ = 0.36
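\(\hat{q} = 1 - \hat{p} = 0.64\)
\[ E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.645 \sqrt{\frac{(0.36)(0.64)}{1250}} \approx 0.022 \]
Left endpoint: \(\hat{p} - E = 0.36 - 0.022 = 0.338\)
Right endpoint: \(\hat{p} + E = 0.36 + 0.022 = 0.382\)
[Figure: number line with the point estimate \(\hat{p} = 0.36\) midway between the left and right endpoints.]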
With 90% confidence we can say that the proportion of all US
adults who say baseball is their favorite sport to watch is between
33.8% and 38.2%.
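The baseball survey interval can be recomputed from the raw counts. The following is an illustrative sketch; `proportion_interval` is a hypothetical name, and \(z_c = 1.645\) is the 90% critical value used in the example.

```python
import math

def proportion_interval(x, n, z_c):
    """Return (p_hat, E, low, high) for a population proportion."""
    p_hat = x / n                            # point estimate of p
    q_hat = 1 - p_hat                        # point estimate of 1 - p
    E = z_c * math.sqrt(p_hat * q_hat / n)   # margin of error
    return p_hat, E, p_hat - E, p_hat + E

p_hat, E, low, high = proportion_interval(450, 1250, 1.645)
print(f"p_hat = {p_hat:.2f}, E = {E:.3f}, {low:.3f} < p < {high:.3f}")
```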
Finding Confidence Intervals for p
Constructing a Confidence Interval for a Population Proportion
1. Identify the sample statistics n and x.
2. Find the point estimate \(\hat{p}\).
   In symbols: \(\hat{p} = \frac{x}{n}\)
3. Verify that the sampling distribution can be approximated by the normal distribution.
   In symbols: \(n\hat{p} \geq 5\), \(n\hat{q} \geq 5\)
4. Find the critical value \(z_c\) that corresponds to the given level of confidence.
   In symbols: Use the Standard Normal Table.
5. Find the margin of error E.
   In symbols: \(E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}\)
6. Find the left and right endpoints and form the confidence interval.
   In symbols: Left endpoint: \(\hat{p} - E\). Right endpoint: \(\hat{p} + E\). Interval: \(\hat{p} - E < p < \hat{p} + E\)
Sample Size
Given a c-confidence level and a margin of error, E, the minimum
sample size n, needed to estimate p is
\[ n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 . \]
This formula assumes you have an estimate for \(\hat{p}\) and \(\hat{q}\).
If not, use \(\hat{p} = 0.5\) and \(\hat{q} = 0.5\).
Example:
You wish to estimate, with 95% confidence and within 2% of the true
population proportion, the proportion of US adults who say that baseball
is their favorite sport to watch. Find the minimum sample size needed.
n = 1250, x = 450, \(\hat{p} = 0.36\)
\[ n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.36)(0.64) \left( \frac{1.96}{0.02} \right)^2 \approx 2212.8 \]
(Always round up.)
You should sample at least 2213 adults to be 95% confident.
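The same calculation can be sketched in code. The function name `min_sample_size_proportion` is an assumption; its `p_hat` parameter defaults to 0.5, matching the fallback described for the case where no preliminary estimate is available.

```python
import math

def min_sample_size_proportion(z_c, E, p_hat=0.5):
    """Minimum n = p_hat * q_hat * (z_c / E)^2, rounded up."""
    q_hat = 1 - p_hat
    return math.ceil(p_hat * q_hat * (z_c / E) ** 2)

# Baseball example: 95% confidence (z_c = 1.96), E = 0.02, p_hat = 0.36.
n_required = min_sample_size_proportion(1.96, 0.02, p_hat=0.36)
print(n_required)
```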
§ 6.4
Confidence Intervals for
Variance and Standard
Deviation
The Chi-Square Distribution
The point estimate for σ² is s², and the point estimate for σ is s. s²
is the most unbiased estimate for σ².
You can use the chi-square distribution to construct a confidence
interval for the variance and standard deviation.
If the random variable x has a normal distribution, then the
distribution of
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} \]
forms a chi-square distribution for samples of any size n > 1.
The Chi-Square Distribution
Four properties of the chi-square distribution are as follows.
1. All chi-square values χ² are greater than or equal to zero.
2. The chi-square distribution is a family of curves, each determined by
the degrees of freedom. To form a confidence interval for σ², use the
χ²-distribution with degrees of freedom equal to one less than the
sample size.
3. The area under each curve of the chi-square distribution equals one.
4. Chi-square distributions are positively skewed.
Critical Values for χ²
There are two critical values for each level of confidence. The value
\(\chi_R^2\) represents the right-tail critical value and \(\chi_L^2\) represents the left-tail critical value.
Area to the right of \(\chi_R^2\): \(\frac{1 - c}{2}\)
Area to the right of \(\chi_L^2\): \(1 - \frac{1 - c}{2} = \frac{1 + c}{2}\)
The area between the left and right critical values is c.
[Figure: chi-square curves showing area \(\frac{1 - c}{2}\) in each tail and area c between \(\chi_L^2\) and \(\chi_R^2\).]
Critical Values for χ²
Example:
Find the critical values \(\chi_R^2\) and \(\chi_L^2\) for 80% confidence when the
sample size is 18.
Because the sample size is 18, there are
d.f. = n – 1 = 18 – 1 = 17 degrees of freedom.
Area to the right of \(\chi_R^2\) = \(\frac{1 - c}{2} = \frac{1 - 0.8}{2} = 0.1\)
Area to the right of \(\chi_L^2\) = \(\frac{1 + c}{2} = \frac{1 + 0.8}{2} = 0.9\)
Use the Chi-square distribution table to find the critical values.
Appendix B: Table 6: χ²-Distribution
Degrees of                               α
freedom      0.995    0.99     0.975    0.95     0.90     0.10     0.05
 1             -        -      0.001    0.004    0.016    2.706    3.841
 2           0.010    0.020    0.051    0.103    0.211    4.605    5.991
 3           0.072    0.115    0.216    0.352    0.584    6.251    7.815
...
16           5.142    5.812    6.908    7.962    9.312   23.542   26.296
17           5.697    6.408    7.564    8.672   10.085   24.769   27.587
18           6.265    7.015    8.231    9.390   10.865   25.989   28.869
(A hyphen marks a cell with no value shown.)
\(\chi_R^2 = 24.769\)
\(\chi_L^2 = 10.085\)
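These two table values can be cross-checked numerically. The Python standard library has no inverse chi-square function, so this sketch substitutes Simpson's-rule integration of the chi-square density with bisection on the area to the left. The function names are hypothetical, and the sketch assumes at least 3 degrees of freedom so that the density is zero at x = 0.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution (assumes df >= 3)."""
    if x <= 0:
        return 0.0
    log_pdf = ((df / 2 - 1) * math.log(x) - x / 2
               - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_pdf)

def area_left(x, df, steps=4000):
    """Area under the curve on [0, x] (Simpson's rule, steps even)."""
    h = x / steps
    total = chi2_pdf(0.0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return total * h / 3

def chi2_critical(area_right, df):
    """Value with the given area to its right, by bracketing and bisection."""
    target = 1 - area_right           # equivalent area to the left
    lo, hi = 0.0, 1.0
    while area_left(hi, df) < target:  # widen until the bracket contains it
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if area_left(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 80% confidence, d.f. = 17: right-tail areas 0.1 and 0.9.
print(f"chi2_R = {chi2_critical(0.1, 17):.3f}")
print(f"chi2_L = {chi2_critical(0.9, 17):.3f}")
```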
Confidence Intervals for σ² and σ
A c-confidence interval for a population variance and standard
deviation is as follows.
Confidence Interval for σ²:
\[ \frac{(n - 1)s^2}{\chi_R^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_L^2} \]
Confidence Interval for σ:
\[ \sqrt{\frac{(n - 1)s^2}{\chi_R^2}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi_L^2}} \]
The probability that the confidence intervals contain σ² or σ is c.
Confidence Intervals for σ² and σ
Constructing a Confidence Interval for a Variance and a Standard
Deviation
1. Verify that the population has a normal distribution.
2. Identify the sample statistic n and the degrees of freedom.
   In symbols: d.f. = n – 1
3. Find the point estimate s².
   In symbols: \(s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}\)
4. Find the critical values \(\chi_R^2\) and \(\chi_L^2\) that correspond to the given level of confidence.
   In symbols: Use Table 6 in Appendix B.
5. Find the left and right endpoints and form the confidence interval.
   In symbols: \(\frac{(n - 1)s^2}{\chi_R^2} < \sigma^2 < \frac{(n - 1)s^2}{\chi_L^2}\)
6. Find the confidence interval for the population standard deviation by taking the square root of each endpoint.
   In symbols: \(\sqrt{\frac{(n - 1)s^2}{\chi_R^2}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi_L^2}}\)
Constructing a Confidence Interval
Example:
You randomly select and weigh 41 samples of 16-ounce bags of
potato chips. The sample standard deviation is 0.05 ounce.
Assuming the weights are normally distributed, construct a 90%
confidence interval for the population standard deviation.
d.f. = n – 1 = 41 – 1 = 40 degrees of freedom
Area to the right of \(\chi_R^2\) = \(\frac{1 - c}{2} = \frac{1 - 0.9}{2} = 0.05\)
Area to the right of \(\chi_L^2\) = \(\frac{1 + c}{2} = \frac{1 + 0.9}{2} = 0.95\)
The critical values are \(\chi_R^2 = 55.758\) and \(\chi_L^2 = 26.509\).
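\(\chi_L^2 = 26.509\), \(\chi_R^2 = 55.758\)
Left endpoint: \(\sqrt{\frac{(n - 1)s^2}{\chi_R^2}} = \sqrt{\frac{(41 - 1)(0.05)^2}{55.758}} \approx 0.04\)
Right endpoint: \(\sqrt{\frac{(n - 1)s^2}{\chi_L^2}} = \sqrt{\frac{(41 - 1)(0.05)^2}{26.509}} \approx 0.06\)
\(0.04 < \sigma < 0.06\)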
With 90% confidence we can say that the population standard
deviation is between 0.04 and 0.06 ounces.
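The endpoints for the potato chip example can be verified with the critical values read from the chi-square table. This is a minimal sketch; `sigma_interval` is a hypothetical helper, and the critical values 55.758 and 26.509 are passed in by hand.

```python
import math

def sigma_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for sigma: square roots of the variance endpoints."""
    var_low = (n - 1) * s ** 2 / chi2_right   # left endpoint for sigma^2
    var_high = (n - 1) * s ** 2 / chi2_left   # right endpoint for sigma^2
    return math.sqrt(var_low), math.sqrt(var_high)

# n = 41, s = 0.05, chi2_R = 55.758, chi2_L = 26.509 (90% confidence).
low, high = sigma_interval(41, 0.05, 55.758, 26.509)
print(f"{low:.2f} < sigma < {high:.2f}")
```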