Download z - Mater Academy Charter Middle/ High

Chapter 6
CONFIDENCE INTERVALS
Section 6.1
CONFIDENCE INTERVALS FOR THE MEAN (LARGE SAMPLES)
Point Estimate for Population μ
Point Estimate
A single value estimate for a population parameter
The most unbiased point estimate of the population mean μ is the sample mean x̄.
Estimate Population Parameter… | with Sample Statistic
Mean: μ | x̄
Example: Point Estimate for Population μ
Market researchers use the number of sentences per advertisement
as a measure of readability for magazine advertisements. The
following represents a random sample of the number of sentences
found in 50 advertisements. Find a point estimate of the
population mean, μ. (Source: Journal of Advertising Research)
9 20 18 16 9 9 11 13 22 16 5 18 6 6 5 12 25
17 23 7 10 9 10 10 5 11 18 18 9 9 17 13 11 7
14 6 11 12 11 6 12 14 11 9 18 12 12 17 11 20
Solution: Point Estimate for Population μ
The sample mean of the data is
$\bar{x} = \frac{\sum x}{n} = \frac{620}{50} = 12.4$
Your point estimate for the mean length of all magazine
advertisements is 12.4 sentences.
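The calculation above can be checked with a short Python sketch. The list below is the 50 values from the example; the variable names (`sentences`, `x_bar`) are only illustrative.

```python
# Number of sentences in each of the 50 sampled advertisements (from the example).
sentences = [
    9, 20, 18, 16, 9, 9, 11, 13, 22, 16, 5, 18, 6, 6, 5, 12, 25,
    17, 23, 7, 10, 9, 10, 10, 5, 11, 18, 18, 9, 9, 17, 13, 11, 7,
    14, 6, 11, 12, 11, 6, 12, 14, 11, 9, 18, 12, 12, 17, 11, 20,
]

n = len(sentences)       # sample size
total = sum(sentences)   # sum of all observations
x_bar = total / n        # sample mean, the point estimate of the population mean

print(n, total, x_bar)
```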
Interval Estimate
Interval estimate
An interval, or range of values, used to estimate a population
parameter.
[Figure: number line showing the point estimate 12.4 as a dot inside the parentheses of an interval estimate.]
How confident do we want to be that the interval
estimate contains the population mean μ?
Level of Confidence
Level of confidence c
The probability that the interval estimate contains the population parameter.
c is the area under the standard normal curve between the critical values –zc and zc.
The remaining area in the tails is 1 – c, so each tail has area ½(1 – c).
Use the Standard Normal Table to find the corresponding z-scores.
[Figure: standard normal curve centered at z = 0, with area c between –zc and zc and area ½(1 – c) in each tail.]
Level of Confidence
If the level of confidence is 90%, this means that we are 90%
confident that the interval contains the population mean μ.
c = 0.90, so each tail has area ½(1 – c) = 0.05.
The corresponding z-scores are ±1.645: –zc = –1.645 and zc = 1.645.
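As an alternative to the Standard Normal Table, the critical value can be computed numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(c):
    """Return z_c such that the area between -z_c and z_c is c."""
    tail = (1 - c) / 2                      # area in each tail, (1 - c) / 2
    return NormalDist().inv_cdf(1 - tail)   # z-score with area `tail` to its right

z_90 = critical_z(0.90)
z_95 = critical_z(0.95)
print(round(z_90, 3), round(z_95, 3))
```

For c = 0.90 this agrees with the table value 1.645 to three decimal places.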
Sampling Error
Sampling error
The difference between the point estimate and
the actual population parameter value.
For μ:
◦ the sampling error is the difference x̄ – μ
◦ μ is generally unknown
◦ x̄ varies from sample to sample
Margin of Error
Margin of error
The greatest possible distance between the point estimate and
the value of the parameter it is estimating for a given level of
confidence, c.
Denoted by E.
$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$
When n ≥ 30, the sample standard deviation, s, can be used for σ.
Sometimes called the maximum error of estimate or error
tolerance.
Example: Finding the Margin of Error
Use the magazine advertisement data and a 95%
confidence level to find the margin of error for the
mean number of sentences in all magazine
advertisements. Assume the sample standard
deviation is about 5.0.
Solution: Finding the Margin of Error
First find the critical values. With c = 0.95, each tail has area 0.025, so –zc = –1.96 and zc = 1.96.
95% of the area under the standard normal curve
falls within 1.96 standard deviations of the mean.
(You can approximate the distribution of the sample
means with a normal curve by the Central Limit
Theorem, because n ≥ 30.)
Solution: Finding the Margin of Error
You don’t know σ, but since n ≥ 30, you can use s in place of σ.
$E = z_c \frac{\sigma}{\sqrt{n}} \approx z_c \frac{s}{\sqrt{n}} = 1.96 \cdot \frac{5.0}{\sqrt{50}} \approx 1.4$
You are 95% confident that the margin of error for
the population mean is about 1.4 sentences.
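A minimal sketch of the same margin-of-error arithmetic, using zc = 1.96, s = 5.0, and n = 50 from the example. The function name `margin_of_error` is an illustrative choice, not something defined in the slides.

```python
import math

def margin_of_error(z_c, sd, n):
    """Margin of error E = z_c * sd / sqrt(n) for a population mean."""
    return z_c * sd / math.sqrt(n)

# Values from the magazine advertisement example; s stands in for sigma since n >= 30.
E = margin_of_error(1.96, 5.0, 50)
print(round(E, 1))
```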
Confidence Intervals for the Population Mean
A c-confidence interval for the population mean μ
• $\bar{x} - E < \mu < \bar{x} + E$, where $E = z_c \frac{\sigma}{\sqrt{n}}$
• The probability that the confidence interval contains μ is c.
Constructing Confidence Intervals for μ
Finding a Confidence Interval for a Population Mean
(n ≥ 30 or σ known with a normally distributed population)
In Words | In Symbols
1. Find the sample statistics n and x̄. | $\bar{x} = \frac{\sum x}{n}$
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ. | $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
3. Find the critical value zc that corresponds to the given level of confidence. | Use the Standard Normal Table.
4. Find the margin of error E. | $E = z_c \frac{\sigma}{\sqrt{n}}$
5. Find the left and right endpoints and form the confidence interval. | Left endpoint: $\bar{x} - E$; Right endpoint: $\bar{x} + E$; Interval: $\bar{x} - E < \mu < \bar{x} + E$
Example: Constructing a Confidence Interval
Construct a 95% confidence interval for the mean number of
sentences in all magazine advertisements.
Solution: Recall x̄ = 12.4 and E = 1.4.
Left endpoint: $\bar{x} - E = 12.4 - 1.4 = 11.0$
Right endpoint: $\bar{x} + E = 12.4 + 1.4 = 13.8$
11.0 < μ < 13.8
Solution: Constructing a Confidence Interval
11.0 < μ < 13.8
[Figure: number line with the interval from 11.0 to 13.8 and the point estimate 12.4 marked inside it.]
With 95% confidence, you can say that the
population mean number of sentences is between
11.0 and 13.8.
Example: Constructing a Confidence Interval (σ Known)
A college admissions director wishes to estimate the mean age of
all students currently enrolled. In a random sample of 20 students,
the mean age is found to be 22.9 years. From past studies, the
standard deviation is known to be 1.5 years, and the population is
normally distributed. Construct a 90% confidence interval for the
population mean age.
Solution: Constructing a Confidence Interval (σ Known)
First find the critical values. With c = 0.90, each tail has area ½(1 – c) = 0.05, so –zc = –1.645 and zc = 1.645.
Solution: Constructing a Confidence Interval (σ Known)
• Margin of error: $E = z_c \frac{\sigma}{\sqrt{n}} = 1.645 \cdot \frac{1.5}{\sqrt{20}} \approx 0.6$
• Confidence interval:
Left endpoint: $\bar{x} - E = 22.9 - 0.6 = 22.3$
Right endpoint: $\bar{x} + E = 22.9 + 0.6 = 23.5$
22.3 < μ < 23.5
Solution: Constructing a Confidence Interval (σ Known)
22.3 < μ < 23.5
[Figure: number line with left endpoint x̄ – E = 22.3, point estimate x̄ = 22.9, and right endpoint x̄ + E = 23.5.]
With 90% confidence, you can say that the mean age
of all the students is between 22.3 and 23.5 years.
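The interval construction can be sketched in a few lines of Python. The helper `z_interval` is a hypothetical name; it keeps E unrounded, whereas the slides round E to one decimal place before forming the endpoints. For both examples so far the endpoints agree after rounding.

```python
import math

def z_interval(x_bar, sd, n, z_c):
    """Confidence interval for the mean using the normal distribution."""
    E = z_c * sd / math.sqrt(n)   # margin of error
    return x_bar - E, x_bar + E   # (left endpoint, right endpoint)

# Student ages: x_bar = 22.9, sigma = 1.5, n = 20, 90% confidence (z_c = 1.645).
ages_low, ages_high = z_interval(22.9, 1.5, 20, 1.645)

# Magazine advertisements: x_bar = 12.4, s = 5.0, n = 50, 95% confidence (z_c = 1.96).
ads_low, ads_high = z_interval(12.4, 5.0, 50, 1.96)

print(round(ages_low, 1), round(ages_high, 1))
print(round(ads_low, 1), round(ads_high, 1))
```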
Sample Size
Given a c-confidence level and a margin of error E, the
minimum sample size n needed to estimate the
population mean μ is
$n = \left( \frac{z_c \sigma}{E} \right)^2$
If σ is unknown, you can estimate it using s provided
you have a preliminary sample with at least 30
members.
Example: Sample Size
You want to estimate the mean number of sentences in a
magazine advertisement. How many magazine advertisements
must be included in the sample if you want to be 95% confident
that the sample mean is within one sentence of the population
mean? Assume the sample standard deviation is about 5.0.
Solution: Sample Size
First find the critical values. With c = 0.95, each tail has area 0.025, so –zc = –1.96 and zc = 1.96.
Solution: Sample Size
zc = 1.96, σ ≈ s = 5.0, E = 1
$n = \left( \frac{z_c \sigma}{E} \right)^2 \approx \left( \frac{1.96 \cdot 5.0}{1} \right)^2 = 96.04$
When necessary, round up to obtain a whole number.
You should include at least 97 magazine advertisements
in your sample.
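The sample-size formula and the round-up step can be sketched as follows. The function name is hypothetical, and `math.ceil` is used for the rounding; note that floating-point noise could round an exact whole-number result up by one, so treat this as an illustration rather than a robust tool.

```python
import math

def min_sample_size_mean(z_c, sd, E):
    """Smallest whole n with n >= (z_c * sd / E) ** 2."""
    return math.ceil((z_c * sd / E) ** 2)

# z_c = 1.96, s = 5.0 (standing in for sigma), E = 1 sentence.
n_required = min_sample_size_mean(1.96, 5.0, 1)
print(n_required)
```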
Section 6.2
CONFIDENCE INTERVALS FOR THE MEAN (SMALL SAMPLES)
The t-Distribution
When the population standard deviation is unknown, the sample
size is less than 30, and the random variable x is approximately
normally distributed, the statistic
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
follows a t-distribution.
Critical values of t are denoted by tc.
Properties of the t-Distribution
1. The t-distribution is bell shaped and symmetric about
the mean.
2. The t-distribution is a family of curves, each
determined by a parameter called the degrees of
freedom. The degrees of freedom are the number of
free choices left after a sample statistic such as x̄ is
calculated. When you use a t-distribution to
estimate a population mean, the degrees of freedom
are equal to one less than the sample size.
◦ Degrees of freedom: d.f. = n – 1
Properties of the t-Distribution
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are
equal to zero.
5. As the degrees of freedom increase, the t-distribution
approaches the normal distribution. After 30 d.f., the t-distribution
is very close to the standard normal z-distribution.
The tails in the t-distribution are “thicker” than those in the standard
normal distribution.
[Figure: t-curves for d.f. = 2 and d.f. = 5 compared with the standard normal curve, all centered at t = 0.]
Example: Critical Values of t
Find the critical value tc for 95% confidence when the sample
size is 15.
Solution: d.f. = n – 1 = 15 – 1 = 14
From Table 5 (t-Distribution), tc = 2.145.
Solution: Critical Values of t
95% of the area under the t-distribution curve with 14 degrees of
freedom lies between t = ±2.145.
[Figure: t-curve with central area c = 0.95 between –tc = –2.145 and tc = 2.145.]
Confidence Intervals for the Population Mean
A c-confidence interval for the population mean μ
• $\bar{x} - E < \mu < \bar{x} + E$, where $E = t_c \frac{s}{\sqrt{n}}$
• The probability that the confidence interval contains μ is c.
Confidence Intervals and t-Distributions
In Words | In Symbols
1. Identify the sample statistics n, x̄, and s. | $\bar{x} = \frac{\sum x}{n}$, $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
2. Identify the degrees of freedom, the level of confidence c, and the critical value tc. | d.f. = n – 1
3. Find the margin of error E. | $E = t_c \frac{s}{\sqrt{n}}$
4. Find the left and right endpoints and form the confidence interval. | Left endpoint: $\bar{x} - E$; Right endpoint: $\bar{x} + E$; Interval: $\bar{x} - E < \mu < \bar{x} + E$
Example: Constructing a Confidence Interval
You randomly select 16 coffee shops and measure the temperature of the coffee
sold at each. The sample mean temperature is 162.0ºF with a sample standard
deviation of 10.0ºF. Find the 95% confidence interval for the mean temperature.
Assume the temperatures are approximately normally distributed.
Solution:
Use the t-distribution (n < 30, σ is unknown, and the
temperatures are approximately normally distributed).
Solution: Constructing a Confidence Interval
• n = 16, x̄ = 162.0, s = 10.0, c = 0.95
• d.f. = n – 1 = 16 – 1 = 15
• Critical value: from Table 5 (t-Distribution), tc = 2.131
Solution: Constructing a Confidence Interval
• Margin of error: $E = t_c \frac{s}{\sqrt{n}} = 2.131 \cdot \frac{10}{\sqrt{16}} \approx 5.3$
• Confidence interval:
Left endpoint: $\bar{x} - E = 162 - 5.3 = 156.7$
Right endpoint: $\bar{x} + E = 162 + 5.3 = 167.3$
156.7 < μ < 167.3
Solution: Constructing a Confidence Interval
156.7 < μ < 167.3
[Figure: number line with left endpoint x̄ – E = 156.7, point estimate x̄ = 162.0, and right endpoint x̄ + E = 167.3.]
With 95% confidence, you can say that the mean
temperature of coffee sold is between 156.7ºF and
167.3ºF.
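The t-based procedure can be sketched in Python as below. The Python standard library has no t-distribution, so this sketch assumes the critical value is read from a table; the dictionary `T_TABLE_95` is a hypothetical stand-in holding only the two 95% values quoted in these slides.

```python
import math

# Critical values t_c for c = 0.95, keyed by degrees of freedom.
# Only the two entries quoted in these slides are included; a real
# t-table has many more rows and columns.
T_TABLE_95 = {14: 2.145, 15: 2.131}

def t_interval_95(x_bar, s, n):
    """95% confidence interval for the mean using the t-distribution."""
    df = n - 1                   # degrees of freedom
    t_c = T_TABLE_95[df]         # raises KeyError for d.f. not listed above
    E = t_c * s / math.sqrt(n)   # margin of error
    return x_bar - E, x_bar + E, E

# Coffee temperatures: n = 16, x_bar = 162.0, s = 10.0.
low, high, E = t_interval_95(162.0, 10.0, 16)
print(round(E, 1), round(low, 1), round(high, 1))
```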
Normal or t-Distribution?
Is n ≥ 30?
• Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$. If σ is unknown, use s instead.
• No: Is the population normally, or approximately normally, distributed?
  ◦ No: Cannot use the normal distribution or the t-distribution.
  ◦ Yes: Is σ known?
    ▪ Yes: Use the normal distribution with $E = z_c \frac{\sigma}{\sqrt{n}}$.
    ▪ No: Use the t-distribution with $E = t_c \frac{s}{\sqrt{n}}$ and n – 1 degrees of freedom.
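The decision flowchart on this slide can be expressed as a small function. This is only a sketch; the function name and the returned strings are hypothetical labels.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Return which distribution the flowchart selects for the margin of error."""
    if n >= 30:
        # Large sample: use the normal distribution (s may stand in for sigma).
        return "normal"
    if not population_normal:
        # Small sample from a non-normal population: neither method applies.
        return "neither"
    # Small sample from a (nearly) normal population.
    return "normal" if sigma_known else "t"

print(choose_distribution(50, False, False))  # magazine advertisements
print(choose_distribution(20, True, True))    # student ages, sigma known
print(choose_distribution(16, True, False))   # coffee temperatures
```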
Section 6.3
CONFIDENCE INTERVALS FOR POPULATION PROPORTIONS
Point Estimate for Population p
Population Proportion
The probability of success in a single trial of a binomial experiment.
Denoted by p
Point Estimate for p
The proportion of successes in a sample.
Denoted by p̂ (read as “p hat”)
$\hat{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{number in sample}}$
Point Estimate for Population p
Estimate Population Parameter… | with Sample Statistic
Proportion: p | p̂
Point Estimate for q, the proportion of failures
Denoted by $\hat{q} = 1 - \hat{p}$
Read as “q hat”
Example: Point Estimate for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to
watch is football. Find a point estimate for the population
proportion of U.S. adults who say their favorite sport to watch is
football. (Adapted from The Harris Poll)
Solution: n = 1219 and x = 354
$\hat{p} = \frac{x}{n} = \frac{354}{1219} \approx 0.290402 \approx 29.0\%$
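A quick Python check of the point estimates, using the survey counts from the example. The variable names are illustrative.

```python
x = 354    # number of successes: adults whose favorite sport to watch is football
n = 1219   # number of adults surveyed

p_hat = x / n       # point estimate of the population proportion p
q_hat = 1 - p_hat   # point estimate of q, the proportion of failures

print(round(p_hat, 6), round(q_hat, 6))
```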
Confidence Intervals for p
A c-confidence interval for the population proportion p
• $\hat{p} - E < p < \hat{p} + E$, where $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$
• The probability that the confidence interval contains p is c.
Constructing Confidence Intervals for p
In Words | In Symbols
1. Identify the sample statistics n and x. |
2. Find the point estimate p̂. | $\hat{p} = \frac{x}{n}$
3. Verify that the sampling distribution of p̂ can be approximated by the normal distribution. | $n\hat{p} \ge 5$, $n\hat{q} \ge 5$
4. Find the critical value zc that corresponds to the given level of confidence c. | Use the Standard Normal Table.
5. Find the margin of error E. | $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}}$
6. Find the left and right endpoints and form the confidence interval. | Left endpoint: $\hat{p} - E$; Right endpoint: $\hat{p} + E$; Interval: $\hat{p} - E < p < \hat{p} + E$
Example: Confidence Interval for p
In a survey of 1219 U.S. adults, 354 said that their favorite sport to
watch is football. Construct a 95% confidence interval for the
proportion of adults in the United States who say that their
favorite sport to watch is football.
Solution: Recall $\hat{p} \approx 0.290402$
$\hat{q} = 1 - \hat{p} = 1 - 0.290402 = 0.709598$
Solution: Confidence Interval for p
Verify that the sampling distribution of p̂ can be approximated by the
normal distribution:
$n\hat{p} = 1219 \cdot 0.290402 \approx 354 > 5$
$n\hat{q} = 1219 \cdot 0.709598 \approx 865 > 5$
• Margin of error: $E = z_c \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \sqrt{\frac{(0.290402)(0.709598)}{1219}} \approx 0.025$
Solution: Confidence Interval for p
Confidence interval:
Left endpoint: $\hat{p} - E = 0.29 - 0.025 = 0.265$
Right endpoint: $\hat{p} + E = 0.29 + 0.025 = 0.315$
0.265 < p < 0.315
Solution: Confidence Interval for p
0.265 < p < 0.315
[Figure: number line with left endpoint p̂ – E = 0.265, point estimate p̂ = 0.29, and right endpoint p̂ + E = 0.315.]
With 95% confidence, you can say that the
proportion of adults who say football is their favorite
sport is between 26.5% and 31.5%.
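The full procedure for a proportion, including the normal-approximation check, can be sketched as below. The function name is hypothetical. Unlike the slides, which round p̂ to 0.29 and E to 0.025 before forming the endpoints, this sketch keeps full precision, so its endpoints may differ slightly in the third decimal place.

```python
import math

def proportion_interval(x, n, z_c):
    """Confidence interval for a population proportion p."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # The normal approximation requires n*p_hat >= 5 and n*q_hat >= 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation is not appropriate")
    E = z_c * math.sqrt(p_hat * q_hat / n)   # margin of error
    return p_hat - E, p_hat + E, E

# Survey example: 354 of 1219 adults, 95% confidence (z_c = 1.96).
low, high, E = proportion_interval(354, 1219, 1.96)
print(round(E, 3), round(low, 3), round(high, 3))
```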
Sample Size
Given a c-confidence level and a margin of error E, the
minimum sample size n needed to estimate p is
$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2$
This formula assumes you have an estimate for p̂ and q̂.
If not, use p̂ = 0.5 and q̂ = 0.5.
Example: Sample Size
You are running a political campaign and wish to estimate, with
95% confidence, the proportion of registered voters who will vote
for your candidate. Your estimate must be accurate within 3% of
the true population proportion. Find the minimum sample size needed if
1. no preliminary estimate is available.
Solution:
Because you do not have a preliminary estimate
for p̂, use p̂ = 0.5 and q̂ = 0.5.
Solution: Sample Size
c = 0.95, zc = 1.96, E = 0.03
$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.5)(0.5) \left( \frac{1.96}{0.03} \right)^2 \approx 1067.11$
Round up to the nearest whole number.
With no preliminary estimate, the minimum sample
size should be at least 1068 voters.
Example: Sample Size
You are running a political campaign and wish to
estimate, with 95% confidence, the proportion of
registered voters who will vote for your candidate. Your
estimate must be accurate within 3% of the true
population proportion. Find the minimum sample size needed if
2. a preliminary estimate gives p̂ = 0.31.
Solution:
Use the preliminary estimate p̂ = 0.31.
$\hat{q} = 1 - \hat{p} = 1 - 0.31 = 0.69$
Solution: Sample Size
c = 0.95, zc = 1.96, E = 0.03
$n = \hat{p}\hat{q} \left( \frac{z_c}{E} \right)^2 = (0.31)(0.69) \left( \frac{1.96}{0.03} \right)^2 \approx 913.02$
Round up to the nearest whole number.
With a preliminary estimate of p̂ = 0.31, the
minimum sample size should be at least 914 voters.
You need a larger sample size if no preliminary estimate
is available.
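Both cases of the voter example can be reproduced with one small function. This is a sketch with a hypothetical function name; the default `p_hat=0.5` mirrors the rule for when no preliminary estimate is available.

```python
import math

def min_sample_size_proportion(z_c, E, p_hat=0.5):
    """Smallest whole n with n >= p_hat * q_hat * (z_c / E) ** 2."""
    q_hat = 1 - p_hat
    return math.ceil(p_hat * q_hat * (z_c / E) ** 2)

n_no_estimate = min_sample_size_proportion(1.96, 0.03)          # p_hat = 0.5
n_with_estimate = min_sample_size_proportion(1.96, 0.03, 0.31)  # p_hat = 0.31
print(n_no_estimate, n_with_estimate)
```

The product p̂q̂ is largest at p̂ = 0.5, which is why the case with no preliminary estimate gives the larger sample size.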
Section 6.4
CONFIDENCE INTERVALS FOR VARIANCE AND STANDARD DEVIATION
The Chi-Square Distribution
The point estimate for σ² is s².
The point estimate for σ is s.
s² is the most unbiased estimate for σ².
Estimate Population Parameter… | with Sample Statistic
Variance: σ² | s²
Standard deviation: σ | s
The Chi-Square Distribution
You can use the chi-square distribution to construct a confidence interval for the
variance and standard deviation.
If the random variable x has a normal distribution, then the distribution of
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
forms a chi-square distribution for samples of any size n > 1.
Properties of The Chi-Square Distribution
1. All chi-square values χ2 are greater than or equal to zero.
2. The chi-square distribution is a family of curves, each determined by
the degrees of freedom. To form a confidence interval for σ², use the
χ²-distribution with degrees of freedom equal to one less than the
sample size.
• Degrees of freedom: d.f. = n – 1
3. The area under each curve of the chi-square distribution equals one.
Properties of The Chi-Square Distribution
4. Chi-square distributions are positively skewed.
[Figure: chi-square distributions.]
Critical Values for χ²
There are two critical values for each level of confidence.
The value $\chi^2_R$ represents the right-tail critical value.
The value $\chi^2_L$ represents the left-tail critical value.
The area between the left and right critical values is c.
[Figure: chi-square curve with area c between $\chi^2_L$ and $\chi^2_R$ and area $\frac{1 - c}{2}$ in each tail.]
Example: Finding Critical Values for χ²
Find the critical values $\chi^2_R$ and $\chi^2_L$ for a 90% confidence
interval when the sample size is 20.
Solution:
• d.f. = n – 1 = 20 – 1 = 19
• Each area in the table represents the region under the
chi-square curve to the right of the critical value.
• Area to the right of $\chi^2_R$: $\frac{1 - c}{2} = \frac{1 - 0.90}{2} = 0.05$
• Area to the right of $\chi^2_L$: $\frac{1 + c}{2} = \frac{1 + 0.90}{2} = 0.95$
Solution: Finding Critical Values for χ²
From Table 6 (χ²-Distribution), $\chi^2_L = 10.117$ and $\chi^2_R = 30.144$.
90% of the area under the curve lies between 10.117 and
30.144.
Confidence Intervals for σ² and σ
Confidence Interval for σ²:
• $\frac{(n - 1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_L}$
Confidence Interval for σ:
• $\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$
• The probability that the confidence intervals contain
σ² or σ is c.
Confidence Intervals for σ² and σ
In Words | In Symbols
1. Verify that the population has a normal distribution. |
2. Identify the sample statistic n and the degrees of freedom. | d.f. = n – 1
3. Find the point estimate s². | $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
4. Find the critical values $\chi^2_R$ and $\chi^2_L$ that correspond to the given level of confidence c. | Use Table 6 in Appendix B.
5. Find the left and right endpoints and form the confidence interval for the population variance. | $\frac{(n - 1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_L}$
6. Find the confidence interval for the population standard deviation by taking the square root of each endpoint. | $\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$
Example: Constructing a Confidence Interval
You randomly select and weigh 30 samples of an allergy medicine.
The sample standard deviation is 1.20 milligrams. Assuming the
weights are normally distributed, construct 99% confidence
intervals for the population variance and standard deviation.
Solution:
• d.f. = n – 1 = 30 – 1 = 29 d.f.
Solution: Constructing a Confidence Interval
• Area to the right of $\chi^2_R$: $\frac{1 - c}{2} = \frac{1 - 0.99}{2} = 0.005$
• Area to the right of $\chi^2_L$: $\frac{1 + c}{2} = \frac{1 + 0.99}{2} = 0.995$
• The critical values are $\chi^2_R = 52.336$ and $\chi^2_L = 13.121$.
Solution: Constructing a Confidence Interval
Confidence Interval for σ²:
Left endpoint: $\frac{(n - 1)s^2}{\chi^2_R} = \frac{(30 - 1)(1.20)^2}{52.336} \approx 0.80$
Right endpoint: $\frac{(n - 1)s^2}{\chi^2_L} = \frac{(30 - 1)(1.20)^2}{13.121} \approx 3.18$
0.80 < σ² < 3.18
With 99% confidence you can say that the population
variance is between 0.80 and 3.18 (in squared milligrams).
Solution: Constructing a Confidence Interval
Confidence Interval for σ:
$\sqrt{\frac{(n - 1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1)s^2}{\chi^2_L}}$
$\sqrt{\frac{(30 - 1)(1.20)^2}{52.336}} < \sigma < \sqrt{\frac{(30 - 1)(1.20)^2}{13.121}}$
0.89 < σ < 1.78
With 99% confidence you can say that the population
standard deviation is between 0.89 and 1.78 milligrams.
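The allergy medicine example can be reproduced with the sketch below. The Python standard library has no chi-square distribution, so the critical values are assumed to be read from the table and passed in as arguments; the function name is hypothetical.

```python
import math

def variance_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for the population variance, given table critical values."""
    numerator = (n - 1) * s ** 2   # (n - 1) * s^2
    # Dividing by the larger (right-tail) critical value gives the left endpoint.
    return numerator / chi2_right, numerator / chi2_left

# n = 30, s = 1.20 mg, 99% confidence: chi2_R = 52.336, chi2_L = 13.121.
var_low, var_high = variance_interval(30, 1.20, 52.336, 13.121)

# The interval for the standard deviation is the square root of each endpoint.
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(round(var_low, 2), round(var_high, 2))
print(round(sd_low, 2), round(sd_high, 2))
```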