Unit 6
Confidence Intervals
If you arrive late (or leave early), please do not
announce it to everyone, as we get sidetracked;
send me an email instead.
Point Estimate for Population μ
Point Estimate
• A single value estimate for a population parameter
• The most unbiased point estimate of the population mean
μ is the sample mean x̄.
Estimate Population Parameter…    with Sample Statistic
Mean: μ                           x̄
Example: Point Estimate for Population μ
Market researchers use the number of sentences per
advertisement as a measure of readability for magazine
advertisements. The following represents a random
sample of the number of sentences found in 50
advertisements. Find a point estimate of the
population mean, . (Source: Journal of Advertising
Research)
9 20 18 16 9 9 11 13 22 16 5 18 6 6 5 12 25
17 23 7 10 9 10 10 5 11 18 18 9 9 17 13 11 7
14 6 11 12 11 6 12 14 11 9 18 12 12 17 11 20
Solution: Point Estimate for Population μ
The sample mean of the data is
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{620}{50} = 12.4$
Your point estimate for the mean length of all magazine
advertisements is 12.4 sentences.
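As a quick check of the arithmetic above, the point estimate can be recomputed in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and are not part of the original slides.

```python
# Number of sentences in each of the 50 sampled advertisements (from the example).
sentences = [
    9, 20, 18, 16, 9, 9, 11, 13, 22, 16, 5, 18, 6, 6, 5, 12, 25,
    17, 23, 7, 10, 9, 10, 10, 5, 11, 18, 18, 9, 9, 17, 13, 11, 7,
    14, 6, 11, 12, 11, 6, 12, 14, 11, 9, 18, 12, 12, 17, 11, 20,
]

n = len(sentences)       # sample size
total = sum(sentences)   # sum of all observations
x_bar = total / n        # sample mean, the point estimate of mu

print(n, total, x_bar)
```

The three printed values should match the n, Σx, and x̄ used in the solution.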
Interval Estimate
Interval estimate
• An interval, or range of values, used to estimate a
population parameter.
[Figure: number line with the point estimate 12.4 marked as a dot inside a parenthesized interval estimate.]
How confident do we want to be that the interval estimate
contains the population mean μ?
Level of Confidence
Level of confidence c
• The probability that the interval estimate contains the
population parameter.
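• c is the area under the standard normal curve between the critical values -zc and zc.
• The remaining area in the tails is 1 – c, so each tail has area ½(1 – c).
• Use the Standard Normal Table to find the corresponding z-scores.
[Figure: standard normal curve centered at z = 0, with area c between the critical values -zc and zc and area ½(1 – c) in each tail.]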
Level of Confidence
• If the level of confidence is 90%, this means that we
are 90% confident that the interval contains the
population mean μ.
[Figure: standard normal curve with c = 0.90 between -zc = -1.645 and zc = 1.645, and ½(1 – c) = 0.05 in each tail.]
The corresponding z-scores are ±1.645.
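The critical values read from the Standard Normal Table can also be computed directly. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is hypothetical and is not from the slides.

```python
from statistics import NormalDist

def critical_z(c):
    """Return z_c such that the area between -z_c and z_c is c."""
    tail = (1 - c) / 2                      # area in each tail, (1 - c) / 2
    return NormalDist().inv_cdf(1 - tail)   # z-score with area 1 - tail to its left

for c in (0.90, 0.95, 0.99):
    print(c, round(critical_z(c), 3))
```

Rounded to table precision, the results for c = 0.90 and c = 0.95 should agree with the 1.645 and 1.96 used in these slides.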
Sampling Error
Sampling error
• The difference between the point estimate and the
actual population parameter value.
• For μ:
  - the sampling error is the difference x̄ – μ
  - μ is generally unknown
  - x̄ varies from sample to sample
Margin of Error
Margin of error
• The greatest possible distance between the point
estimate and the value of the parameter it is
estimating for a given level of confidence, c.
• Denoted by E.
$E = z_c \sigma_{\bar{x}} = z_c \dfrac{\sigma}{\sqrt{n}}$
When n ≥ 30, the sample standard deviation, s, can be used for σ.
• Sometimes called the maximum error of estimate or
error tolerance.
Example: Finding the Margin of Error
Use the magazine advertisement data and a 95%
confidence level to find the margin of error for the
mean number of sentences in all magazine
advertisements. Assume the sample standard deviation
is about 5.0.
Solution: Finding the Margin of Error
• First find the critical values
[Figure: standard normal curve with area 0.95 between -zc = -1.96 and zc = 1.96, and area 0.025 in each tail.]
95% of the area under the standard normal curve falls
within 1.96 standard deviations of the mean. (You
can approximate the distribution of the sample means
with a normal curve by the Central Limit Theorem,
because n ≥ 30.)
Solution: Finding the Margin of Error
$E = z_c \dfrac{\sigma}{\sqrt{n}} \approx z_c \dfrac{s}{\sqrt{n}} = 1.96 \cdot \dfrac{5.0}{\sqrt{50}} \approx 1.4$
You don’t know σ, but since n ≥ 30, you can use s in place of σ.
You are 95% confident that the margin of error for the
population mean is about 1.4 sentences.
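The margin-of-error calculation can be reproduced with a short function. This is a sketch under the slide's assumptions (n ≥ 30, so s stands in for σ); the function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(z_c, sigma, n):
    """E = z_c * sigma / sqrt(n); pass s for sigma when n >= 30 and sigma is unknown."""
    return z_c * sigma / math.sqrt(n)

# Values from the magazine advertisement example: z_c = 1.96, s = 5.0, n = 50.
E = margin_of_error(z_c=1.96, sigma=5.0, n=50)
print(round(E, 1))
```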
Confidence Intervals for the Population
Mean
A c-confidence interval for the population mean μ
• $\bar{x} - E < \mu < \bar{x} + E$, where $E = z_c \dfrac{\sigma}{\sqrt{n}}$
• The probability that the confidence interval contains μ
is c.
Constructing Confidence Intervals for μ
Finding a Confidence Interval for a Population Mean
(n ≥ 30, or σ known with a normally distributed population)
1. Find the sample statistics n and x̄.
   In symbols: $\bar{x} = \dfrac{\sum x}{n}$
2. Specify σ, if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ.
   In symbols: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Constructing Confidence Intervals for μ
3. Find the critical value zc that corresponds to the given level of confidence.
   In symbols: use the Standard Normal Table.
4. Find the margin of error E.
   In symbols: $E = z_c \dfrac{\sigma}{\sqrt{n}}$
5. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint $\bar{x} - E$; right endpoint $\bar{x} + E$; interval $\bar{x} - E < \mu < \bar{x} + E$
Example: Constructing a Confidence
Interval
Construct a 95% confidence interval for the mean
number of sentences in all magazine advertisements.
Solution: Recall x̄ = 12.4 and E = 1.4.
Left endpoint: x̄ – E = 12.4 – 1.4 = 11.0
Right endpoint: x̄ + E = 12.4 + 1.4 = 13.8
11.0 < μ < 13.8
Solution: Constructing a Confidence
Interval
11.0 < μ < 13.8
[Figure: number line with the point estimate 12.4 marked between the endpoints 11.0 and 13.8.]
With 95% confidence, you can say that the population
mean number of sentences is between 11.0 and 13.8.
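The five construction steps can be collected into one function. The sketch below assumes the standard-library `statistics.NormalDist` for the critical value rather than a printed table; the name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, c):
    """c-confidence interval for mu using the normal distribution."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for confidence level c
    E = z_c * sigma / math.sqrt(n)            # margin of error
    return x_bar - E, x_bar + E               # left and right endpoints

# Magazine advertisement example: x_bar = 12.4, s = 5.0 (used for sigma), n = 50.
left, right = z_interval(x_bar=12.4, sigma=5.0, n=50, c=0.95)
print(round(left, 1), round(right, 1))
```

Rounded to one decimal place, the endpoints should reproduce the interval 11.0 < μ < 13.8.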
Example: Constructing a Confidence
Interval σ Known
A college admissions director wishes to estimate the
mean age of all students currently enrolled. In a random
sample of 20 students, the mean age is found to be 22.9
years. From past studies, the standard deviation is
known to be 1.5 years, and the population is normally
distributed. Construct a 90% confidence interval of the
population mean age.
Solution: Constructing a Confidence
Interval σ Known
• First find the critical values
[Figure: standard normal curve with c = 0.90 between -zc = -1.645 and zc = 1.645, and ½(1 – c) = 0.05 in each tail.]
zc = 1.645
Solution: Constructing a Confidence
Interval σ Known
• Margin of error:
$E = z_c \dfrac{\sigma}{\sqrt{n}} = 1.645 \cdot \dfrac{1.5}{\sqrt{20}} \approx 0.6$
• Confidence interval:
Left endpoint: x̄ – E = 22.9 – 0.6 = 22.3
Right endpoint: x̄ + E = 22.9 + 0.6 = 23.5
22.3 < μ < 23.5
Solution: Constructing a Confidence
Interval σ Known
22.3 < μ < 23.5
[Figure: number line with the point estimate x̄ = 22.9 marked between the left endpoint x̄ – E = 22.3 and the right endpoint x̄ + E = 23.5.]
With 90% confidence, you can say that the mean age
of all the students is between 22.3 and 23.5 years.
Interpreting the Results
• μ is a fixed number. It is either in the confidence
interval or not.
• Incorrect: “There is a 90% probability that the actual
mean is in the interval (22.3, 23.5).”
• Correct: “If a large number of samples is collected
and a confidence interval is created for each sample,
approximately 90% of these intervals will contain μ.”
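The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch only: the "true" mean of 22.9 is a made-up value chosen for the demonstration (in practice μ is unknown), and the function name `coverage` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def coverage(mu, sigma, n, c, trials, seed=0):
    """Fraction of simulated c-confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * sigma / math.sqrt(n)            # sigma is treated as known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - E < mu < x_bar + E:        # does this interval contain mu?
            hits += 1
    return hits / trials

# Hypothetical population: normal with mu = 22.9 and sigma = 1.5; samples of size 20.
rate = coverage(mu=22.9, sigma=1.5, n=20, c=0.90, trials=5000)
print(rate)
```

Each individual interval either contains μ or does not; it is the long-run fraction of intervals that should be close to 0.90.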
Sample Size
• Given a c-confidence level and a margin of error E,
the minimum sample size n needed to estimate the
population mean μ is
$n = \left( \dfrac{z_c \sigma}{E} \right)^2$
• If σ is unknown, you can estimate it using s, provided
you have a preliminary sample with at least 30
members.
Example: Sample Size
You want to estimate the mean number of sentences in a
magazine advertisement. How many magazine
advertisements must be included in the sample if you
want to be 95% confident that the sample mean is
within one sentence of the population mean? Assume
the sample standard deviation is about 5.0.
Solution: Sample Size
• First find the critical values
[Figure: standard normal curve with area 0.95 between -zc = -1.96 and zc = 1.96, and area 0.025 in each tail.]
zc = 1.96
Solution: Sample Size
zc = 1.96, σ ≈ s = 5.0, E = 1
$n = \left( \dfrac{z_c \sigma}{E} \right)^2 \approx \left( \dfrac{1.96 \cdot 5.0}{1} \right)^2 = 96.04$
When necessary, round up to obtain a whole number.
You should include at least 97 magazine advertisements
in your sample.
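The sample-size formula, including the round-up step, can be sketched as follows. The function name `sample_size_for_mean` is illustrative, and the table value zc = 1.96 is passed in explicitly to match the slide.

```python
import math

def sample_size_for_mean(z_c, sigma, E):
    """Minimum n so that the margin of error is at most E; always rounds up."""
    return math.ceil((z_c * sigma / E) ** 2)

# Values from the example: z_c = 1.96, s = 5.0 (used for sigma), E = 1 sentence.
n = sample_size_for_mean(z_c=1.96, sigma=5.0, E=1.0)
print(n)
```

`math.ceil` is used instead of ordinary rounding because rounding 96.04 down to 96 would give a margin of error slightly larger than the one requested.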
Section 6.2 Objectives
• Interpret the t-distribution and use a t-distribution
table
• Construct confidence intervals when n < 30, the
population is normally distributed, and σ is unknown
The t-Distribution
• When the population standard deviation is unknown,
the sample size is less than 30, and the random
variable x is approximately normally distributed, the statistic
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
follows a t-distribution.
• Critical values of t are denoted by tc.
Properties of the t-Distribution
1. The t-distribution is bell shaped and symmetric
about the mean.
2. The t-distribution is a family of curves, each
determined by a parameter called the degrees of
freedom. The degrees of freedom are the number
of free choices left after a sample statistic such as x̄
is calculated. When you use a t-distribution to
estimate a population mean, the degrees of freedom
are equal to one less than the sample size.
d.f. = n – 1
Properties of the t-Distribution
3. The total area under a t-curve is 1 or 100%.
4. The mean, median, and mode of the t-distribution are
equal to zero.
5. As the degrees of freedom increase, the t-distribution
approaches the normal distribution. After 30 d.f., the t-distribution is very close to the standard normal z-distribution.
The tails in the t-distribution are “thicker” than those in the standard normal distribution.
[Figure: t-curves for d.f. = 2 and d.f. = 5 plotted with the standard normal curve, all centered at t = 0.]
Example: Critical Values of t
Find the critical value tc for a 95% confidence level when the
sample size is 15.
Solution: d.f. = n – 1 = 15 – 1 = 14
Table 5: t-Distribution
tc = 2.145
Solution: Critical Values of t
95% of the area under the t-distribution curve with 14
degrees of freedom lies between t = ±2.145.
[Figure: t-curve with area c = 0.95 between -tc = -2.145 and tc = 2.145.]
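The slides read tc from a printed t-table. As an illustration of what that table encodes, the sketch below finds tc numerically using only the standard library: it integrates the t density with Simpson's rule and searches for the cutoff by bisection. This is a swapped-in numerical approach, not the method used in the slides, and the function names are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + t * t / df) ** (-(df + 1) / 2)

def central_area(t_c, df, steps=2000):
    """Area under the t-curve between -t_c and t_c (Simpson's rule, steps is even)."""
    h = t_c / steps
    total = t_pdf(0.0, df) + t_pdf(t_c, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3   # doubled because the curve is symmetric about 0

def critical_t(c, df):
    """Find t_c by bisection so that the central area equals c."""
    lo, hi = 0.0, 100.0        # assumes t_c is below 100, true for common c and df
    for _ in range(50):
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(critical_t(0.95, 14), 3))
```

For c = 0.95 and 14 degrees of freedom the result should agree with the table value used above to three decimal places.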
Confidence Intervals for the Population
Mean
A c-confidence interval for the population mean μ
• $\bar{x} - E < \mu < \bar{x} + E$, where $E = t_c \dfrac{s}{\sqrt{n}}$
• The probability that the confidence interval contains μ
is c.
Confidence Intervals and t-Distributions
1. Identify the sample statistics n, x̄, and s.
   In symbols: $\bar{x} = \dfrac{\sum x}{n}$, $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
2. Identify the degrees of freedom, the level of confidence c, and the critical value tc.
   In symbols: d.f. = n – 1
3. Find the margin of error E.
   In symbols: $E = t_c \dfrac{s}{\sqrt{n}}$
Confidence Intervals and t-Distributions
4. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint $\bar{x} - E$; right endpoint $\bar{x} + E$; interval $\bar{x} - E < \mu < \bar{x} + E$
Example: Constructing a Confidence
Interval
You randomly select 16 coffee shops and measure the
temperature of the coffee sold at each. The sample mean
temperature is 162.0ºF with a sample standard deviation
of 10.0ºF. Find the 95% confidence interval for the
mean temperature. Assume the temperatures are
approximately normally distributed.
Solution:
Use the t-distribution (n < 30, σ is unknown, and the
temperatures are approximately normally distributed).
Solution: Constructing a Confidence
Interval
• n = 16, x̄ = 162.0, s = 10.0, c = 0.95
• df = n – 1 = 16 – 1 = 15
• Critical Value Table 5: t-Distribution
tc = 2.131
Solution: Constructing a Confidence
Interval
• Margin of error:
$E = t_c \dfrac{s}{\sqrt{n}} = 2.131 \cdot \dfrac{10}{\sqrt{16}} \approx 5.3$
• Confidence interval:
Left endpoint: x̄ – E = 162 – 5.3 = 156.7
Right endpoint: x̄ + E = 162 + 5.3 = 167.3
156.7 < μ < 167.3
Solution: Constructing a Confidence
Interval
• 156.7 < μ < 167.3
[Figure: number line with the point estimate x̄ = 162.0 marked between the left endpoint x̄ – E = 156.7 and the right endpoint x̄ + E = 167.3.]
With 95% confidence, you can say that the mean
temperature of coffee sold is between 156.7ºF and
167.3ºF.
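The coffee-shop interval can be recomputed with a short function that takes the table value of tc as an input. This is a minimal sketch; the name `t_interval` is illustrative.

```python
import math

def t_interval(x_bar, s, n, t_c):
    """Confidence interval for mu using a t critical value with n - 1 d.f."""
    E = t_c * s / math.sqrt(n)    # margin of error
    return x_bar - E, x_bar + E   # left and right endpoints

# Coffee example: x_bar = 162.0, s = 10.0, n = 16, t_c = 2.131 (d.f. = 15, c = 0.95).
left, right = t_interval(x_bar=162.0, s=10.0, n=16, t_c=2.131)
print(round(left, 1), round(right, 1))
```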
Normal or t-Distribution?
Is n ≥ 30?
• Yes: Use the normal distribution with $E = z_c \dfrac{\sigma}{\sqrt{n}}$. If σ is unknown, use s instead.
• No: Is the population normally, or approximately normally, distributed?
  - No: Cannot use the normal distribution or the t-distribution.
  - Yes: Is σ known?
    - Yes: Use the normal distribution with $E = z_c \dfrac{\sigma}{\sqrt{n}}$.
    - No: Use the t-distribution with $E = t_c \dfrac{s}{\sqrt{n}}$ and n – 1 degrees of freedom.
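The decision flow above can be written as a small function. This is a sketch of the flowchart's logic only; the function name and the returned labels are hypothetical.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Return which distribution the flowchart recommends for estimating mu."""
    if n >= 30:
        return "normal"    # use s in place of sigma if sigma is unknown
    if not population_normal:
        return "neither"   # small sample from a non-normal population
    return "normal" if sigma_known else "t"   # t uses n - 1 degrees of freedom

print(choose_distribution(n=50, population_normal=False, sigma_known=False))
print(choose_distribution(n=20, population_normal=True, sigma_known=True))
print(choose_distribution(n=16, population_normal=True, sigma_known=False))
```

The three calls correspond to the magazine advertisement, student age, and coffee temperature examples, respectively.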
Section 6.3
Point Estimate for Population p
Population Proportion
• The probability of success in a single trial of a
binomial experiment.
• Denoted by p
Point Estimate for p
• The proportion of successes in a sample.
• Denoted by p̂, read as “p hat”:
$\hat{p} = \dfrac{x}{n} = \dfrac{\text{number of successes in sample}}{\text{number in sample}}$
Point Estimate for Population p
Estimate Population Parameter…    with Sample Statistic
Proportion: p                     p̂
Point Estimate for q, the proportion of failures
• Denoted by q̂ = 1 – p̂
• Read as “q hat”
Example: Point Estimate for p
In a survey of 1219 U.S. adults, 354 said that their
favorite sport to watch is football. Find a point estimate
for the population proportion of U.S. adults who say
their favorite sport to watch is football. (Adapted from The
Harris Poll)
Solution: n = 1219 and x = 354
$\hat{p} = \dfrac{x}{n} = \dfrac{354}{1219} \approx 0.290402 \approx 29.0\%$
Confidence Intervals for p
A c-confidence interval for the population proportion p
• $\hat{p} - E < p < \hat{p} + E$, where $E = z_c \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
• The probability that the confidence interval contains p is
c.
Constructing Confidence Intervals for p
1. Identify the sample statistics n and x.
2. Find the point estimate p̂.
   In symbols: $\hat{p} = \dfrac{x}{n}$
3. Verify that the sampling distribution of p̂ can be approximated by the normal distribution.
   In symbols: $n\hat{p} \ge 5$, $n\hat{q} \ge 5$
4. Find the critical value zc that corresponds to the given level of confidence c.
   In symbols: use the Standard Normal Table.
Constructing Confidence Intervals for p
5. Find the margin of error E.
   In symbols: $E = z_c \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
6. Find the left and right endpoints and form the confidence interval.
   In symbols: left endpoint $\hat{p} - E$; right endpoint $\hat{p} + E$; interval $\hat{p} - E < p < \hat{p} + E$
Example: Confidence Interval for p
In a survey of 1219 U.S. adults, 354 said that their
favorite sport to watch is football. Construct a 95%
confidence interval for the proportion of adults in the
United States who say that their favorite sport to watch
is football.
Solution: Recall p̂ ≈ 0.290402.
q̂ = 1 – p̂ = 1 – 0.290402 = 0.709598
Solution: Confidence Interval for p
• Verify the sampling distribution of p̂ can be
approximated by the normal distribution:
$n\hat{p} = 1219 \cdot 0.290402 \approx 354 > 5$
$n\hat{q} = 1219 \cdot 0.709598 \approx 865 > 5$
• Margin of error:
$E = z_c \sqrt{\dfrac{\hat{p}\hat{q}}{n}} \approx 1.96 \sqrt{\dfrac{(0.290402)(0.709598)}{1219}} \approx 0.025$
Solution: Confidence Interval for p
• Confidence interval:
Left endpoint: p̂ – E ≈ 0.29 – 0.025 = 0.265
Right endpoint: p̂ + E ≈ 0.29 + 0.025 = 0.315
0.265 < p < 0.315
Solution: Confidence Interval for p
• 0.265 < p < 0.315
[Figure: number line with the point estimate p̂ = 0.29 marked between the left endpoint p̂ – E = 0.265 and the right endpoint p̂ + E = 0.315.]
With 95% confidence, you can say that the proportion
of adults who say football is their favorite sport is
between 26.5% and 31.5%.
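The steps for a proportion interval, including the check that the normal approximation is reasonable, can be sketched as below. The function name `proportion_interval` is hypothetical, and the table value zc = 1.96 is passed in explicitly.

```python
import math

def proportion_interval(x, n, z_c):
    """Return (p_hat, E, (left, right)) for a confidence interval for p."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # The normal approximation is only used when both counts are at least 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    E = z_c * math.sqrt(p_hat * q_hat / n)   # margin of error
    return p_hat, E, (p_hat - E, p_hat + E)

# Football survey: x = 354 successes out of n = 1219, 95% confidence.
p_hat, E, (left, right) = proportion_interval(x=354, n=1219, z_c=1.96)
print(round(p_hat, 6), round(E, 3), round(left, 3), round(right, 3))
```

Because this sketch does not round p̂ and E before adding, its upper endpoint comes out near 0.316 rather than the 0.315 shown in the worked solution, which rounds p̂ to 0.29 and E to 0.025 first.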
Sample Size
• Given a c-confidence level and a margin of error E,
the minimum sample size n needed to estimate p is
$n = \hat{p}\hat{q} \left( \dfrac{z_c}{E} \right)^2$
• This formula assumes you have an estimate for p̂ and q̂.
• If not, use p̂ = 0.5 and q̂ = 0.5.
Example: Sample Size
You are running a political campaign and wish to
estimate, with 95% confidence, the proportion of
registered voters who will vote for your candidate. Your
estimate must be accurate within 3% of the true
population proportion. Find the minimum sample size needed if
1. no preliminary estimate is available.
Solution:
Because you do not have a preliminary estimate
for p̂, use p̂ = 0.5 and q̂ = 0.5.
Solution: Sample Size
• c = 0.95, zc = 1.96, E = 0.03
$n = \hat{p}\hat{q} \left( \dfrac{z_c}{E} \right)^2 = (0.5)(0.5) \left( \dfrac{1.96}{0.03} \right)^2 \approx 1067.11$
Round up to the nearest whole number.
With no preliminary estimate, the minimum sample
size should be at least 1068 voters.
Example: Sample Size
You are running a political campaign and wish to
estimate, with 95% confidence, the proportion of
registered voters who will vote for your candidate. Your
estimate must be accurate within 3% of the true
population proportion. Find the minimum sample size needed if
2. a preliminary estimate gives p̂ = 0.31.
Solution:
Use the preliminary estimate p̂ = 0.31.
q̂ = 1 – p̂ = 1 – 0.31 = 0.69
Solution: Sample Size
• c = 0.95, zc = 1.96, E = 0.03
$n = \hat{p}\hat{q} \left( \dfrac{z_c}{E} \right)^2 = (0.31)(0.69) \left( \dfrac{1.96}{0.03} \right)^2 \approx 913.02$
Round up to the nearest whole number.
With a preliminary estimate of p̂ = 0.31, the
minimum sample size should be at least 914 voters.
You need a larger sample size if no preliminary estimate
is available.
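Both parts of the voter example can be reproduced with one function. This is a sketch; the name `sample_size_for_proportion` is illustrative, and p̂ defaults to 0.5 for the case with no preliminary estimate.

```python
import math

def sample_size_for_proportion(z_c, E, p_hat=0.5):
    """Minimum n to estimate p within E; p_hat = 0.5 is the most conservative choice."""
    q_hat = 1 - p_hat
    return math.ceil(p_hat * q_hat * (z_c / E) ** 2)

# Part 1: no preliminary estimate, so p_hat = q_hat = 0.5.
print(sample_size_for_proportion(z_c=1.96, E=0.03))
# Part 2: preliminary estimate p_hat = 0.31.
print(sample_size_for_proportion(z_c=1.96, E=0.03, p_hat=0.31))
```

The product p̂q̂ is largest at p̂ = 0.5, which is why the first case requires the larger sample. The table value zc = 1.96 is used on purpose: in part 2 the unrounded result (913.02) sits so close to a whole number that a more precise critical value could round up to 913 instead of 914.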
• End, any questions?