Week 6 - Seminar

Unit 6: Confidence Intervals
Elementary Statistics
Larson
Farber
Ch. 6 Larson/Farber
Definition Review
Big Picture – Confidence Intervals
A group of college students collected data on the speed of vehicles traveling through a
construction zone on a state highway, where the posted speed was 25 mph. Assume
that the standard deviation for the recorded speed of the vehicles is 3.5 mph. The
recorded speed of 14 randomly selected vehicles is as follows:
20, 24, 27, 28, 29, 30, 32, 33, 34, 36, 38, 39, 40, 40
Assuming speeds are approximately normally distributed, what do you think the
true mean speed of drivers in this construction zone is?
Sample: x̄ = 32.14, sx = 6.18, n = 14
Population: μ = ?, σ = 3.5
Construction of a Confidence Interval
The construction of a confidence interval
for the population mean depends upon
three factors:
– The point estimate of the population mean
– The level of confidence
– The standard deviation of the sample mean
Section 6.1
Confidence Intervals for the Mean (large samples)
Point Estimate
DEFINITION:
A point estimate is a single value
estimate for a population parameter.
The best point estimate of the
population mean μ
is the sample mean x̄.
Part I: Point Estimate
A random sample of 35 airfare prices (in dollars) for a one-way ticket from
Atlanta to Chicago is shown below. Find a point estimate for the population
mean, μ.
 99 101 107 102 109  98 105
103 101 105  98 107 104  96
105  95  98  94 100 104 111
114  87 104 108 101  87 103
106 117  94 103 101 105  90
The sample mean is x̄ = Σx / n = 3562 / 35 ≈ 101.77.
The point estimate for the price of all one-way
tickets from Atlanta to Chicago is $101.77.
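The point estimate above can be checked with a short script. This is a minimal sketch: the list `fares` copies the 35 values from the slide, and the function name `point_estimate` is only an illustrative choice.

```python
from statistics import mean

# Airfare sample from the slide (35 one-way fares, in dollars).
fares = [
    99, 101, 107, 102, 109, 98, 105,
    103, 101, 105, 98, 107, 104, 96,
    105, 95, 98, 94, 100, 104, 111,
    114, 87, 104, 108, 101, 87, 103,
    106, 117, 94, 103, 101, 105, 90,
]

def point_estimate(sample):
    """Return the sample mean, used as the point estimate of the population mean."""
    return mean(sample)

x_bar = point_estimate(fares)
print(f"n = {len(fares)}, x-bar = {x_bar:.2f}")  # n = 35, x-bar = 101.77
```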
Interval Estimates
Point estimate: x̄ = 101.77
An interval estimate is an interval or range of
values used to estimate a population parameter,
for example an interval centered on the point estimate 101.77.
The level of confidence, c, is the probability that
the interval estimate contains the population
parameter.
Distribution of Sample Means
When the sample size is at least 30, the sampling
distribution of x̄ is approximately normal.
[Figure: sampling distribution of x̄ for c = 0.95. The area between z = -1.96 and z = 1.96 is 0.95, with 0.025 in each tail.]
95% of all sample means will have standard
scores between z = -1.96 and z = 1.96.
Solution: Finding the Margin of Error
[Figure: standard normal curve with area 0.95 between -zc = -1.96 and zc = 1.96, and 0.025 in each tail.]
95% of the area under the standard normal
curve falls within 1.96 standard deviations of the
mean.
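The critical value zc for a given level of confidence can also be computed instead of read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_z` is an illustrative assumption, not something from the slides.

```python
from statistics import NormalDist

def critical_z(c):
    """Return z_c such that the area between -z_c and z_c under the
    standard normal curve equals the confidence level c."""
    tail = (1 - c) / 2                      # area left in each tail
    return NormalDist().inv_cdf(1 - tail)   # z with area (1 - tail) to its left

print(round(critical_z(0.95), 2))  # 1.96
```

Printed tables round these values, so a computed zc can differ from a table entry in the third decimal place.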
Maximum Error of Estimate
The maximum error of estimate E is the greatest
possible distance between the point estimate and
the value of the parameter it is estimating for a
given level of confidence, c:
E = zc · σ / √n
When n is at least 30, the sample standard deviation,
s, can be used for σ.
Part 2: Maximum Error of Estimate
A random sample of 35 airfare prices (in dollars) for a one-way ticket from
Atlanta to Chicago:
 99 101 107 102 109  98 105
103 101 105  98 107 104  96
105  95  98  94 100 104 111
114  87 104 108 101  87 103
106 117  94 103 101 105  90
Find E, the maximum error of estimate for the one-way plane
fare from Atlanta to Chicago for a 95% level of confidence
given s = 6.69.
Maximum Error of Estimate
s = 6.69
n = 35
Using zc = 1.96,
E = zc · s / √n = 1.96 · 6.69 / √35 ≈ 2.22
You are 95% confident that the maximum error of estimate is $2.22.
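The arithmetic for E can be reproduced in a few lines. This is a sketch that takes zc = 1.96, s = 6.69 and n = 35 from the slide; `margin_of_error` is a hypothetical helper name.

```python
from math import sqrt

def margin_of_error(z_c, s, n):
    """Maximum error of estimate E = z_c * s / sqrt(n) for a large sample."""
    return z_c * s / sqrt(n)

E = margin_of_error(1.96, 6.69, 35)
print(f"E = {E:.2f}")  # E = 2.22
```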
Confidence Intervals for the Population Mean
A c-confidence interval for the population
mean μ is
• x̄ – E < μ < x̄ + E, where E = zc · σ / √n
• The probability that the confidence interval
contains μ is c.
Part III: Confidence Intervals for μ
Find the 95% confidence interval for the one-way plane fare
from Atlanta to Chicago.
You found x̄ = 101.77 and E = 2.22.
Left endpoint: x̄ – E = 101.77 – 2.22 = 99.55
Right endpoint: x̄ + E = 101.77 + 2.22 = 103.99
99.55 < μ < 103.99
With 95% confidence, you can say the mean one-way fare
from Atlanta to Chicago is between $99.55 and $103.99.
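Both endpoints can be obtained in one step from the sample statistics. A minimal sketch, assuming the slide's values (x̄ = 101.77, s = 6.69, n = 35, zc = 1.96); the name `z_interval` is illustrative only.

```python
from math import sqrt

def z_interval(x_bar, s, n, z_c):
    """Return (left, right) endpoints of the interval x_bar - E < mu < x_bar + E."""
    E = z_c * s / sqrt(n)
    return x_bar - E, x_bar + E

left, right = z_interval(101.77, 6.69, 35, 1.96)
print(f"{left:.2f} < mu < {right:.2f}")  # 99.55 < mu < 103.99
```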
How could we get closer?
[Figure: the interval from 99.55 to 103.99, centered on the point estimate 101.77.]
Two ways to get a narrower confidence interval:
• Lower the confidence level (e.g. 75%)
• Take a bigger sample
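Both effects are visible directly in E = zc · s / √n. The sketch below reuses s = 6.69 and n = 35 from the airfare example; the larger sample size of 140 is a made-up value chosen only to show that quadrupling n halves E.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(c, s, n):
    """E = z_c * s / sqrt(n), with z_c computed from the confidence level c."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    return z_c * s / sqrt(n)

s = 6.69
baseline = margin_of_error(0.95, s, 35)          # original setting
lower_confidence = margin_of_error(0.75, s, 35)  # same data, 75% confidence
bigger_sample = margin_of_error(0.95, s, 140)    # hypothetical sample 4x as large

print(f"95%, n=35:  E = {baseline:.2f}")
print(f"75%, n=35:  E = {lower_confidence:.2f}")
print(f"95%, n=140: E = {bigger_sample:.2f}")
```

The trade-off is that a lower confidence level makes the interval less likely to contain μ, while a bigger sample costs more to collect.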
Sample Size
Given a c-confidence level and a
maximum error of estimate, E, the minimum
sample size n needed to estimate μ, the
population mean, is
n = (zc · σ / E)²
Part IV: Sample Size
You want to estimate the mean one-way fare from Atlanta to
Chicago. How many fares must be included in your sample if
you want to be 95% confident that the sample mean is
within $2 of the population mean?
Using zc = 1.96, σ ≈ s = 6.69 and E = 2:
n = (zc · σ / E)² = (1.96 · 6.69 / 2)² ≈ 42.98
You should include at least 43 fares in your sample. Since
you already have 35, you need 8 more.
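The sample-size calculation always rounds up, because rounding down would give an error larger than E. A minimal sketch with the slide's inputs; `min_sample_size_mean` is a hypothetical name.

```python
from math import ceil

def min_sample_size_mean(z_c, sigma, E):
    """Smallest n with z_c * sigma / sqrt(n) <= E, i.e. n >= (z_c * sigma / E)^2."""
    return ceil((z_c * sigma / E) ** 2)

n_needed = min_sample_size_mean(1.96, 6.69, 2)
print(n_needed)        # 43
print(n_needed - 35)   # 8 additional fares beyond the 35 already sampled
```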
Section 6.2
Confidence Intervals for the Mean (small samples)
What happens if we don’t have 30 observations?
Normal or t-Distribution?
Is n ≥ 30?
– Yes: Use the normal distribution with E = zc · σ / √n. If σ is unknown, use s instead.
– No: Is the population normally, or approximately normally, distributed?
   – No: Cannot use the normal distribution or the t-distribution.
   – Yes: Is σ known?
      – Yes: Use the normal distribution with E = zc · σ / √n.
      – No: Use the t-distribution with E = tc · s / √n and n – 1 degrees of freedom.
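The decision steps in the flowchart can be sketched as a small function. The function name, argument names and return strings are illustrative assumptions; the branching follows the flowchart.

```python
def choose_distribution(n, population_normal, sigma_known):
    """Follow the flowchart: return 'normal', 't', or 'neither'."""
    if n >= 30:
        # Large sample: normal distribution (use s in place of sigma if needed).
        return "normal"
    if not population_normal:
        # Small sample from a population not known to be (approximately) normal.
        return "neither"
    # Small sample from a normal population: depends on whether sigma is known.
    return "normal" if sigma_known else "t"

print(choose_distribution(35, population_normal=False, sigma_known=False))  # normal
print(choose_distribution(13, population_normal=True, sigma_known=False))   # t
```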
Comparing three curves:
– The standard normal curve
– The t curve with 14 degrees of freedom
– The t curve with 4 degrees of freedom
The t-Distribution
If the distribution of a random variable x is normal
and n < 30, then the sampling distribution of x̄ is a
t-distribution with n – 1 degrees of freedom.
[Figure: sampling distribution for n = 13, d.f. = 12, c = 90%. The area between t = -1.782 and t = 1.782 is 0.90, with 0.05 in each tail.]
The critical value for t is 1.782. 90% of the sample means
(n = 13) will lie between t = -1.782 and t = 1.782.
Confidence Interval–Small Sample
In a random sample of 13 American adults, the mean
waste recycled per person per day was 4.3 pounds and
the standard deviation was 0.3 pound. Assume the
variable is normally distributed and construct a 90%
confidence interval for μ.
1. The point estimate is x̄ = 4.3 pounds.
2. The maximum error of estimate is E = tc · s / √n, which requires the critical value tc.
Finding tc
If c = 0.90 and n = 13, then d.f. = n – 1 = 12.
tc = ? Look up the critical value in a t-table:
http://surfstat.anu.edu.au/surfstat-home/tables/t.php
Confidence Interval–Small Sample
1. The point estimate is x̄ = 4.3 pounds.
2. The maximum error of estimate is E = tc · s / √n = 1.782 · 0.3 / √13 ≈ 0.148.
Left endpoint: 4.3 – 0.148 = 4.152
Right endpoint: 4.3 + 0.148 = 4.448
4.15 < μ < 4.45
With 90% confidence, you can say the mean waste
recycled per person per day is between 4.15 and 4.45
pounds.
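The small-sample interval follows the same pattern as the large-sample one, with tc in place of zc. The Python standard library has no t-distribution, so this sketch takes the table value tc = 1.782 (d.f. = 12, c = 0.90) from the slide as an input; `t_interval` is a hypothetical helper name.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_c):
    """Return (left, right, E) for x_bar - E < mu < x_bar + E with E = t_c * s / sqrt(n).

    t_c must be looked up for n - 1 degrees of freedom at the chosen confidence level.
    """
    E = t_c * s / sqrt(n)
    return x_bar - E, x_bar + E, E

left, right, E = t_interval(x_bar=4.3, s=0.3, n=13, t_c=1.782)
print(f"E = {E:.3f}")                      # E = 0.148
print(f"{left:.3f} < mu < {right:.3f}")    # 4.152 < mu < 4.448
```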
See Pg 329 of textbook
1. The Graduate Management Admission Test (GMAT) is a test required for
admission into many masters of business administration (MBA) programs.
Total scores on the GMAT are normally distributed and historically have a
standard deviation of 113. Suppose a random sample of 8 students took
the test, and their scores are recorded.
2. Sean is estimating the average number of Christmas Trees he will find in
the windows of each store in the mall. He observes each of the 10 stores
in the mall and records a sample mean of 15 trees with a standard
deviation of 6.
3. Patrick wonders about the average number of servings of eggnog at the
Holiday Party. He knows that typically this variable has a standard
deviation of 2.2 servings. He records a sample mean of 4 servings for a
sample of 50 people.
1. The Graduate Management Admission Test (GMAT) is a test required for
admission into many masters of business administration (MBA)
programs. Total scores on the GMAT are normally distributed and
historically have a standard deviation of 113. Suppose a random sample
of 8 students took the test, and their scores are recorded. (We know the
population is normally distributed and sigma is known, so we can use Z
even though n < 30.)
2. Sean is estimating the average number of Christmas trees he will find in
the windows of each store in the mall. He observes each of the 10 stores
in the mall and records a sample mean of 15 trees with a standard
deviation of 6. (We have a small sample and do not know sigma, so we
must use t with 9 degrees of freedom. This is valid only if we can assume
the population is approximately normally distributed.)
3. Patrick wonders about the average number of servings of eggnog at the
Holiday Party. He knows that typically this variable has a standard
deviation of 2.2 servings. He records a sample mean of 4 servings for a
sample of 50 people. (We do not know whether the population is normally
distributed, but n is at least 30 and we know sigma is 2.2, so we can use Z.)
Section 6.3
Confidence Intervals for Population Proportions
What if we are interested in a
population proportion or percentage?
For example: What percentage of the population
likes spinach?
Confidence Intervals for
Population Proportions
The point estimate for p, the population proportion of
successes, is given by the proportion of successes
in a sample:
p̂ = x / n (read as p-hat), where x is the number of successes in the sample.
q̂ = 1 – p̂ is the point estimate for the proportion of failures.
Required condition: If np ≥ 5 and nq ≥ 5, the
sampling distribution for p-hat is approximately normal.
Confidence Intervals for Population
Proportions
The maximum error of estimate, E, for a c-confidence
interval is:
E = zc · √(p̂q̂ / n)
A c-confidence interval for the population
proportion, p, is
p̂ – E < p < p̂ + E
Confidence Interval for p
In a study of 1907 fatal traffic accidents, 449 were
alcohol related. Construct a 99% confidence interval
for the proportion of fatal traffic accidents that are
alcohol related.
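1. The point estimate for p is p̂ = x / n = 449 / 1907 ≈ 0.235, so q̂ = 1 – 0.235 = 0.765.
2. 1907(.235) > 5 and 1907(.765) > 5, so the
sampling distribution is approximately normal.
3. Using zc = 2.575, E = zc · √(p̂q̂ / n) = 2.575 · √((0.235)(0.765) / 1907) ≈ 0.025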
Left endpoint: p̂ – E = 0.235 – 0.025 = 0.21
Right endpoint: p̂ + E = 0.235 + 0.025 = 0.26
0.21 < p < 0.26
With 99% confidence, you can say the
proportion of fatal accidents that are
alcohol related is between 21% and 26%.
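The three steps for a proportion interval can be sketched as one function. The critical value 2.575 is the rounded table value for 99% confidence and is passed in as an assumption; `proportion_interval` is an illustrative name.

```python
from math import sqrt

def proportion_interval(x, n, z_c):
    """Return (p_hat, E, left, right) for a confidence interval on a proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Required condition for the normal approximation.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    E = z_c * sqrt(p_hat * q_hat / n)
    return p_hat, E, p_hat - E, p_hat + E

p_hat, E, left, right = proportion_interval(x=449, n=1907, z_c=2.575)
print(f"p-hat = {p_hat:.3f}, E = {E:.3f}")   # p-hat = 0.235, E = 0.025
print(f"{left:.2f} < p < {right:.2f}")       # 0.21 < p < 0.26
```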
Minimum Sample Size
If you have a preliminary estimate for p and q, the
minimum sample size needed to estimate p, given a
c-confidence level and a maximum error of estimate E, is:
n = p̂q̂ (zc / E)²
If you do not have a preliminary estimate, use
0.5 for both p̂ and q̂.
Example–Minimum Sample Size
You wish to estimate the proportion of fatal accidents
that are alcohol related at a 99% level of confidence.
Find the minimum sample size needed to be
accurate to within 2% of the population proportion.
With no preliminary estimate use 0.5 for p̂ and q̂:
n = (0.5)(0.5)(2.575 / 0.02)² ≈ 4144.14
You will need at least 4145 for your sample.
Example–Minimum Sample Size
You wish to estimate the proportion of fatal accidents
that are alcohol related at a 99% level of confidence.
Find the minimum sample size needed to be
accurate to within 2% of the population proportion.
Use a preliminary estimate of p = 0.235.
n = (0.235)(0.765)(2.575 / 0.02)² ≈ 2980.05
With a preliminary estimate you need at least
n = 2981 for your sample.
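Both sample-size examples use the same formula with different estimates of p. A minimal sketch, assuming the rounded table value zc = 2.575 for 99% confidence; `min_sample_size_proportion` is a hypothetical name, and its default of 0.5 reflects the no-preliminary-estimate case.

```python
from math import ceil

def min_sample_size_proportion(z_c, E, p_hat=0.5):
    """Smallest n satisfying n >= p_hat * q_hat * (z_c / E)^2."""
    q_hat = 1 - p_hat
    return ceil(p_hat * q_hat * (z_c / E) ** 2)

print(min_sample_size_proportion(2.575, 0.02))          # 4145 (no preliminary estimate)
print(min_sample_size_proportion(2.575, 0.02, 0.235))   # 2981 (preliminary p-hat = 0.235)
```

Using 0.5 gives the largest possible value of p̂q̂, so it is the conservative choice when nothing is known about p.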
Example #1 (pg 310)
Market researchers use the number of sentences per
advertisement as a measure of readability for magazine
advertisements. Suppose for the 50 advertisements we
determine that the average number of sentences (xbar)
is 12.4 and the standard deviation is 5.0:
Compute the 95% confidence interval for the mean mu.
– Question #1: Do we know the population standard deviation?
– Question #2: What is the interval?
Solution
xbar = 12.4
s = 5.0
n = 50
c = 0.95
Zc = 1.96 (for c = 0.95)
E = 1.96 * 5 / √50 = 1.4
12.4 – 1.4 < μ < 12.4 + 1.4
11.0 < μ < 13.8
Answer: We are 95% confident that the mean number of
sentences in the POPULATION is between 11.0 and 13.8.
Example #2 (Pg. 327)
You randomly select 16 coffee shops and measure the
temperature of the coffee sold at each. The sample mean
temperature is 162 degrees with a standard deviation of 10
degrees. You know that the temperature is
normally distributed.
A. Find the 95% confidence interval for the mean temperature.
B. Find the 99% confidence interval for the mean temperature.
95% Confidence Interval
xbar = 162.0
s = 10.0
n = 16
c = 0.95
tc = 2.131 (df = 15, c = .95)
E = 2.131 * 10 / √16 = 5.3
162 – 5.3 < μ < 162 + 5.3
156.7 < μ < 167.3
Answer: We are 95% confident that the average
temperature of all the coffee in the POPULATION is
between 156.7 and 167.3 degrees.
What about 99%?
xbar = 162.0
s = 10.0
n = 16
c = 0.99
tc = 2.947 (df = 15, c = .99)
E = 2.947 * 10 / √16 = 7.4
162 – 7.4 < μ < 162 + 7.4
154.6 < μ < 169.4
Answer: We are 99% confident that the average
temperature of all the coffee in the POPULATION is
between 154.6 and 169.4 degrees.