Math 227_Sullivan 4th ed Ans Key

Chapter 9
Estimating the Value of a Parameter
Chapter 9.1 Estimating a Population Proportion
Objective A : Point Estimate
A point estimate is the value of a statistic that estimates the value of a parameter.
The best point estimate of the population proportion is the sample proportion, $\hat{p} = \frac{x}{n}$.
The best point estimate of the population mean is the sample mean, $\bar{x} = \frac{\sum x}{n}$.
Since p̂ varies from sample to sample, we use an interval based on p̂ to capture the unknown
population proportion with a level of confidence.
Objective B : Confidence Interval
A confidence interval for an unknown parameter consists of an interval of numbers based on
a point estimate.
The level of confidence represents the expected proportion of intervals that will contain the
parameter if a large number of different samples is obtained.
The level of confidence is denoted as $(1 - \alpha) \cdot 100\%$. The level of confidence controls the width
of the interval.
Confidence interval estimates for a parameter are of the form:
Point estimate ± margin of error.
Confidence interval for p:
$\hat{p} \pm Z_{\alpha/2} \cdot \sigma_{\hat{p}}$ where $\sigma_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, provided that $n\hat{p}(1 - \hat{p}) \ge 10$.
The value of $Z_{\alpha/2}$ is called the critical value of the distribution.
The margin of error, E, in a $(1 - \alpha) \cdot 100\%$ confidence interval for a population proportion is
given by $E = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$. The width of the interval is determined by the margin of error.
Example 1: Use StatCrunch to determine the critical value $Z_{\alpha/2}$ that corresponds to the given level
of confidence.
(a) 90%
(b) 95%
(c) 98%
(d) 92%
(a) A 90% level of confidence corresponds to the middle 90% of the area under the standard normal distribution.
Open StatCrunch → Select Stat → Calculator → Normal → Between, µ = 0, σ = 1 →
Input P (____≤X ≤ _____) = 0.90→ Compute
𝑍𝛼/2 = 1.645
(b) 95% level of confidence
Open StatCrunch → Select Stat → Calculator → Normal → Between, µ = 0, σ = 1 →
Input P (____≤ X ≤ _____) = 0.95→ Compute
𝑍𝛼/2 = 1.96
(c) 98% level of confidence
Similarly --> Open StatCrunch → Select Stat → Calculator → Normal → Between, µ = 0, σ = 1
→ Input P (____≤ X ≤ _____) = 0.98→ Compute
𝑍𝛼/2 = 2.33
(d) 92% level of confidence
Open StatCrunch → Select Stat → Calculator → Normal → Between→
Input P (____≤ X ≤ _____) = 0.92→ Compute
𝑍𝛼/2 = 1.75
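The StatCrunch lookups above can be cross-checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_critical` is a hypothetical helper, not something from the notes. It relies on the fact that a middle area of 1 − α leaves α/2 in each tail, so the critical value is the standard normal quantile at 1 − α/2.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return the critical value Z_{alpha/2} for a two-sided confidence level.

    The middle `confidence` area leaves alpha/2 in each tail, so the critical
    value is the standard normal quantile at 1 - alpha/2.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(1.0 - alpha / 2.0)


if __name__ == "__main__":
    # The four confidence levels from Example 1.
    for level in (0.90, 0.95, 0.98, 0.92):
        print(f"{level:.0%} confidence: Z = {z_critical(level):.4f}")
```

Rounded to the precision used in the notes, the results should agree with the StatCrunch values in parts (a) to (d).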
Example 2: Determine the margin of error for p with x = 540 and n = 900 at a 99% level of confidence.
$\hat{p} = \frac{x}{n} = \frac{540}{900} = 0.60$
$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Open StatCrunch → Select Stat → Calculator → Normal → Between →
Input P (____≤ X ≤ _____) = 0.99 → Compute
𝑍𝛼/2 = 2.58
$E = 2.58 \sqrt{\frac{(0.60)(0.40)}{900}} \approx 0.0421$
Example 3: A Rasmussen Reports national survey of 1000 adult Americans found that 18% dreaded
Valentine's Day. Construct a 95% confidence interval for the population proportion of adult
Americans who dread Valentine's Day. Explain what the interval means.
n = 1000
p̂ = 0.18
95% confidence
Method I --> Use the C. I. formula --> 𝑝̂ ± 𝐸
$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Open StatCrunch → Select Stat → Calculator → Normal → Between→
Input P (____≤ X ≤ _____) = 0.95 → Compute
𝑍𝛼/2 = 1.96
$E = 1.96 \sqrt{\frac{(0.18)(0.82)}{1000}} = 1.96 \sqrt{\frac{0.1476}{1000}} \approx 0.0238$
The confidence interval is 𝑝̂ ± 𝐸 = 0.18 ± 0.0238 = (0.1562, 0.2038).
This interval means that we are 95% confident that between 15.62% and 20.38% of adult Americans
dread Valentine's Day.
Method II --> Use StatCrunch to find the C. I.
Open StatCrunch → Stat → Proportion Stats → One Sample → with summary → Input number
of successes --> 180 (18% of 1000) and observations --> 1000, select confidence interval and
input 95% → compute and record results
Example 4: Construct a confidence interval of the population proportion at the given level of confidence.
x = 80, n = 200, 96% confidence
$\hat{p} = \frac{x}{n} = \frac{80}{200} = 0.4$
Method I --> Use the C. I. formula --> 𝑝̂ ± 𝐸
Open StatCrunch → Select Stat → Calculator → Normal → Between →
Input P (___≤X ≤ _____) = 0.96 → Compute
𝑍𝛼/2 ≈ 2.05
$E = Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 2.05 \sqrt{\frac{(0.4)(0.6)}{200}} = 2.05 \sqrt{\frac{0.24}{200}} = 2.05 \times 0.0346 \approx 0.071$
The 96% confidence interval is 𝑝̂ ± 𝐸 = 0.40 ± 0.071 = (0.329, 0.471).
Method II --> Use StatCrunch to find the C. I.
Open StatCrunch → Stat → Proportion Stats → One Sample → with summary → Input number
of successes --> 80 and observations --> 200, select confidence interval and input 96% →
compute and record results
Example 5: In a study of 1228 randomly selected medical malpractice lawsuits, it is found that 856
of them were later dropped or dismissed.
(a) What is the best point estimate of the proportion of medical malpractice lawsuits that
are dropped or dismissed?
The best point estimate of 𝑝 is 𝑝̂.
$\hat{p} = \frac{856}{1228} \approx 0.6971$
(b) Use StatCrunch to construct a 99% confidence interval for the population proportion of
medical malpractice lawsuits that are dropped or dismissed?
Open StatCrunch → Stat → Proportion Stats → One Sample → with summary → Input number
of successes --> 856 and observations --> 1228, select confidence interval and input 99% →
compute and record results
Proportion   Count   Total   Sample Prop.   Std. Err.     L. Limit     U. Limit
p            856     1228    0.6970684      0.013113264   0.66329087   0.73084593
(c) Interpret the interval.
The 99% confidence interval for p is (0.6633, 0.7308). We are 99% confident
that the percentage of malpractice lawsuits that are dropped or dismissed is between 66.33% and
73.08%.
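The interval in Example 5 can also be reproduced directly from the formula 𝑝̂ ± 𝐸. The sketch below uses only the Python standard library; `proportion_confidence_interval` is a hypothetical helper name, and it uses the unrounded critical value rather than the rounded table value, so results may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x: int, n: int, confidence: float):
    """Return (p_hat, margin_of_error, lower, upper) for a population proportion."""
    p_hat = x / n
    # Requirement from the notes: n * p_hat * (1 - p_hat) >= 10.
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("n * p_hat * (1 - p_hat) must be at least 10")
    # Critical value Z_{alpha/2}: standard normal quantile at 1 - alpha/2.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error E = Z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n).
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


if __name__ == "__main__":
    # Example 5: 856 of 1228 lawsuits dropped or dismissed, 99% confidence.
    p_hat, margin, lower, upper = proportion_confidence_interval(856, 1228, 0.99)
    print(f"p_hat = {p_hat:.4f}, E = {margin:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

The limits should agree with the StatCrunch L. Limit and U. Limit columns to about four decimal places.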
Objective C : Sample Size Needed for Estimating the Population Proportion p
The sample size required to obtain a $(1 - \alpha) \cdot 100\%$ confidence interval for p with a
margin of error E is given by
$n = \hat{p}(1 - \hat{p}) \left( \frac{Z_{\alpha/2}}{E} \right)^2$ (round up to the next integer),
where p̂ is a prior estimate of p.
If a prior estimate of p is unavailable, the sample size required is
$n = 0.25 \left( \frac{Z_{\alpha/2}}{E} \right)^2$ (round up to the next integer).
Example 1 : An urban economist wishes to estimate the proportion of Americans who own
their homes. What size sample should be obtained if he wishes the estimate to be
within 0.02 with 90% confidence if
(a) he uses a 2010 estimate of 0.669 obtained from the U.S Census Bureau?
𝑝̂ = 0.669 E = 0.02 90% confidence
Use StatCrunch to find 𝑍𝛼/2 for a 90% level of confidence → 𝑍𝛼/2 = 1.645
$n = \hat{p}(1 - \hat{p}) \left( \frac{Z_{\alpha/2}}{E} \right)^2 = (0.669)(1 - 0.669) \left( \frac{1.645}{0.02} \right)^2 = (0.669)(0.331) \left( \frac{1.645}{0.02} \right)^2 \approx 1498.0487$
Round up to the next whole number --> 1499.
(b) he does not use any prior estimates?
$n = 0.25 \left( \frac{Z_{\alpha/2}}{E} \right)^2 = 0.25 \left( \frac{1.645}{0.02} \right)^2 = 1691.265625$
Round up to the next whole number --> 1692.
Example 2: In a Gallup poll conducted in October 2010, 64% of the people polled answered
"more strict" to the following question: "Do you feel that the laws covering the sale
of firearms should be made more strict as they are now?" Suppose the margin of
error in the poll was 3.5% and the estimate was made with 95% confidence. At
least how many people were surveyed?
p̂ = 0.64
E = 0.035
95% confidence
95% level of confidence → 𝑍𝛼/2 = 1.96
$n = \hat{p}(1 - \hat{p}) \left( \frac{Z_{\alpha/2}}{E} \right)^2 = (0.64)(1 - 0.64) \left( \frac{1.96}{0.035} \right)^2 = (0.64)(0.36) \left( \frac{1.96}{0.035} \right)^2$
= 722.5344
Round up to 723 people.
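The sample size calculations in Examples 1 and 2 follow one pattern, which can be sketched as a small function. The name `sample_size_for_proportion` is hypothetical, and the critical value is passed in already rounded (1.645, 1.96) to mirror the hand calculations; because the result is always rounded up, an unrounded critical value can occasionally give a sample size that differs by one.

```python
from math import ceil
from typing import Optional


def sample_size_for_proportion(z: float, margin: float, p_hat: Optional[float] = None) -> int:
    """Sample size n = p_hat * (1 - p_hat) * (z / E)^2, rounded up.

    With no prior estimate, p_hat * (1 - p_hat) is replaced by 0.25,
    its largest possible value (reached at p_hat = 0.5).
    """
    spread = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(spread * (z / margin) ** 2)


if __name__ == "__main__":
    print(sample_size_for_proportion(1.645, 0.02, p_hat=0.669))  # Example 1(a)
    print(sample_size_for_proportion(1.645, 0.02))               # Example 1(b)
    print(sample_size_for_proportion(1.96, 0.035, p_hat=0.64))   # Example 2
```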
Example 3: A Gallup poll conducted in November 2010 found that 493 of 1050 adult Americans
believe it is the responsibility of the federal government to make sure all Americans
have healthcare coverage.
(a) Obtain a point estimate for the proportion of adult Americans who believe it is the
responsibility of the federal government to make sure all Americans have health care coverage.
The best point estimate for p is 𝑝̂ .
$\hat{p} = \frac{493}{1050} \approx 0.470$
(b) Verify the requirements for constructing a confidence interval for p are satisfied.
$n\hat{p}(1 - \hat{p}) \ge 10$
(1050)(0.470)(1 − 0.470) = (1050)(0.470)(0.530) ≈ 262 ≥ 10
Yes, the requirement is satisfied, so 𝑝̂ is approximately normally distributed.
(c) Use StatCrunch to construct a 95% confidence interval for the proportion of adult
Americans who believe it is the responsibility of the federal government to make sure
all Americans have healthcare coverage. Interpret the interval.
Open StatCrunch → Stat → Proportion Stats → One Sample → with summary →Input number
of successes - 493 and observations – 1050, select confidence interval and input 95% → compute
and record results
Proportion   Count   Total   Sample Prop.   Std. Err.     L. Limit     U. Limit
p            493     1050    0.46952381     0.015401645   0.43933714   0.49971048
The 95% confidence interval for p is (0.4393, 0.4997). We are 95% confident
that the percentage of adult Americans who believe it is the responsibility of the federal government to
make sure that all Americans have healthcare coverage is between 43.93% and 49.97%.
(d) You wish to conduct your own study for the proportion of adult Americans who believe it is
the responsibility of the federal government to make sure all Americans have healthcare
coverage. What sample size would be needed for the estimate to be within 3 percentage
points with 90% confidence if you use the estimate obtained in part (a)?
Use StatCrunch to estimate 𝑛 for 𝑝.
Open StatCrunch → Stat → Proportion Stats → One Sample → Power/Sample Size →
Confidence Interval → Input confidence level: 0.90, Target Proportion: 0.470 (𝑝̂ = 493/1050 ≈
0.470) and Width: 0.06 (𝐸 = 0.03, Total Width = 2𝐸) → compute and record results.
The sample size needed to be within 3 percentage points is 749.
(e) You wish to conduct your own study for the proportion of adult Americans who believe it
is the responsibility of the federal government to make sure all Americans have healthcare
coverage. What sample size would be needed for the estimate to be within 3 percentage
points with 90% confidence if you do not have a prior estimate?
Open StatCrunch → Stat → Proportion Stats → One Sample → Power/Sample Size →
Confidence Interval → Input confidence level: 0.90, Target Proportion: 0.50 (with no prior estimate
of 𝑝, we use 𝑝 = 0.5) and Width: 0.06 (𝐸 = 0.03, Total Width = 2𝐸) → compute and record
results.
The sample size needed to be within 3 percentage points is 752.
Chapter 9.2 Estimating a Population Mean
Objective A : Point Estimate
The best point estimate of the population mean, μ, is the sample mean, x̄.
Objective B : Student's t-distribution
Properties of the t-distribution
1. The t-distribution is different for different degrees of freedom (df = n − 1).
2. The t-distribution has the same general symmetric bell shape as the standard normal
distribution, but its area in the tails is a little greater than the area in the tails of the standard
normal distribution due to the greater variability that is expected with small samples.
3. The t-distribution has a mean of t = 0 at the center of the distribution.
4. As the sample size n gets larger, the t-distribution gets closer to the standard normal
distribution.
Example 1: Use StatCrunch to determine the t-value.
(a) Find the t-value such that the area in the right tail is 0.05 with 19 degrees of freedom.
Open StatCrunch → Select Stat → Calculator → T → Standard → Input DF = 19 and P (X ≥ ___)
= 0.05 → compute and record results.
t-value = 1.7291328
(b) Find the t-value such that the area left of the t-value is 0.02 with 6 degrees of freedom.
Open StatCrunch → Select Stat → Calculator → T → Standard → Input DF = 6 and P (X ≤ ___) =
0.02 → compute and record results.
t-value ≈ -2.612
(c) Find the critical t-value that corresponds to 90% confidence. Assume 12 degrees of
freedom.
Open StatCrunch → Select Stat → Calculator → T → Between → Input DF = 12 and
P (___ ≤ X ≤ ___) = 0.90 → compute and record results.
t-value = ±1.7823
In general, the population standard deviation is unknown when estimating a population mean based
on a sample mean. The t-distribution is used to offset the additional variability introduced by using
s in place of σ.
Objective C : Confidence Interval for a Population Mean
Constructing a $(1 - \alpha) \cdot 100\%$ Confidence Interval for μ
Point estimate ± margin of error
$\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$ where $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$,
provided the data come from a population that is normally distributed, or the sample size is
large.
Example 1: A simple random sample of size n < 30 has been obtained. From the normal probability
plot and boxplot, judge whether a t-interval should be constructed.
(a)
All the data lie within the bounds of the normal probability plot, indicating
the data could come from a population that is normally distributed.
The boxplot is roughly symmetric with no outliers. The
requirements for constructing a t-interval are satisfied.
(b)
One data point lies outside the bounds of the normal probability plot, indicating the data may
not come from a population that is normally distributed.
The boxplot is skewed to the left. A t-interval should not be used.
Example 2: A simple random sample of size n is drawn from a population that is normally distributed.
The sample mean, x̄, is found to be 50, and the sample standard deviation, s, is found to be 8.
(a) Use StatCrunch to construct a 98% confidence interval for μ if the sample size, n, is 20.
Open StatCrunch→ Select Stat → T Stats → One sample → with summary → Input Sample
mean 50, Sample std. dev. 8, Sample size 20
98% confidence interval results:
μ : Mean of population
Mean   Sample Mean   Std. Err.   DF   L. Limit    U. Limit
μ      50            1.7888544   19   45.457234   54.542766
𝐸 = (𝑈. 𝐿𝑖𝑚𝑖𝑡 − 𝐿. 𝐿𝑖𝑚𝑖𝑡) ÷ 2 = (54.542766 - 45.457234)/2 = 4.542766
(b) Use StatCrunch to construct a 98% confidence interval for μ if the sample size, n, is 15.
How does decreasing the sample size affect the margin of error, E?
98% confidence interval results:
μ : Mean of population
Mean   Sample Mean   Std. Err.   DF   L. Limit    U. Limit
μ      50            2.0655911   14   44.578868   55.421132
𝐸 = (𝑈. 𝐿𝑖𝑚𝑖𝑡 − 𝐿. 𝐿𝑖𝑚𝑖𝑡) ÷ 2 = (55.421132 - 44.578868)/2 = 5.421132
Decreasing the sample size increases the margin of error.
(c) Construct a 95% confidence interval for μ if the sample size, n, is 20.
95% confidence interval results:
μ : Mean of population
Mean   Sample Mean   Std. Err.   DF   L. Limit    U. Limit
μ      50            1.7888544   19   46.255885   53.744115
Compare the results to those obtained in part (a).
How does decreasing the level of confidence affect the margin of error, E ?
𝐸 = (𝑈. 𝐿𝑖𝑚𝑖𝑡 − 𝐿. 𝐿𝑖𝑚𝑖𝑡) ÷ 2 = (53.744115 - 46.255885)/2 = 3.744115
Decreasing the level of confidence decreases the margin of error.
(d) Could we have computed the confidence intervals in parts (a) to (c) if the population
had not been normally distributed? Why?
No. With a small sample (𝑛 < 30), a t-interval requires that the sample data be
obtained from a population that is normally distributed, so that the sampling
distribution of x̄ is approximately normal.
(Recall: The t-distribution has the same general symmetric bell shape as the standard normal distribution.)
If the population had not been normally distributed, there would be no guarantee that
x̄ is approximately normally distributed with 𝑛 = 15 or 20 (small sample sizes).
Thus we could not have computed the t-intervals.
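The margins of error in Example 2 can be checked against the formula E = t·s/√n. The sketch below is an illustration only: the standard library has no t-distribution, so the critical values are passed in by hand, and the three values used (2.539, 2.624, 2.093) are assumed entries from a standard t-table rather than numbers given in these notes. The helper name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Return (margin_of_error, lower, upper) for x_bar ± t_crit * s / sqrt(n)."""
    std_err = s / sqrt(n)       # standard error of the sample mean
    margin = t_crit * std_err   # E = t_{alpha/2} * s / sqrt(n)
    return margin, x_bar - margin, x_bar + margin


if __name__ == "__main__":
    # Critical values are assumed standard t-table entries, not values from the notes.
    cases = [
        ("98%, n = 20", 20, 2.539),  # t_{0.01} with 19 degrees of freedom
        ("98%, n = 15", 15, 2.624),  # t_{0.01} with 14 degrees of freedom
        ("95%, n = 20", 20, 2.093),  # t_{0.025} with 19 degrees of freedom
    ]
    for label, n, t_crit in cases:
        margin, lower, upper = t_interval(50, 8, n, t_crit)
        print(f"{label}: E = {margin:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Because the table values are rounded to three decimals, the results match the StatCrunch limits only to about two or three decimal places.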
Example 3: Determine the point estimate of the population mean and margin of error for the
following confidence interval.
Lower bound: 5
Upper bound: 23
The point estimate of the population mean is (5+23)/2 = 28/2 = 14 <-- Similar to finding
the midpoint between two endpoints.
The margin of error is (23-5)/2 = 18/2 = 9
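The same midpoint and half-width arithmetic applies to any symmetric interval of the form point estimate ± margin of error. A minimal sketch, with `estimate_and_margin` as a hypothetical helper name:

```python
def estimate_and_margin(lower: float, upper: float):
    """Recover (point_estimate, margin_of_error) from confidence interval bounds."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    point_estimate = (lower + upper) / 2  # midpoint of the interval
    margin = (upper - lower) / 2          # half the interval width
    return point_estimate, margin


if __name__ == "__main__":
    print(estimate_and_margin(5, 23))
```

This only works for intervals that are symmetric about the point estimate, such as the proportion and mean intervals in this chapter.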
Example 4 : How much time do Americans spend eating or drinking? Suppose for a random
sample of 1001 Americans age 15 or older, the mean amount of time spent eating
or drinking per day is 1.22 hours with a standard deviation of 0.65 hour.
(a) A histogram of time spent eating and drinking each day is skewed right. Use this
result to explain why a large sample size is needed to construct a confidence interval
for the mean time spent eating and drinking each day.
Since the population distribution is not normal, x̄ is approximately
normally distributed only when the sample size is large (𝑛 ≥ 30). Thus a large sample size is necessary for
the sample mean amount of time spent eating or drinking per day to be approximately normally distributed.
(b) Use StatCrunch to construct a 95% confidence interval for the mean amount of
time Americans age 15 or older spend eating and drinking each day.
Interpret the interval.
95% confidence interval results:
μ : Mean of population
Standard deviation = 0.65
Mean   n      Sample Mean   Std. Err.     L. Limit    U. Limit
μ      1001   1.22          0.020544535   1.1797335   1.2602665
The 95% confidence interval for µ is (1.1797, 1.2603). We are 95%
confident that the mean amount of time Americans age 15 or older spend eating and
drinking each day is between 1.18 hours and 1.26 hours.
(c) Could the interval be used to estimate the mean amount of time a 9-year-old American
spends eating and drinking each day? Explain.
No. The sample consists only of Americans age 15 or older, so the interval does not apply to 9-year-olds.
Objective D : Determining the Sample Size n
The sample size required to estimate the population mean, μ, with a level of confidence
$(1 - \alpha) \cdot 100\%$ within a specified margin of error, E, is given by
$n = \left( \frac{Z_{\alpha/2} \cdot s}{E} \right)^2$
where n is rounded up to the nearest whole number.
Note: The t-distribution approaches the standard normal z-distribution as the sample size increases.
Example 1: A researcher wanted to determine the mean number of hours per week (Sunday through
Saturday) the typical person watches television. Results from the Sullivan Statistics
Survey indicate that s = 7.5 hours.
(a) How many people are needed to estimate the number of hours people watch television
per week within 2 hours with 95% confidence?
s = 7.5, E = 2, 95% level of confidence
Open StatCrunch → Stat → Z Stats → One Sample → Power/Sample Size → Select Confidence
Interval → Input confidence level: 0.95, standard deviation: 7.5, and width: 4 (total width = 2E) → compute and
record results.
55 people are needed to estimate the number of hours people watch television per week
within 2 hours with 95% confidence.
(b) How many people are needed to estimate the number of hours people watch television
per week within 1 hour with 95% confidence?
s = 7.5, E = 1, 95% level of confidence
Open StatCrunch → Stat → Z Stats → One Sample → Power/Sample Size → Select Confidence
Interval → Input confidence level: 0.95, standard deviation: 7.5, and width: 2 (total width = 2E) → compute and
record results.
217 people are needed to estimate the number of hours people watch television per week
within 1 hour with 95% confidence.
(c) What effect does doubling the required accuracy have on the sample size?
217/55 = 3.94545...
Doubling the required accuracy (halving E) approximately quadruples the required sample size.
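The two StatCrunch results and the factor of roughly four can be checked against the formula n = (Z·s/E)². A minimal sketch, assuming the rounded critical value 1.96 for 95% confidence; `sample_size_for_mean` is a hypothetical helper name.

```python
from math import ceil


def sample_size_for_mean(z: float, s: float, margin: float) -> int:
    """Sample size n = (z * s / E)^2, rounded up to the next whole number."""
    return ceil((z * s / margin) ** 2)


if __name__ == "__main__":
    n_within_2 = sample_size_for_mean(1.96, 7.5, 2)  # part (a): E = 2 hours
    n_within_1 = sample_size_for_mean(1.96, 7.5, 1)  # part (b): E = 1 hour
    print(n_within_2, n_within_1, n_within_1 / n_within_2)
```

Since E appears squared in the denominator, halving E multiplies the unrounded sample size by exactly four; the ratio of the rounded values is only approximately four.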
Chapter 9
Estimating a Population Standard Deviation (Supplementary Materials)
Objective A : Point Estimate
The best point estimate of the population variance, σ², is the sample variance, s².
Objective B : Chi-Square Distribution
Example 1: Use StatCrunch to find the critical values $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ for the given level of confidence
and sample size.
(a) 90% confidence, n = 23
Open StatCrunch → Stat → Calculators → Chi-Square → DF: 22 (n − 1) →
Between P(___ ≤ X ≤ ____) = 0.90 → compute and record.
The critical values are 12.338 and 33.924.
(b) 99% confidence, n = 15
Open StatCrunch → Stat → Calculators → Chi-Square → DF: 14 (n − 1) → Between P(___ ≤ X ≤ ____) =
0.99 → compute and record.
The critical values are 4.0747 and 31.3193.
Objective C : Confidence Interval for a Population Variance or Standard Deviation
$(1 - \alpha) \cdot 100\%$ of the values of $\chi^2$ will lie between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$.
(Recall: $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$)
To find a $(1 - \alpha) \cdot 100\%$ confidence interval about σ, take the square root of the lower bound and upper bound of the interval for σ².
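The interval limits follow from the statement above by solving the inequality for σ². A short derivation, assuming (as the examples in this section do) that the sample comes from a normally distributed population:

```latex
\begin{gather*}
P\left( \chi^2_{1-\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{\alpha/2} \right) = 1 - \alpha \\[4pt]
\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \\[4pt]
\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}
\end{gather*}
```

Taking reciprocals reverses the inequalities, which is why the larger critical value $\chi^2_{\alpha/2}$ appears in the lower limit and the smaller one in the upper limit.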
Example 1:
A simple random sample of size n is drawn from a population that is known to be normally
distributed. The sample variance, s², is determined to be 19.8.
(a) Use StatCrunch to construct a 95% confidence interval for σ² if the sample size, n, is 10.
Open StatCrunch → Stat → Variance Stats → One Sample → with summary → Sample
variance: 19.8, sample size: 10 → Confidence interval for σ²: 0.95 → compute and record the
results.
95% confidence interval results:
σ² : Variance of population
Variance   Sample Var.   DF   L. Limit   U. Limit
σ²         19.8          9    9.367722   65.99048
(b) Use StatCrunch to construct a 95% confidence interval for σ² if the sample size, n, is 25.
How does increasing the sample size affect the width of the interval?
Open StatCrunch → Stat → Variance Stats → One Sample → with summary → Sample
variance: 19.8, sample size: 25 → Confidence interval for σ²: 0.95 → compute and record the
results.
95% confidence interval results:
σ² : Variance of population
Variance   Sample Var.   DF   L. Limit   U. Limit
σ²         19.8          24   12.07192   38.319026
Increasing the sample size decreases the width of the interval for σ² from
n = 10 (9.3677, 65.9905) to n = 25 (12.0719, 38.3190).
(c) Use StatCrunch to construct a 99% confidence interval for σ² if the sample size, n, is 10.
Compare the results with those obtained in part (a). How does increasing the level of
confidence affect the width of the confidence interval?
Open StatCrunch → Stat → Variance Stats → One Sample → with summary → Sample
variance: 19.8, sample size: 10 → Confidence interval for σ²: 0.99 → compute and record the
results.
99% confidence interval results:
σ² : Variance of population
Variance   Sample Var.   DF   L. Limit    U. Limit
σ²         19.8          9    7.5542562   102.71291
Increasing the level of confidence increases the width of the interval for σ² from
95% confidence (9.3677, 65.9905) to 99% confidence (7.5543, 102.7129).
Example 2:
Travelers pay taxes for flying, car rentals, and hotels. The following data represent the total
travel tax for a 3-day business trip in eight randomly selected cities. It was verified that the
data are normally distributed. Use StatCrunch to construct a 90% confidence interval for the
standard deviation of the travel tax for a 3-day business trip. Interpret the interval.
Open StatCrunch → Stat → Input given data → Summary Statistics → Columns: Var1 →
Variance → compute and record the results
Summary statistics:
Column   Variance
var1     151.87187
Open StatCrunch → Stat → Variance Stats → One Sample → with summary →
Sample variance: 151.87187, sample size: 8 → Confidence interval for σ²: 0.90 →
compute and record the results.
Alternative way: Open StatCrunch → Stat → Input given data → Variance Stats →
One Sample → with data → Columns: Var1 → Confidence interval for σ²: 0.90 →
compute and record the results.
90% confidence interval results:
σ² : Variance of population
Variance   Sample Var.   DF   L. Limit    U. Limit
σ²         151.87187     7    75.573504   490.50829
Manually compute the square root of each limit.
√75.573504 = 8.69330224943 (Lower Limit)
√490.50829 = 22.1474217461 (Upper Limit)
We are 90% confident that the standard deviation of the travel tax for a 3-day
business trip is between 8.693 and 22.147.
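As a rough cross-check of the StatCrunch output, the limits can be computed by dividing (n − 1)s² by the chi-square critical values and then taking square roots. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the critical values 2.167 and 14.067 (7 degrees of freedom, 90% confidence) are assumed entries from a standard chi-square table, not values stated in these notes.

```python
from math import sqrt


def variance_interval(s2: float, n: int, chi2_lower: float, chi2_upper: float):
    """Return (lower, upper) confidence limits for the population variance.

    chi2_lower is the left critical value (chi^2_{1 - alpha/2}) and chi2_upper
    the right one (chi^2_{alpha/2}), both with n - 1 degrees of freedom.
    """
    numerator = (n - 1) * s2
    # Dividing by the larger critical value gives the lower limit, and vice versa.
    return numerator / chi2_upper, numerator / chi2_lower


def std_dev_interval(s2: float, n: int, chi2_lower: float, chi2_upper: float):
    """Return (lower, upper) limits for sigma: square roots of the variance limits."""
    lower, upper = variance_interval(s2, n, chi2_lower, chi2_upper)
    return sqrt(lower), sqrt(upper)


if __name__ == "__main__":
    # Assumed chi-square table values for 7 degrees of freedom at 90% confidence.
    lo, hi = std_dev_interval(151.87187, 8, chi2_lower=2.167, chi2_upper=14.067)
    print(f"90% interval for sigma: ({lo:.3f}, {hi:.3f})")
```

Because the table values are rounded to three decimals, the result agrees with the StatCrunch limits only to about two decimal places.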