Chapter 8: Sampling Distributions and Estimation
Chapter Objectives
When you finish this chapter you should be able to:
define sampling variation, sampling error, parameter and estimator.
explain why it is desirable that an estimator be unbiased, consistent and efficient.
state the Central Limit Theorem for a mean or proportion.
explain how the standard error is affected by sample size.
construct a 90, 95 or 99 percent confidence interval for a mean or proportion.
describe similarities and differences between z and Student’s t.
find t-values in tables or Excel for a desired confidence level.
calculate sample size for a given precision and confidence level to estimate μ or π.
construct a confidence interval for a difference of two means or proportions.*
construct a confidence interval for a variance.*
*These topics are optional, but may be helpful in later chapters.
Quiz Yourself
True/False Questions
T F 1. The standard error of the mean is the standard deviation of the sampling distribution of the sample mean.
T F 2. A sample of size n is selected at random from an infinite population. As n increases, the standard error of the sample mean increases.
T F 3. The fill of two-liter bottles of a soft drink has a mean of 2 liters and a standard deviation of 0.1 liters. For samples of 36 bottles, the mean of the sampling distribution of $\bar{x}$ is 2 liters.
T F 4. The fill of two-liter bottles of a soft drink has a mean of 2 liters and a standard deviation of 0.1 liters. For samples of 36 bottles, the standard deviation of the sampling distribution of $\bar{x}$ is 0.00278 liters.
T F 5. Because of the Central Limit Theorem, a business person only needs to know the mean and standard deviation of the population before performing any inferential statistics.
T F 6. The normal distribution is a good approximation of the sampling distribution for the sample proportion when n = 50 and p = 0.05.
T F 7. An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows larger as the sample size grows larger.
T F 8. The sample variance is a point estimate of the population variance.
T F 9. The range of a confidence interval is a measure of the expected sampling error.
T F 10. In general, increasing the confidence level, 1-α, will narrow the interval, and decreasing it widens the interval.
Multiple Choice Questions
1. For a sample size of 1, the sampling distribution of the mean will be normally distributed
A. regardless of the shape of the population.
B. only if the shape of the population is positively skewed.
C. only if the population values are larger than 30.
D. only if the population is normally distributed.
2. The expected value of the sampling distribution of the sample mean equals the population mean
A. when the population is normally distributed
B. when the population is symmetric
C. when the population size N > 30
D. for all populations
3. The standard error of the sample mean is equal to 5 when n=25. If the sample size increases by a factor of four, how will the standard error change?
A. It will double.
B. It will be cut in half.
C. It will be cut to ¼ of 5.
D. It will quadruple.
4. Which of the following statements is consistent with the Central Limit Theorem?
A. When μ and σ are known, the population will be approximately normally distributed.
B. If a population has μ and σ, a sample from that population will be normally distributed if the sample size is large enough.
C. When we know σ, the variation in the sample means will be equal to that of the population.
D. Means of samples of n=30 from an exponential distribution will be approximately normally distributed.
5. The use of the Student’s t distribution requires which of the following assumptions?
A. the sample size is greater than 30
B. the population variance is known
C. the population is normal
D. the sample is drawn from a positively skewed distribution
6. A random variable follows the Student’s t distribution. The probability that it will be positive is
A. 1.
B. 0.50.
C. less than 0.50.
D. 0.
7. One characteristic of any Student’s t distribution is
A. it is right skewed.
B. as n increases, the t-distribution approaches a uniform distribution.
C. it is described by its degrees of freedom.
D. it has a mean of 0 and a standard deviation of 1.
8. As a general rule, the normal distribution is used to approximate the sampling distribution of the sample proportion only if
A. the sample size n is greater than 30.
B. the population proportion p is close to 0.50.
C. the underlying population is normal.
D. np and n(1-p) are both greater than or equal to 10.
9. The distribution of the sample proportion has a standard error of 0.03 for a sample of size n=100. The population proportion must be
A. 0.1 or 0.9
B. 0.2 or 0.8
C. 0.4 or 0.6
D. 0.5
10. The standard error of the proportion will become larger as
A. p approaches 0
B. p approaches 0.50
C. p approaches 1.00
D. n increases
11. What does it mean to have 95% confidence in an interval estimate?
A. In repeated sampling, the population parameter would fall in the given interval 95% of the time.
B. In repeated sampling, 95% of the intervals would contain the population parameter.
C. In repeated sampling, 95% of the population observations fall within the given interval.
D. In repeated sampling, 95% of the point estimates fall within the given interval.
12. Ceteris paribus, which is narrower, a 95% confidence interval with n=100 or a 99% confidence interval with n=30?
A. the 95% confidence interval
B. the 99% confidence interval
C. They are the same width.
D. Need the margin of error to tell.
A study of 200 insomniacs paid for by the Serta Mattress Company found that the average insomniac
counted 350 sheep before falling asleep, with a standard deviation of 120. An insomniac is a person who
has difficulty falling asleep. Use this information to answer the following FIVE questions. Some useful
numbers might be:
Excel formula        Value
=NORMSINV(0.89)      1.2265
=NORMSINV(0.945)     1.5982
=TINV(0.89,199)      0.1385
=TINV(0.11,199)      1.6053
=TINV(0.055,199)     1.9302
13. Calculate an 89% confidence interval for the true mean number of sheep counted by insomniacs.
A. 350±10.41
B. 350±13.56
C. 350±13.62
D. 350±16.38
14. Suppose the confidence interval in the previous question was [333, 367]. Which is the correct interpretation of this interval?
A. Eighty-nine percent of the insomniacs in the study counted between 333 and 367 sheep before falling asleep.
B. In the population of insomniacs, eighty-nine percent of them will count between 333 and 367 sheep before falling asleep.
C. We are eighty-nine percent confident that the mean number of sheep counted by the studied insomniacs falls into the interval [333, 367].
D. We are eighty-nine percent confident that the interval [333, 367] includes the true mean number of sheep counted by insomniacs.
15. If the upper limit of the confidence interval calculated in question 13 was 370, what is the margin of error for this confidence interval?
A. 20
B. 10
C. 8.5
D. 4.25
16. If the Serta Company wished to estimate the mean number of sheep counted by insomniacs to within 10 sheep, at a confidence level of 89%, how many insomniacs would they need to survey?
A. 217
B. 310
C. 368
D. 458
17. Out of the 200 insomniacs, 98 reported regularly watching The Late Show with David Letterman before they began to count sheep. Calculate the margin of error for a 78% confidence interval of the true proportion of insomniacs who regularly watch David Letterman before counting sheep.
A. 0.043
B. 0.056
C. 0.136
D. 0.164
18. Construct a 95% confidence interval estimate for the difference between the means of two normally distributed populations, where the unknown population variances are assumed not to be equal. Summary statistics computed from two independent samples are as follows: $n_1 = 50$, $\bar{x}_1 = 175$, $s_1 = 18.5$, $n_2 = 42$, $\bar{x}_2 = 158$, and $s_2 = 32.4$. The upper confidence limit is:
A. 19.123
B. 28.212
C. 24.911
D. 5.788
19. A random sample of 25 observations is selected from a normally distributed population. The sample variance is 10. In the 95% confidence interval for the population variance, the upper limit will be:
A. 17.110
B. 6.097
C. 17.331
D. 19.353
20. You have constructed a confidence interval for the population mean, but you think it is not sufficiently precise. To correct this, you may
A. increase the population standard deviation
B. increase the sample size
C. increase the level of confidence
D. increase the sample mean
Solved Problems From Text
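8.2
a. $200 \pm 1.960\,\dfrac{12}{\sqrt{36}}$: 196.08, 203.92
b. $1000 \pm 1.960\,\dfrac{15}{\sqrt{9}}$: 990.20, 1009.80
c. $50 \pm 1.960\,\dfrac{1}{\sqrt{25}}$: 49.608, 50.392
8.6
Exam 1: $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 75 \pm 1.96\,\dfrac{7}{\sqrt{10}}$: 70.6614, 79.3386
Exam 2: $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 79 \pm 1.96\,\dfrac{7}{\sqrt{10}}$: 74.6614, 83.3386
Exam 3: $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 65 \pm 1.96\,\dfrac{7}{\sqrt{10}}$: 60.6614, 69.3386
The intervals for Exam 1 and Exam 2 overlap, which is consistent with the two exams having the same
population mean. The interval for Exam 3 lies entirely below the other two, which suggests a lower
population mean on Exam 3.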
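The intervals in 8.2 and 8.6 all have the form center ± z·σ/√n. A minimal Python sketch of that calculation follows, assuming σ is known; the function name `z_interval` and its parameters are illustrative and not from the text. It looks up z with the standard library instead of hard-coding 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(center, sigma, n, confidence=0.95):
    """Two-sided interval center +/- z * sigma / sqrt(n), with sigma known."""
    # z leaves (1 - confidence) / 2 in each tail of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return center - margin, center + margin


# Problem 8.2(a): center 200, sigma = 12, n = 36
low, high = z_interval(200, 12, 36)
print(round(low, 2), round(high, 2))  # 196.08 203.92
```

Calling `z_interval(75, 7, 10)` reproduces the Exam 1 interval in 8.6 to the precision shown there.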
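8.8
a. $\bar{x} \pm t\,\dfrac{s}{\sqrt{n}}$ = 24 ± (=TINV(0.10,6)*(3/SQRT(7))) = $24 \pm 1.9432\,\dfrac{3}{\sqrt{7}}$ = (21.7966, 26.2034)
b. $\bar{x} \pm t\,\dfrac{s}{\sqrt{n}}$ = 42 ± (=TINV(0.01,17)*(6/SQRT(18))) = $42 \pm 2.8982\,\dfrac{6}{\sqrt{18}}$ = (37.9013, 46.0987)
c. $\bar{x} \pm t\,\dfrac{s}{\sqrt{n}}$ = 119 ± (=TINV(0.05,27)*(14/SQRT(28))) = $119 \pm 2.0518\,\dfrac{14}{\sqrt{28}}$ = (113.5714, 124.4286)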
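The same arithmetic can be checked outside Excel. The sketch below assumes the t critical value is supplied by the reader (for example from TINV or a t table with n - 1 degrees of freedom), because the Python standard library has no Student's t quantile function. The name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Interval xbar +/- t * s / sqrt(n), with sigma unknown.

    t_crit must be looked up separately for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Problem 8.8(a): x-bar = 24, s = 3, n = 7, t = TINV(0.10, 6) = 1.9432
low, high = t_interval(24, 3, 7, 1.9432)
print(round(low, 4), round(high, 4))  # 21.7966 26.2034
```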
8.10
a. Excel =TINV(.05, 40) = 2.0211
b. Excel =TINV(.05, 80) = 1.9901
c. Excel =TINV(.05, 100) = 1.984
As n increases, t approaches z = 1.96.
8.16
a. $\sqrt{\dfrac{0.5(1-0.5)}{30}}$ = SQRT(0.25/30) = 0.0913; 30*0.5 = 15 > 10, normality OK
b. $\sqrt{\dfrac{0.2(1-0.2)}{50}}$ = SQRT(0.16/50) = 0.0566; 50*0.2 = 10; 50*0.80 = 40, normality OK
c. $\sqrt{\dfrac{0.1(1-0.1)}{100}}$ = SQRT(0.09/100) = 0.0300; 100*0.1 = 10; 100*0.9 = 90, normality OK
d. $\sqrt{\dfrac{0.005(1-0.005)}{500}}$ = SQRT(0.004975/500) = 0.0032; 500*0.005 = 2.5, normality fails.
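As a rough check on 8.16, the sketch below computes the standard error of a proportion and applies the rule that both nπ and n(1-π) should be at least 10. The function names and the small floating-point tolerance are assumptions made for this illustration.

```python
from math import sqrt


def proportion_se(p, n):
    """Standard error of the sample proportion, sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)


def normality_ok(p, n, threshold=10):
    """True when n*p and n*(1-p) both reach the threshold."""
    eps = 1e-9  # guards against round-off when a product equals 10 exactly
    return n * p >= threshold - eps and n * (1 - p) >= threshold - eps


cases = [(0.5, 30), (0.2, 50), (0.1, 100), (0.005, 500)]
results = [(round(proportion_se(p, n), 4), normality_ok(p, n)) for p, n in cases]
for (p, n), (se, ok) in zip(cases, results):
    print(p, n, se, "normality OK" if ok else "normality fails")
```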
8.24
a. n = ((1.96*7500)/2000)^2 = 54.0225, round up to 55
b. n = ((1.96*7500)/1000)^2 = 216.09, round up to 217
c. n = ((1.96*7500)/500)^2 = 864.36, round up to 865
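The sample size formula used in 8.24, n = (zσ/E)^2 rounded up to the next whole number, can be sketched as follows. The helper name `sample_size_mean` is illustrative; z = 1.96 and σ = 7500 are the values used in the solution.

```python
from math import ceil


def sample_size_mean(z, sigma, error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the error."""
    return ceil((z * sigma / error) ** 2)


# Problem 8.24: halving the allowable error roughly quadruples n
sizes = [sample_size_mean(1.96, 7500, e) for e in (2000, 1000, 500)]
print(sizes)  # [55, 217, 865]
```

Always rounding up guarantees that the achieved margin of error is no larger than the target.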
8.30
Z score: =NORMSINV(0.01) = -2.326 (or =NORMSINV(0.99) = 2.326)
(z/E)^2 = (2.326/0.025)^2 = 8659.03109 (computed with the unrounded z)
(z/E)^2 * π(1-π) = 0.25*8659.031 = 2164.757772, which rounds up to 2165
Sampling methods will vary depending on your opinion.
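The sample size calculation in 8.30 can be reproduced with the sketch below. It assumes the conservative choice π = 0.5 used in the solution and mirrors =NORMSINV(0.99) with the standard library; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(z, error, pi=0.5):
    """n = (z / E)^2 * pi * (1 - pi), rounded up to a whole observation."""
    return ceil((z / error) ** 2 * pi * (1 - pi))


z = NormalDist().inv_cdf(0.99)  # same value as Excel's =NORMSINV(0.99)
n_needed = sample_size_proportion(z, 0.025)
print(round(z, 3), n_needed)  # 2.326 2165
```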
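8.36
Assuming equal variances, the interval is
$(\bar{x}_1 - \bar{x}_2) \pm t\,\sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\,\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}$
with ν = (10-1)+(12-1) = 20 degrees of freedom. The sample means and variances come from Excel's
Descriptive Statistics output for the two samples.
(795-894.1667)±TINV(0.05,20)*SQRT((((9*9094.444)+(11*8135.606))/20)*((1/10)+(1/12)))
= -99.1667 ± 2.0860*39.6312 = -99.1667 ± 82.67
(-181.84, -16.50) Undergrads pay less rent.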
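A short sketch of the pooled-variance interval in 8.36 is given below. The function name is illustrative, and the critical value 2.0860 is assumed to come from TINV(0.05,20). The margin of error is the critical value times the standard error (about 39.63), so the interval is roughly (-181.84, -16.50).

```python
from math import sqrt


def pooled_t_interval(x1, var1, n1, x2, var2, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal population variances.

    t_crit must be looked up for n1 + n2 - 2 degrees of freedom.
    Returns (lower, upper, standard_error).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    margin = t_crit * se
    diff = x1 - x2
    return diff - margin, diff + margin, se


# Sample U: mean 795, variance 9094.444, n = 10
# Sample G: mean 894.1667, variance 8135.606, n = 12
low, high, se = pooled_t_interval(795, 9094.444, 10, 894.1667, 8135.606, 12, 2.0860)
print(round(se, 4), round(low, 2), round(high, 2))
```

Because the whole interval lies below zero, the conclusion that undergraduates pay less rent is unchanged.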
Excel descriptive statistics for the two samples in 8.36:
Statistic            U          G
Mean                 795        894.1667
Standard Error       30.157     26.0378
Median               795        895
Mode                 #N/A       930
Standard Deviation   95.3648    90.19759
Sample Variance      9094.444   8135.606
Kurtosis             1.558811   4.057804
Skewness             0.849389   1.535318
Range                330        350
Minimum              670        780
Maximum              1000       1130
Sum                  7950       10730
Count                10         12
8.40
$\dfrac{(n-1)s^2}{\chi^2_U} \le \sigma^2 \le \dfrac{(n-1)s^2}{\chi^2_L}$
a. Confidence interval for σ²: [=((14*100)/CHIINV(0.025,14)), =((14*100)/CHIINV(0.975,14))] = (53.601, 248.724)
b. Confidence interval for σ²: [=((17*144)/CHIINV(0.025,17)), =((17*144)/CHIINV(0.975,17))] = (81.084, 323.630)
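For readers without Excel, the variance intervals in 8.40 can be approximated with the standard library alone. The sketch below is one possible approach, not the method used in the text. It evaluates the chi-square CDF with the power series for the regularized lower incomplete gamma function and inverts it by bisection. All function names are illustrative, and n = 15 with s² = 100 is inferred from the 14*100 in part (a).

```python
from math import exp, lgamma, log


def chi2_cdf(x, df):
    """Chi-square CDF: e^-h * h^a * sum_k h^k / Gamma(a+k+1), a=df/2, h=x/2."""
    if x <= 0:
        return 0.0
    a, h = df / 2, x / 2
    total, k = 0.0, 0
    while True:
        term = exp(k * log(h) - lgamma(a + k + 1))
        total += term
        if k > h and term < 1e-17 * total:  # terms shrink once k exceeds h
            break
        k += 1
    return exp(a * log(h) - h) * total


def chi2_quantile(p, df):
    """Value x with chi2_cdf(x, df) = p, found by bisection."""
    lo, hi = 0.0, df + 10 * (2 * df) ** 0.5 + 10
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def variance_interval(s2, n, confidence=0.95):
    """Interval (n-1)s^2/chi2_U <= sigma^2 <= (n-1)s^2/chi2_L."""
    alpha = 1 - confidence
    df = n - 1
    chi2_upper = chi2_quantile(1 - alpha / 2, df)  # Excel: CHIINV(alpha/2, df)
    chi2_lower = chi2_quantile(alpha / 2, df)      # Excel: CHIINV(1-alpha/2, df)
    return df * s2 / chi2_upper, df * s2 / chi2_lower


low, high = variance_interval(100, 15)  # part (a): 14 d.f., s^2 = 100
print(round(low, 3), round(high, 3))
```

The printed limits should agree closely with the (53.601, 248.724) obtained from CHIINV, and `variance_interval(144, 18)` should likewise come out close to the interval in part (b).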
Quiz Yourself Answers
True/False
1 T    6 F
2 F    7 F
3 T    8 T
4 F    9 T
5 F   10 F
Multiple Choice
1 D    6 B   11 B   16 C
2 D    7 C   12 A   17 A
3 B    8 D   13 C   18 B
4 D    9 A   14 D   19 D
5 C   10 B   15 A   20 B