```Doing Statistics for Business
Data, Inference, and Decision Making
Marilyn K. Pelosi
Theresa M. Sandifer
Chapter 7
Sampling Distributions & Confidence Intervals
1
Chapter 7 Objectives
• Motivation for Point Estimators
• Common Point Estimators
• Desirable Properties of Point Estimators
• Distribution of the Sample Mean: Large Sample or Known σ
2
Chapter 7 Objectives (cont’d)
• The Central Limit Theorem: A More Detailed Look
• Drawing Inferences by Using the Central Limit Theorem
• Large Sample Confidence Intervals for the Mean
3
Chapter 7 Objectives (cont’d)
• Distribution of the Sample Mean: Small Sample and Unknown σ
• Small Sample Confidence Intervals for the Mean
• Confidence Intervals for Qualitative Data
• Sample Size Calculations
4
Figure 7.1 Relationship between probability and inferential statistics
[Diagram labels: Population, Sample, Probability, Inferential Statistics]
5
A Point Estimate is a single number
calculated from sample data. It is used to
estimate a parameter of the population.
A Point Estimator is the formula or rule
that is used to calculate the point estimate
for a particular set of data.
6
TRY IT NOW!
Sales of CDs
Comparing Point Estimators Calculated
From Samples Selected From Different
Populations
Use a software package such as Excel or Minitab to simulate picking a
sample of size n=10 from two different populations or use the
samples given on the following slide.
7
TRY IT NOW!
Sales of CDs
Comparing Point Estimators Calculated
From Samples Selected From Different
Populations (cont’d)
Store 1: 95, 99, 95, 99, 101, 102, 102, 102, 92, 99
Store 2: 97, 97, 103, 103, 101, 98, 98, 106, 100, 96
8
TRY IT NOW!
Sales of CDs
Comparing Point Estimators Calculated
From Samples Selected From Different
Populations (cont’d)
Suppose the first variable is daily sales of a new CD at Store 1. The
second variable is daily sales of the new CD at Store 2.
Select a sample of size n=10 days from both stores.
Assume that daily sales at both stores are normally distributed
with a mean of 100 and a standard deviation of 3.
Find X̄₁, X̄₂, and X̄₁ − X̄₂.
9
TRY IT NOW!
Sales of CDs
Comparing Point Estimators Calculated
From Samples Selected From Different
Populations (cont’d)
What do you notice about the difference in sample means even though the
population means are the same?
Find s₁² (Store 1 standard deviation squared), s₂², and s₁² / s₂².
What do you notice about the ratio of the two sample variances?
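Worked example (a sketch using the Store 1 and Store 2 samples listed on the
earlier slide; your own simulated samples will give different numbers):
Store 1: sum = 986, sample mean = 986 / 10 = 98.6
Store 2: sum = 999, sample mean = 999 / 10 = 99.9
Difference in sample means = 98.6 − 99.9 = −1.3
Store 1: sum of squared deviations = 110.40, sample variance = 110.40 / 9 ≈ 12.27
Store 2: sum of squared deviations = 96.90, sample variance = 96.90 / 9 ≈ 10.77
Ratio of sample variances ≈ 12.27 / 10.77 ≈ 1.14
Both populations have mean 100 and variance 9, yet the sample means differ
and the variance ratio is not exactly 1. This is sampling variability.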
10
An Unbiased Estimator yields an estimate
that is fair. It neither systematically overestimates the parameter nor systematically
underestimates the parameter.
11
TRY IT NOW!
The Diaper Company
Comparing the Variability of Two
Point Estimators
Weights in grams for the next 5 hourly samples taken at the diaper
company are shown below:
Hour 6:   54.89  55.06  55.45  55.23  55.75
Hour 7:   54.32  55.72  54.91  54.40  55.78
Hour 8:   54.14  55.18  55.78  55.37  55.69
Hour 9:   54.11  54.05  53.60  55.97  55.86
Hour 10:  55.21  55.40  53.87  55.09  55.70
12
TRY IT NOW!
The Diaper Company
Comparing the Variability of Two
Point Estimators (cont’d)
For each sample, calculate the sample mean and the sample median.
Find the average of the sample means and the average of the sample
medians.
Find the standard deviation of the sample means and the standard
deviation of the sample medians.
Which point estimator has less variability?
13
The probability distribution of a point
estimator or a sample statistic is called a
Sampling Distribution.
The Standard Error is the standard deviation
of the sampling distribution of a
point estimator. It measures how much
the point estimator or sample statistic varies
from sample to sample.
14
Central Limit Theorem (CLT)
In random sampling from a population with
mean μ and standard deviation σ, when n is
large enough, the distribution of X̄ is
• approximately normal with
• a mean equal to μ and
• a standard deviation equal to σ / √n
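Worked example (an illustration, assuming the CD sales setting from the earlier
Try It Now!, where the population standard deviation is 3 and n = 10):
Standard error of the sample mean = 3 / √10 ≈ 3 / 3.162 ≈ 0.95
So sample means of 10 days of sales vary around 100 with a standard deviation
of about 0.95, compared with 3 for individual days.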
15
Figure 7.2 The Diaper Company
Histogram of Individual Diaper Weights
[Histogram: x-axis Weight (grams), 53.5 to 56.5; y-axis Frequency, 0 to 50]
16
Figure 7.3 The Diaper Company
Histogram of 52 Sample Means
[Histogram of Average Diaper Weight: x-axis Average Weight (grams), 53 to 56.5; y-axis Frequency, 0 to 20]
17
Figure 7.4 Graphs of two populations
with the same mean but different
standard deviations
18
Figure 7.5 Dotplots of Sample Means
[Two dotplots on a common axis from 23.50 to 26.00: one labeled St Dev = 1, one labeled St Dev = 5]
19
TRY IT NOW!
The Central Limit Theorem
Exploring the Third Point
Use a software package such as Excel or Minitab to simulate
picking 10 samples each of size n = 35 from two different populations:
Population 1: Monthly sales of a leading online bookstore, normally
distributed with a mean of μ = 25 ($1,000) and a standard deviation of
σ = 1 ($1,000)
Population 2: Monthly sales of a leading online bookstore, normally
distributed with a mean of μ = 25 ($1,000) and a standard deviation of
σ = 3 ($1,000)
20
TRY IT NOW!
The Central Limit Theorem
Exploring the Third Point (cont’d)
For each sample, calculate a sample mean, X̄.
Find the average and standard deviation of the 10 X̄’s from the population
1 samples.
Find the average and standard deviation of the 10 X̄’s from the population
2 samples.
21
TRY IT NOW!
The Central Limit Theorem
Impact of Sample Size on Standard Error
Using the random number table in Appendix A and the 350
values shown in the previous Try It Now! exercise in your textbook,
select 10 samples of size 5 from population 1. Review Section 2.4
if you need a refresher on how to use the random number table.
For each sample, calculate a sample mean, X̄.
Find the average and standard deviation of the 10 X̄’s. Compare these
values to the corresponding values that you found in the previous Try It
Now! for population 1. In that case the sample size was n = 35.
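Worked example (theoretical values to compare your simulation against, assuming
the population standard deviations stated in the previous exercise):
Population 1, n = 35: standard error = 1 / √35 ≈ 0.169
Population 2, n = 35: standard error = 3 / √35 ≈ 0.507
Population 1, n = 5: standard error = 1 / √5 ≈ 0.447
With only 10 samples, the standard deviations you compute will be close to,
but not exactly equal to, these values.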
22
Figure 7.6A Sampling Distribution of X̄
[Normal curve marked with the 68%, 95%, and 99.7% regions]
23
Discovery Exercise 7.1
The Central Limit Theorem in Action
Part I. Draw a picture of a normal distribution
with mean of 80 and standard deviation of 5.
This is the population we will sample from.
24
Discovery Exercise 7.1
The Central Limit Theorem in Action
(cont’d)
Part II. Generate and examine 100 random samples.
For this exercise you will need to generate 100 samples, each consisting of
30 values selected from a normal distribution with a mean of 80 and a
standard deviation of 5.
Part III. Create a distribution of X̄ for samples of
size n = 30.
25
Figure 7.6B Sampling Distribution for X̄ when μ = 55.00
Figure 7.6C Sampling Distribution for X̄ when μ = 54.50
[Each panel: normal curve marked with the 68%, 95%, and 99.7% regions]
26
TRY IT NOW!
Cost of Books
Comparing The Sample Mean to the
Claimed Population Mean
A university states that the average student spends $225 per semester on
books. Based on your own experience you feel that this is an
underestimate of the true expenditure. You ask 30 of your friends how
much they spent on textbooks this past semester and you obtain the
following data:
27
TRY IT NOW!
Cost of Books
Comparing The Sample Mean to the
Claimed Population Mean (cont’d)
214  233  234  236  239  241  241  244  245  247
248  248  248  249  250  253  254  254  258  260
262  262  263  265  269  274  276  277  277  281
Based on these data, do you have reason to tell the university that its
statement is inaccurate?
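Worked example (one possible approach, a sketch that assumes the 30 friends
behave like a random sample and uses the sample standard deviation in place of
the unknown population value):
Sum of the 30 values = 7602, so the sample mean = 7602 / 30 = 253.4
Sample standard deviation ≈ 15.5
Standard error ≈ 15.5 / √30 ≈ 2.83
Distance of the sample mean from the claim = (253.4 − 225) / 2.83 ≈ 10 standard errors
If the true mean were $225, a sample mean this far above it would be extremely unlikely.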
28
A Confidence Interval or an Interval Estimate
is a range of values with an associated
probability or Confidence Level, 1 – α. The
probability quantifies the chance that you
have an interval that contains the
true population parameter.
29
Figure 7.7. Normal
Distribution with 0.05 in
the tails.
30
TRY IT NOW!
The Bottle-Filling Problem
A sample of 36 bottles had a sample mean of x̄ = 32.10 oz.
The population standard deviation, σ, was assumed to be 0.1 oz.
Find a 95% confidence interval for μ. How wide is the interval?
Now find a 98% confidence interval for μ.
Which interval is wider?
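Worked example (a sketch, assuming the standard normal critical values
z ≈ 1.96 for 95% and z ≈ 2.326 for 98%):
Standard error = 0.1 / √36 = 0.1 / 6 ≈ 0.01667
95%: 32.10 ± 1.96 × 0.01667 = 32.10 ± 0.0327, or about (32.067, 32.133); width ≈ 0.065 oz
98%: 32.10 ± 2.326 × 0.01667 = 32.10 ± 0.0388, or about (32.061, 32.139); width ≈ 0.078 oz
The 98% interval is wider.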
31
Figure 7.8 Comparison of Confidence Intervals
and µ
32
Discovery Exercise 7.2
Exploring Confidence Intervals for μ
From a population of college students across the
United States, a sample was selected to find out how
many hours per week a typical student spends playing sports.
Part I. A random sample of 2500 students was
selected. The sample mean, x̄, was found to be
12.5 hours. The population standard deviation, σ,
is known to be 1.05 hours. Given this information,
find:
33
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
34
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
(e) a 98% confidence interval for μ.
(f) Discuss what happens to the size of the interval as the
level of the confidence increases.
35
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
Part II. A random sample of 2500 students was
selected. The sample mean, x̄, was found to be
10.5 hours. The population standard deviation, σ,
is known to be 1.05 hours. Given this information,
find:
(a) a 90% confidence interval for μ.
36
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
37
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
Compare the intervals found in Part I with those
found in Part II. Discuss what happened to the confidence
interval due to the change in the value of the sample mean, x̄.
Part III. A random sample of 2500 students was
selected. The sample mean, x̄, was found to be
12.5 hours. Suppose that the population
standard deviation, σ, is actually 2.05 hours.
Given this information, find:
38
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
39
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
Compare the intervals found in Part I with those
found in Part III. Discuss what happened to the
confidence intervals due to the change in the value of the
population standard deviation, σ.
Part IV. A random sample of 2000 students was
selected. The sample mean, x̄, was found to be
12.5 hours. The population standard deviation, σ,
is known to be 1.05 hours. Given this information,
find:
40
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
(a) a 90% confidence interval for μ.
(b) a 92% confidence interval for μ.
(c) a 94% confidence interval for μ.
(d) a 96% confidence interval for μ.
(e) a 98% confidence interval for μ.
41
Discovery Exercise 7.2 (cont’d)
Exploring Confidence Intervals for μ
Compare the intervals found in Part I with those
found in Part IV. Discuss what happened to the confidence
interval due to the change in the value of the sample size, n.
42
Figure 7.9 t-distribution with 5 Degrees
of Freedom
43
Figure 7.10
t-distribution with 25
Degrees of Freedom
Figure 7.11
t-distribution with 1 and
50 Degrees of Freedom
44
Degrees of    Upper Tail Areas
Freedom       0.25     0.1      0.05     0.025    0.01     0.005
20            0.6870   1.3253   1.7247   2.0860   2.5280   2.8453
21            0.6864   1.3232   1.7207   2.0796   2.5176   2.8314
22            0.6858   1.3212   1.7171   2.0739   2.5083   2.8188
23            0.6853   1.3195   1.7139   2.0687   2.4999   2.8073
24            0.6848   1.3178   1.7109   2.0639   2.4922   2.7970
25            0.6844   1.3163   1.7081   2.0595   2.4851   2.7874
26            0.6840   1.3150   1.7056   2.0555   2.4786   2.7787
Figure 7.12 A portion of the t table
45
TRY IT NOW!
Retirement Years
Confidence Interval for π
A survey shows that a growing number of Americans are willing
to make sacrifices to become home owners despite increasing job and
financial worries. The Federal National Mortgage Association surveyed
1857 Americans and found that 67% would put off retirement for 10 years
to own a home.
Find a 90% confidence interval for the proportion of all Americans who
would put off retirement for 10 years to own a home.
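Worked example (a sketch, assuming the usual large-sample interval for a
proportion and z ≈ 1.645 for 90% confidence):
Sample proportion = 0.67, n = 1857
Standard error = √(0.67 × 0.33 / 1857) ≈ √0.0001191 ≈ 0.0109
Margin of error = 1.645 × 0.0109 ≈ 0.018
90% confidence interval ≈ 0.67 ± 0.018, or about (0.652, 0.688)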
46
TRY IT NOW!
Bottle Filling
Finding the Sample Size
How many bottles does the bottle manufacturer need to sample to
be 98% confident that the error is at most 0.002 oz? Remember that
the population standard deviation is 0.1 oz.
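Worked example (a sketch, assuming the formula n = (z × σ / E)² and
z ≈ 2.326 for 98% confidence):
n = (2.326 × 0.1 / 0.002)² = (116.3)² ≈ 13,525.7
Round up to n = 13,526 bottles. Using a table value of z = 2.33 instead
gives (116.5)² ≈ 13,572.3, or 13,573 bottles.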
47
TRY IT NOW!
Retirement Years
Sample Size Calculation for π
How many Americans must be sampled to determine the percentage who
would put off retirement for 10 years to own a home? The estimate
should not differ from the actual population proportion by more than 3%
with a confidence of 90%.
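Worked example (a sketch, assuming n = z² × p × (1 − p) / E², z ≈ 1.645 for
90% confidence, and E = 0.03):
Using the earlier survey value p = 0.67 as a planning estimate:
n = 2.706 × 0.67 × 0.33 / 0.0009 ≈ 664.8, so sample 665 people
With no prior estimate, the conservative choice p = 0.5 gives:
n = 2.706 × 0.25 / 0.0009 ≈ 751.7, so sample 752 people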
48
Instructions for a small-sample confidence
interval for the mean (all others are done similarly)
Confidence Intervals>One Sample>Population Mean using t
The dialog box opens.
49
50
Finding Confidence Intervals Using Excel
(cont’d)
1. First, indicate the level of confidence as a percent.
2. Select User Input if you already have the summary statistics or Input Range if you have raw data.
3. Indicate how the sampling was done.
4. Indicate where you want the output to appear.
5. Click OK.
51
Output for Confidence Interval for Small Sample
Using t
52
Chapter 7 Summary
In this chapter you have learned:
• The basics of estimating population parameters, in
particular how to estimate the average of a
numeric characteristic of a population, μ, and the
proportion of a population that has a certain
characteristic, π.
• The estimates are calculated from a sample
selected from the population.
53
Chapter 7 Summary
• Each sample yields a slightly different estimate of
the population parameter.
• Thus, estimators are themselves random variables.
• When the random variable is an estimator, the
distribution is called a sampling distribution.
• The sampling distribution has a mean and a
standard deviation, called the standard error.
54