Business Statistics, 6th ed.
by Ken Black
Chapter 8
Estimating Parameters
for Single Populations
Copyright 2010 John Wiley & Sons, Inc.
Learning Objectives
Know the difference between point and interval
estimation.
Estimate a population mean from a sample mean
when σ is known.
Estimate a population mean from a sample mean
when σ is unknown.
Learning Objectives
Estimate a population proportion from a sample
proportion.
Estimate the population variance from a sample
variance.
Estimate the minimum sample size necessary to
achieve given statistical goals.
Estimating the Population Mean
A point estimate is a statistic taken from a sample that
is used to estimate a population parameter
Interval estimate: a range of values within which
the analyst can declare, with some confidence, the
population parameter lies
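Confidence Interval to Estimate
μ when σ is Known
Point estimate: $\bar{x} = \frac{\sum x}{n}$
Interval estimate: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
or
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$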
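Distribution of Sample Means
for 95% Confidence
[Figure: normal curve of sample means centered at μ. The central 95% of the area is split into .4750 on each side of the mean, leaving .025 in each tail. The z axis is marked at -1.96, 0, and 1.96.]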
Estimating the Population Mean
For a 95% confidence interval
α = .05
α/2 = .025
To find the z value for α/2 (z.025), look in the standard normal
distribution table for the area
.5000 - .0250 = .4750
From Table A5 look up .4750, and read 1.96 as the
z value from the row and column
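The table lookup above can also be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `z_critical` is a hypothetical helper, not something from the slides.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Return the two-sided critical z value for a confidence level such as 0.95.

    Hypothetical helper for illustration only.
    """
    alpha = 1.0 - confidence
    # Each tail holds alpha/2, so the area to the left of z is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


print(round(z_critical(0.95), 2))  # 1.96
```

The same call with 0.90 gives roughly 1.645, the value used for 90% confidence in the demonstration problems.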
Estimating the Population Mean
α is used to locate the Z value in constructing the
confidence interval
The confidence interval yields a range within which
the researcher feels, with some confidence, the
population mean is located
Z score: the number of standard deviations a value
(x) is above or below the mean of a set of numbers
when the data are normally distributed
95% Confidence Intervals for μ
[Figure: sampling distribution of the mean with the central 95% marked around μ, and seven sample means, each drawn with its own confidence interval.]
95% Confidence Interval for μ
$\bar{x} = 510$, $\sigma = 46$, $n = 85$, $z_{\alpha/2} = 1.96$
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$510 - 1.96 \frac{46}{\sqrt{85}} \le \mu \le 510 + 1.96 \frac{46}{\sqrt{85}}$
$510 - 9.78 \le \mu \le 510 + 9.78$
$500.22 \le \mu \le 519.78$
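The arithmetic of this interval can be sketched in a few lines of Python. The function name `z_interval` and its argument names are assumptions made for illustration. The z value is passed in as read from the table.

```python
from math import sqrt


def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when the population std dev is known.

    Hypothetical helper: returns (lower, upper) = xbar -/+ z * sigma / sqrt(n).
    """
    margin = z * sigma / sqrt(n)  # margin of error
    return xbar - margin, xbar + margin


# Values from the slide: xbar = 510, sigma = 46, n = 85, z = 1.96
low, high = z_interval(510, 46, 85, 1.96)
print(f"{low:.2f} <= mu <= {high:.2f}")  # 500.22 <= mu <= 519.78
```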
Demonstration Problem 8.1
A survey was taken of U.S. companies that do
business with firms in India. One of the questions
on the survey was: Approximately how many years
has your company been trading with firms in India?
A random sample of 44 responses to this question
yielded a mean of 10.455 years. Suppose the
population standard deviation for this question
is 7.7 years. Using this information, construct a 90%
confidence interval for the mean number of years that
a company has been trading in India for the population
of U.S. companies trading with firms in India.
Demonstration Problem 8.1
$\bar{x} = 10.455$, $\sigma = 7.7$, $n = 44$
90% confidence ⇒ $z = 1.645$
$\bar{x} - z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z \frac{\sigma}{\sqrt{n}}$
$10.455 - 1.645 \frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645 \frac{7.7}{\sqrt{44}}$
$10.455 - 1.91 \le \mu \le 10.455 + 1.91$
$8.545 \le \mu \le 12.365$
Demonstration Problem 8.2
A study is conducted in a company that employs 800
engineers. A random sample of 50 engineers reveals
that the average sample age is 34.3 years.
Historically, the population standard deviation of the
age of the company’s engineers is approximately 8
years. Construct a 98% confidence interval to
estimate the average age of all the engineers in this
company.
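Demonstration Problem 8.2
$\bar{x} = 34.3$, $\sigma = 8$, $N = 800$, and $n = 50$
98% confidence ⇒ $z = 2.33$
$\bar{x} - z \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{x} + z \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
$34.3 - 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{800 - 50}{800 - 1}} \le \mu \le 34.3 + 2.33 \frac{8}{\sqrt{50}} \sqrt{\frac{800 - 50}{800 - 1}}$
$34.3 - 2.554 \le \mu \le 34.3 + 2.554$
$31.75 \le \mu \le 36.85$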
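Because the sample of 50 is drawn from a finite population of 800, the margin of error is multiplied by the finite population correction factor. A minimal sketch follows. The function name `z_interval_finite` is hypothetical.

```python
from math import sqrt


def z_interval_finite(xbar, sigma, n, N, z):
    """z interval for the mean with the finite population correction.

    Hypothetical helper: margin = z * (sigma / sqrt(n)) * sqrt((N - n) / (N - 1)).
    """
    fpc = sqrt((N - n) / (N - 1))  # finite population correction factor
    margin = z * sigma / sqrt(n) * fpc
    return xbar - margin, xbar + margin


# Values from Demonstration Problem 8.2
low, high = z_interval_finite(34.3, 8, 50, 800, 2.33)
print(f"{low:.2f} <= mu <= {high:.2f}")  # 31.75 <= mu <= 36.85
```

Without the correction factor the margin would be about 2.64 instead of 2.55, so the correction narrows the interval slightly.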
Estimating the Mean of a Normal
Population: Unknown σ
The population has a normal distribution.
The value of the population standard deviation is
unknown, so the sample standard deviation must be used in the
estimation process.
The z distribution is not appropriate under these conditions.
When the population standard deviation is unknown, the t
distribution is appropriate, and the sample
standard deviation is used in the t formula.
t Distribution
A family of distributions: a unique distribution for
each value of its parameter, degrees of freedom (d.f.)
Symmetric, unimodal, mean = 0, flatter than the z distribution
The t distribution is used instead of the z distribution for
inferential statistics on the population mean
when the population standard deviation is unknown and the
population is normally distributed
With the t distribution, you use the sample standard deviation
t Distribution
t formula: $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
t Distribution Characteristics
t distributions are flatter in the middle and have more
area in their tails than the normal distribution
The t distribution approaches the normal curve as n becomes
larger
The t distribution is to be used when the population variance
or population standard deviation is unknown, regardless of the size
of the sample
Reading the t Distribution
t table uses the area in the tail of the distribution
Emphasis in the t table is on α, and each tail of the
distribution contains α/2 of the area under the curve
when confidence intervals are constructed
t values are located at the intersection of the df
value and the selected α/2 value
Confidence Intervals for μ of a
Normal Population: Unknown σ
$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
or
$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
$df = n - 1$
Table of Critical Values of t
df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
...
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
...
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576
With df = 24 and α = 0.05,
t = 1.711.
Confidence Intervals for μ of a
Normal Population: Unknown σ
$\bar{x} \pm t \frac{s}{\sqrt{n}}$
or
$\bar{x} - t \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t \frac{s}{\sqrt{n}}$
$df = n - 1$
Demonstration Problem 8.3
The owner of a large equipment rental company wants to
make a rather quick estimate of the average number of days a
piece of ditch digging equipment is rented out per person per
time. The company has records of all rentals, but the amount
of time required to conduct an audit of all accounts would be
prohibitive. The owner decides to take a random sample of
rental invoices. Fourteen different rentals of ditch diggers are
selected randomly from the files, yielding the following data.
She uses these data to construct a 99% confidence interval to
estimate the average number of days that a ditch digger is
rented and assumes that the number of days per rental is
normally distributed in the population.
3 1 3 2 5 1 2 1 4 2 1 3 1 1
Solution for Demonstration Problem 8.3
$\bar{x} = 2.14$, $s = 1.29$, $n = 14$, $df = n - 1 = 13$
$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005$
$t_{.005,13} = 3.012$
$\bar{x} - t \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t \frac{s}{\sqrt{n}}$
$2.14 - 3.012 \frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012 \frac{1.29}{\sqrt{14}}$
$2.14 - 1.04 \le \mu \le 2.14 + 1.04$
$1.10 \le \mu \le 3.18$
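The same interval can be reproduced from the raw rental data. The sketch below uses only the Python standard library, so the critical value 3.012 is passed in as read from the t table rather than computed. The function name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """t-based confidence interval for the mean of a small normal sample.

    Hypothetical helper: t_crit is the table value for alpha/2 and n - 1 df.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Rental days for the 14 sampled ditch-digger invoices
days = [3, 1, 3, 2, 5, 1, 2, 1, 4, 2, 1, 3, 1, 1]
low, high = t_interval(days, 3.012)  # t(.005, 13) from the table
print(f"{low:.2f} <= mu <= {high:.2f}")  # 1.10 <= mu <= 3.18
```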
MINITAB Solution for
Demonstration Problem 8.3
[Figure: MINITAB output, not reproduced in this transcript.]
Comp Time: Excel Normal View
[Figure: Excel worksheet, not reproduced in this transcript.]
Confidence Interval to Estimate
the Population Proportion
An estimate of the population proportion often
must be made
$\hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
where:
$\hat{p}$ = sample proportion
$\hat{q}$ = $1 - \hat{p}$
p = population proportion
n = sample size
Demonstration Problem 8.5
A clothing company produces men’s jeans. The jeans
are made and sold with either a regular cut or a boot
cut. In an effort to estimate the proportion of their
men’s jeans market in Oklahoma City that prefers
boot-cut jeans, the analyst takes a random sample
of 212 jeans sales from the company’s two Oklahoma
City retail outlets. Only 34 of the sales were for
boot-cut jeans. Construct a 90% confidence interval
to estimate the proportion of the population in
Oklahoma City who prefer boot-cut jeans.
Solution for Demonstration Problem 8.5
$n = 212$, $x = 34$, $\hat{p} = \frac{x}{n} = \frac{34}{212} = 0.16$
$\hat{q} = 1 - \hat{p} = 1 - 0.16 = 0.84$
90% confidence ⇒ $z = 1.645$
$\hat{p} - z \sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z \sqrt{\frac{\hat{p}\hat{q}}{n}}$
$0.16 - 1.645 \sqrt{\frac{(0.16)(0.84)}{212}} \le p \le 0.16 + 1.645 \sqrt{\frac{(0.16)(0.84)}{212}}$
$0.16 - 0.04 \le p \le 0.16 + 0.04$
$0.12 \le p \le 0.20$
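A short sketch of the proportion interval, assuming a hypothetical helper named `proportion_interval`. Unlike the slide, it does not round the sample proportion to 0.16 before computing the margin, so intermediate values differ slightly while the rounded endpoints agree.

```python
from math import sqrt


def proportion_interval(x, n, z):
    """Confidence interval for a population proportion.

    Hypothetical helper: x successes out of n trials, z from the normal table.
    """
    p_hat = x / n
    q_hat = 1 - p_hat
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# 34 boot-cut sales out of 212, 90% confidence (z = 1.645)
low, high = proportion_interval(34, 212, 1.645)
print(f"{low:.2f} <= p <= {high:.2f}")  # 0.12 <= p <= 0.20
```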
Estimating the Population Variance
Population parameter: $\sigma^2$
Estimator of $\sigma^2$: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
$\chi^2$ formula for single variance: $\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$
degrees of freedom = n - 1
Confidence Interval for σ²
$\frac{(n - 1) s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$
$df = n - 1$
$\alpha = 1 - \text{level of confidence}$
Two Table Values of χ²
[Figure: chi-square distribution with df = 7, horizontal axis from 0 to 20. The area to the right of 2.16735 is .95 and the area to the right of 14.0671 is .05, leaving .05 in the lower tail.]
df     0.950          0.050
1      3.93219E-03    3.84146
2      0.102586       5.99148
3      0.351846       7.81472
4      0.710724       9.48773
5      1.145477       11.07048
6      1.63538        12.5916
7      2.16735        14.0671
8      2.73263        15.5073
9      3.32512        16.9190
10     3.94030        18.3070
...
20     10.8508        31.4104
21     11.5913        32.6706
22     12.3380        33.9245
23     13.0905        35.1725
24     13.8484        36.4150
25     14.6114        37.6525
90% Confidence Interval for σ²
$s^2 = .0022125$, $n = 8$, $df = n - 1 = 7$, $\alpha = .10$
$\chi^2_{\alpha/2} = \chi^2_{.10/2} = \chi^2_{.05} = 14.0671$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .10/2} = \chi^2_{.95} = 2.16735$
$\frac{(n - 1) s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$
$\frac{(8 - 1)(.0022125)}{14.0671} \le \sigma^2 \le \frac{(8 - 1)(.0022125)}{2.16735}$
$.001101 \le \sigma^2 \le .007146$
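The variance interval divides the same numerator by the two chi-square table values. A minimal sketch follows, with the hypothetical helper `variance_interval` taking both table values as arguments, since the standard library has no chi-square quantile function.

```python
def variance_interval(s2, n, chi2_upper_tail, chi2_lower_tail):
    """Confidence interval for a population variance.

    Hypothetical helper. chi2_upper_tail is the table value for alpha/2
    (the larger value) and chi2_lower_tail is the value for 1 - alpha/2.
    """
    numerator = (n - 1) * s2
    # The larger chi-square value gives the lower endpoint, and vice versa.
    return numerator / chi2_upper_tail, numerator / chi2_lower_tail


# Values from the slide: s^2 = .0022125, n = 8, df = 7, 90% confidence
low, high = variance_interval(0.0022125, 8, 14.0671, 2.16735)
print(f"{low:.6f} <= sigma^2 <= {high:.6f}")  # 0.001101 <= sigma^2 <= 0.007146
```

Note that the interval is not symmetric around the sample variance, because the chi-square distribution is skewed.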
Demonstration Problem 8.6
The U.S. Bureau of Labor Statistics publishes data on the hourly
compensation costs for production workers in manufacturing
for various countries. The latest figures published for Greece
show that the average hourly wage for a production worker in
manufacturing is $16.10. Suppose the business council of
Greece wants to know how consistent this figure is. They
randomly select 25 production workers in manufacturing from
across the country and determine that the standard deviation
of hourly wages for such workers is $1.12. Use this information
to develop a 95% confidence interval to estimate the
population variance for the hourly wages of production
workers in manufacturing in Greece. Assume that the hourly
wages for production workers across the country in
manufacturing are normally distributed.
Solution for Demonstration Problem 8.6
$s^2 = 1.2544$, $n = 25$, $df = n - 1 = 24$, $\alpha = .05$
$\chi^2_{\alpha/2} = \chi^2_{.05/2} = \chi^2_{.025} = 39.3641$
$\chi^2_{1 - \alpha/2} = \chi^2_{1 - .05/2} = \chi^2_{.975} = 12.4011$
$\frac{(n - 1) s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$
$\frac{(25 - 1)(1.2544)}{39.3641} \le \sigma^2 \le \frac{(25 - 1)(1.2544)}{12.4011}$
$0.7648 \le \sigma^2 \le 2.4277$
Determining Sample Size when Estimating μ
It may be necessary to estimate the sample size
when working on a project
In studies where µ is being estimated, the size of the
sample can be determined by using the z formula for
sample means to solve for n
The difference between $\bar{x}$ and µ is the error of estimation
Error of Estimation = ($\bar{x}$ - µ)
Determining Sample Size when Estimating μ
z formula: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Error of estimation (tolerable error): $E = \bar{x} - \mu$
Estimated sample size: $n = \frac{z_{\alpha/2}^2 \sigma^2}{E^2} = \left( \frac{z_{\alpha/2}\, \sigma}{E} \right)^2$
Estimated σ: $\sigma \approx \frac{1}{4}(\text{range})$
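The sample-size formula follows from the z formula by substituting E for the difference between the sample mean and µ and solving for n. The steps, written out as a short derivation in the same symbols:

```latex
\begin{align*}
z_{\alpha/2} &= \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{E}{\sigma / \sqrt{n}} \\
\sqrt{n} &= \frac{z_{\alpha/2}\, \sigma}{E} \\
n &= \frac{z_{\alpha/2}^{2}\, \sigma^{2}}{E^{2}}
\end{align*}
```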
Sample Size When Estimating μ: Example
$E = 1$, $\sigma = 4$
90% confidence ⇒ $z = 1.645$
$n = \frac{z_{\alpha/2}^2 \sigma^2}{E^2} = \frac{(1.645)^2 (4)^2}{1^2} = 43.30$, or 44
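The calculation, including the step of rounding up to the next whole observation, can be sketched as follows. The function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil


def sample_size_for_mean(z, sigma, error):
    """Sample size needed to estimate a mean to within `error`.

    Hypothetical helper: returns the raw value of (z * sigma / error)^2 and
    the same value rounded up, since a fraction of an observation cannot be sampled.
    """
    n_raw = (z * sigma / error) ** 2
    return n_raw, ceil(n_raw)


# Values from the example: z = 1.645 (90% confidence), sigma = 4, E = 1
n_raw, n = sample_size_for_mean(1.645, 4, 1)
print(f"{n_raw:.2f} -> {n}")  # 43.30 -> 44
```

Rounding up rather than to the nearest integer keeps the achieved error at or below the tolerable error E.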
Demonstration Problem 8.7
Suppose you want to estimate the average age of all
Boeing 737-300 airplanes now in active domestic U.S.
service. You want to be 95% confident, and you want
your estimate to be within one year of the actual
figure. The 737-300 was first placed in service about
24 years ago, but you believe that no active 737-300s
in the U.S. domestic fleet are more than 20 years old.
How large a sample should you take?
Solution for Demonstration Problem 8.7
$E = 1$, range $= 20$
95% confidence ⇒ $z = 1.96$
Estimated σ: $\frac{1}{4}(\text{range}) = \frac{1}{4}(20) = 5$
$n = \frac{z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (5)^2}{1^2} = 96.04$, or 97
Determining Sample Size when Estimating p
z formula: $z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$
Error of estimation (tolerable error): $E = \hat{p} - p$
Estimated sample size: $n = \frac{z^2 pq}{E^2}$
Demonstration Problem 8.8
Hewitt Associates conducted a national survey to
determine the extent to which employers are
promoting health and fitness among their employees.
One of the questions asked was, Does your company
offer on-site exercise classes? Suppose it was
estimated before the study that no more than 40% of
the companies would answer Yes. How large a sample
would Hewitt Associates have to take in estimating
the population proportion to ensure a 98% confidence
in the results and to be within .03 of the true
population proportion?
Solution for Demonstration Problem 8.8
$E = 0.03$
98% confidence ⇒ $z = 2.33$
Estimated $p = 0.40$
$q = 1 - p = 0.60$
$n = \frac{z^2 pq}{E^2} = \frac{(2.33)^2 (0.40)(0.60)}{(0.03)^2} = 1{,}447.7$, or 1,448
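A sketch of the same computation in Python, using a hypothetical helper named `sample_size_for_proportion`. The planning value p = 0.40 is the pre-study estimate given in the problem.

```python
from math import ceil


def sample_size_for_proportion(z, p, error):
    """Sample size needed to estimate a proportion to within `error`.

    Hypothetical helper: p is a planning estimate of the population proportion.
    Returns the raw value z^2 * p * q / error^2 and the value rounded up.
    """
    q = 1 - p
    n_raw = z ** 2 * p * q / error ** 2
    return n_raw, ceil(n_raw)


# Values from Demonstration Problem 8.8: z = 2.33, p = 0.40, E = 0.03
n_raw, n = sample_size_for_proportion(2.33, 0.40, 0.03)
print(f"{n_raw:.1f} -> {n}")  # 1447.7 -> 1448
```

The product pq is largest at p = 0.5, so when no planning estimate is available, using p = 0.5 gives the most conservative (largest) sample size.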