Statistics for Managers
Using Microsoft® Excel
5th Edition
Chapter 8
Confidence Interval Estimation
Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc.
Chap 8-1
Learning Objectives
In this chapter, you will learn:
• To construct and interpret confidence interval estimates for the mean and the proportion
• How to determine the sample size necessary to develop a confidence interval for the mean or proportion
• How to use confidence interval estimates in auditing
Chapter Outline
1. Confidence Intervals for the Population Mean, μ
   • when Population Standard Deviation σ is Known
   • when Population Standard Deviation σ is Unknown
2. Confidence Intervals for the Population Proportion, π
3. Determining the Required Sample Size
Point Estimates
• A point estimate is a single number. For the population mean (and population standard deviation), a point estimate is the sample mean (and sample standard deviation).
• A confidence interval provides additional information about variability
[Figure: a confidence interval drawn from the Lower Confidence Limit to the Upper Confidence Limit, with the Point Estimate between them; the distance between the two limits is the width of the confidence interval.]
Confidence Interval Estimates
• A confidence interval gives a range estimate of values:
   • Takes into consideration variation in sample statistics from sample to sample
   • Based on all the observations from 1 sample
   • Gives information about closeness to unknown population parameters
   • Stated in terms of level of confidence
      • Ex. 95% confidence, 99% confidence
      • Can never be 100% confident
Confidence Interval Estimates
• The general formula for all confidence intervals is:
Point Estimate ± (Critical Value) (Standard Error)
Confidence Level
• The confidence that the interval will contain the unknown population parameter
• A percentage (less than 100%)
Confidence Level
• Suppose confidence level = 95%
• Also written (1 - α) = .95
• A relative frequency interpretation:
   • In the long run, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true parameter
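The relative frequency interpretation can be checked with a small simulation. The sketch below is illustrative only: it assumes a normal population with hypothetical values μ = 50 and σ = 8, samples of size n = 25, and intervals of the form x̄ ± Z·σ/√n (the σ-known interval introduced next). The function name `coverage` and the fixed seed are also assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Share of simulated intervals x_bar +/- Z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


# Hypothetical population values, chosen only for illustration.
rate = coverage(mu=50.0, sigma=8.0, n=25, confidence=0.95, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

Each simulated interval either contains μ or it does not; the 95% describes the long-run share of intervals that do, so the printed share should land close to 0.95.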
Confidence Interval for μ
(σ Known)
Assumptions
• Population standard deviation σ is known
• Population is normally distributed
• If population is not normal, use large sample
Confidence interval estimate:
$\bar{x} \pm Z \frac{\sigma}{\sqrt{n}}$
(where Z is the standardized normal distribution critical value for a probability of α/2 in each tail)
Finding the Critical Value, Z
Consider a 95% confidence interval: $1 - \alpha = .95$, so $\alpha/2 = .025$ in each tail.
[Figure: standard normal curve with an area of .025 in each tail. In Z units the lower confidence limit is at Z = -1.96, the point estimate at 0, and the upper confidence limit at Z = 1.96; the X-units axis shows the corresponding lower confidence limit, point estimate, and upper confidence limit.]
Finding the Critical Value, Z
Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient   Z value
80%                .80                      1.28
90%                .90                      1.645
95%                .95                      1.96
98%                .98                      2.33
99%                .99                      2.58
99.8%              .998                     3.09
99.9%              .999                     3.29
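Critical values like these can be computed rather than looked up. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-tailed standard normal critical value for a confidence level in (0, 1)."""
    alpha = 1 - confidence
    # Leave alpha/2 in the upper tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

The function splits α evenly between the two tails, which is why a 95% level maps to the 97.5th percentile of the standard normal distribution.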
Intervals and Level of
Confidence
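Sampling Distribution of the Mean
[Figure: sampling distribution of the mean centered at $\mu_{\bar{x}} = \mu$, with area $1 - \alpha$ in the middle and $\alpha/2$ in each tail; below it, confidence intervals built around sample means such as $\bar{x}_1$ and $\bar{x}_2$.]
Intervals extend from $\bar{x} - Z \frac{\sigma}{\sqrt{n}}$ to $\bar{x} + Z \frac{\sigma}{\sqrt{n}}$
$(1 - \alpha) \times 100\%$ of intervals constructed contain μ; $\alpha \times 100\%$ do not.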
Confidence Interval for μ
(σ Known) Example
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is .35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.
Confidence Interval for μ
(σ Known) Example
$\bar{x} \pm Z \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(.35/\sqrt{11}) = 2.20 \pm .2068$
(1.9932 , 2.4068)
• We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean
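The circuit example can be reproduced in a few lines. This is a sketch under the slide's assumptions (normal population, σ known); the function name `z_interval` is hypothetical, and the critical value comes from Python's `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # critical value times standard error
    return x_bar - margin, x_bar + margin


# Values from the circuit example: n = 11, mean 2.20 ohms, sigma = .35 ohms.
low, high = z_interval(x_bar=2.20, sigma=0.35, n=11)
print(f"({low:.4f}, {high:.4f})")  # (1.9932, 2.4068)
```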
Confidence Interval for μ
(σ Unknown)
• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S
• This introduces extra uncertainty, since S is variable from sample to sample
• So we use the t distribution instead of the normal distribution
Small Samples
Assumptions
If
1) n < 30
2) The sample is a simple random sample.
3) The sample is from a normally distributed population.
Case 1 (σ is known): Largely unrealistic; use z-scores
Case 2 (σ is unknown): Use Student t distribution
Student t Distribution
If the distribution of a population is essentially normal, then the distribution of
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
• is essentially a Student t Distribution for all samples of size n.
• is used to find critical values denoted by $t_{\alpha/2}$
Definition
Degrees of Freedom (df)
corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values
df = n - 1 in this section
Degrees of    Area in one tail:    .005     .01      .025     .05      .10      .25
freedom       Area in two tails:   .01      .02      .05      .10      .20      .50
1                                  63.657   31.821   12.706   6.314    3.078    1.000
2                                  9.925    6.965    4.303    2.920    1.886    .816
3                                  5.841    4.541    3.182    2.353    1.638    .765
4                                  4.604    3.747    2.776    2.132    1.533    .741
5                                  4.032    3.365    2.571    2.015    1.476    .727
6                                  3.707    3.143    2.447    1.943    1.440    .718
7                                  3.500    2.998    2.365    1.895    1.415    .711
8                                  3.355    2.896    2.306    1.860    1.397    .706
9                                  3.250    2.821    2.262    1.833    1.383    .703
10                                 3.169    2.764    2.228    1.812    1.372    .700
11                                 3.106    2.718    2.201    1.796    1.363    .697
12                                 3.054    2.681    2.179    1.782    1.356    .696
13                                 3.012    2.650    2.160    1.771    1.350    .694
14                                 2.977    2.625    2.145    1.761    1.345    .692
15                                 2.947    2.602    2.132    1.753    1.341    .691
16                                 2.921    2.584    2.120    1.746    1.337    .690
17                                 2.898    2.567    2.110    1.740    1.333    .689
18                                 2.878    2.552    2.101    1.734    1.330    .688
19                                 2.861    2.540    2.093    1.729    1.328    .688
20                                 2.845    2.528    2.086    1.725    1.325    .687
21                                 2.831    2.518    2.080    1.721    1.323    .686
22                                 2.819    2.508    2.074    1.717    1.321    .686
23                                 2.807    2.500    2.069    1.714    1.320    .685
24                                 2.797    2.492    2.064    1.711    1.318    .685
25                                 2.787    2.485    2.060    1.708    1.316    .684
26                                 2.779    2.479    2.056    1.706    1.315    .684
27                                 2.771    2.473    2.052    1.703    1.314    .684
28                                 2.763    2.467    2.048    1.701    1.313    .683
29                                 2.756    2.462    2.045    1.699    1.311    .683
Large (z)                          2.575    2.327    1.960    1.645    1.282    .675
Given a variable that has a t-distribution with the specified degrees of freedom, what percentage of the time will it be in the indicated region?
a. 10 df, between -1.81 and 1.81. 90%
b. 10 df, between -2.23 and 2.23. 95%
c. 24 df, between -2.06 and 2.06. 95%
d. 24 df, between -2.80 and 2.80. 99%
e. 24 df, outside the interval from -2.80 to 2.80. 1%
f. 24 df, to the right of 2.80. .5%
g. 10 df, to the left of -1.81. 5%
What are the appropriate t critical values for each of the confidence intervals?
a. 95% confidence, n = 17: 2.12
b. 90% confidence, n = 12: 1.80
c. 99% confidence, n = 14: 3.01
d. 90% confidence, n = 25: 1.71
e. 90% confidence, n = 13: 1.78
f. 95% confidence, n = 10: 2.262
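The lookups in this exercise follow one rule: use df = n - 1 and read the column whose two-tail area equals 100% minus the confidence level. A small sketch of that rule, with only the needed rows copied from the t table in this section; the dictionary layout and the name `t_critical` are assumptions for illustration.

```python
# Rows copied from the t table, keyed by degrees of freedom.
# Inner keys are confidence levels in percent (two-tail area = 100 - level).
T_TABLE = {
    9:  {90: 1.833, 95: 2.262, 99: 3.250},
    11: {90: 1.796, 95: 2.201, 99: 3.106},
    12: {90: 1.782, 95: 2.179, 99: 3.054},
    13: {90: 1.771, 95: 2.160, 99: 3.012},
    16: {90: 1.746, 95: 2.120, 99: 2.921},
    24: {90: 1.711, 95: 2.064, 99: 2.797},
}


def t_critical(confidence_pct, n):
    """Look up the two-tailed t critical value for a sample of size n."""
    df = n - 1  # degrees of freedom for a one-sample interval
    return T_TABLE[df][confidence_pct]


for label, pct, n in [("a", 95, 17), ("b", 90, 12), ("c", 99, 14),
                      ("d", 90, 25), ("e", 90, 13), ("f", 95, 10)]:
    print(f"{label}. {pct}% confidence, n = {n}: t = {t_critical(pct, n)}")
```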
Important Properties of the Student t
Distribution
1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n
= 3 and n = 12).
2. The Student t distribution has the same general symmetric bell shape as the normal
distribution but it reflects the greater variability (with wider distributions) that is expected
with small samples.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a
mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater
than 1 (unlike the standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets closer to the normal
distribution. For values of n > 30, the differences are so small that we can use the critical z
values instead of developing a much larger table of critical t values. (The values in the
bottom row of Table A-3 are equal to the corresponding critical z values from the standard
normal distribution.)
Student t Distributions for
n = 3 and n = 12
[Figure: the standard normal distribution overlaid with Student t distributions for n = 12 and n = 3, all centered at 0.]
Example:
A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of μ, the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.)
x̄ = 26,227
s = 15,873
α = 0.05
α/2 = 0.025
Table A-3 t Distribution
This is the same t table reproduced earlier in this section. For 11 degrees of freedom (n - 1 = 12 - 1) and an area of .05 in two tails (.025 in one tail), it gives a critical value of 2.201.
$t_{\alpha/2} = 2.201$
$ME = t_{\alpha/2} \frac{s}{\sqrt{n}} = (2.201)\frac{15{,}873}{\sqrt{12}} = 10{,}085.29$
x̄ - ME < µ < x̄ + ME
26,227 - 10,085.3 < µ < 26,227 + 10,085.3
$16,141.7 < µ < $36,312.3
We are 95% confident that this interval contains the average cost of repairing a Dodge Viper.
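The Dodge Viper calculation can be checked with a short script. This is a sketch, not a general-purpose routine: the function name `t_interval` is hypothetical, and the critical value 2.201 is passed in by hand from the t table (11 degrees of freedom, .05 in two tails) because Python's standard library has no t-distribution quantile function.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Interval x_bar +/- t_crit * s / sqrt(n); t_crit is read from a t table."""
    margin = t_crit * s / sqrt(n)  # margin of error, ME
    return x_bar - margin, x_bar + margin, margin


# Values from the example: n = 12, mean $26,227, s = $15,873, t = 2.201.
low, high, margin = t_interval(x_bar=26_227, s=15_873, n=12, t_crit=2.201)
print(f"ME = {margin:,.2f}")
print(f"${low:,.1f} < mu < ${high:,.1f}")
```

Keeping `t_crit` as an explicit argument makes the dependence on df = n - 1 visible: a different sample size needs a different table row.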
Confidence Interval for μ
(σ Unknown)
Assumptions
• Population standard deviation is unknown
• Population is normally distributed
• If population is not normal, use large sample
Use Student’s t Distribution
Confidence Interval Estimate:
$\bar{x} \pm t_{n-1} \frac{S}{\sqrt{n}}$
(where t is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail)
Student’s t Distribution
• The t value depends on degrees of freedom (d.f.)
   • Number of observations that are free to vary after sample mean has been calculated
d.f. = n - 1
Degrees of Freedom
Idea: Number of observations that are free to vary
after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
• Let X1 = 7
• Let X2 = 8
• What is X3?
If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a given mean)
Student’s t Distribution
Note: t → Z as n increases
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
[Figure: the standard normal curve (t with df = ∞) overlaid with t curves for df = 13 and df = 5, all centered at 0 on the t axis.]
Student’s t Table
Upper Tail Area
df    .25     .10     .05
1     1.000   3.078   6.314
2     0.817   1.886   2.920
3     0.765   1.638   2.353
Let: n = 3, df = n - 1 = 2, α = .10, α/2 = .05
The body of the table contains t values, not probabilities
[Figure: t curve with an upper-tail area of α/2 = .05 to the right of t = 2.920.]
Confidence Interval for μ
(σ Unknown) Example
A random sample of n = 25 has $\bar{X} = 50$ and S = 8. Form a 95% confidence interval for μ
• d.f. = n – 1 = 24, so $t_{\alpha/2,\,n-1} = t_{.025,\,24} = 2.0639$
• The confidence interval is
$\bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}}$
(46.698 , 53.302)
Confidence Intervals for the
Population Proportion, π
• An interval estimate for the population proportion ( π ) can be calculated by adding an allowance for uncertainty to the sample proportion ( p )
Confidence Intervals for the
Population Proportion, π
Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
We will estimate this with sample data:
$\sqrt{\frac{p(1-p)}{n}}$
Confidence Intervals for the
Population Proportion, π
Upper and lower confidence limits for the population proportion are calculated with the formula
$p \pm z\sqrt{\frac{p(1-p)}{n}}$
where
• z is the standardized normal value for the level of confidence desired
• p is the sample proportion
• n is the sample size
Confidence Intervals for the
Population Proportion, Example
A random sample of 100 people shows that 25 have opened IRAs this year. Form a 95% confidence interval for the true proportion of the population who have opened IRAs.
$p \pm z\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{.25(.75)/100} = .25 \pm 1.96\,(.0433)$
(0.1651 , 0.3349)
Confidence Intervals for the
Population Proportion, Example
• We are 95% confident that the true percentage of people in the population who have opened IRAs this year is between 16.51% and 33.49%. Although the interval from .1651 to .3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
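The IRA example can be verified numerically. A minimal sketch, assuming the large-sample normal approximation used on these slides; the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation interval p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n  # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# 25 of 100 sampled people opened IRAs this year.
low, high = proportion_interval(successes=25, n=100)
print(f"({low:.4f}, {high:.4f})")  # (0.1651, 0.3349)
```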
Determining Sample Size
• The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - α)
• The margin of error is also called sampling error
   • the amount of imprecision in the estimate of the population parameter
   • the amount added and subtracted to the point estimate to form the confidence interval
Determining Sample Size
• To determine the required sample size for the mean, you must know:
   • The desired level of confidence (1 - α), which determines the critical Z value
   • The acceptable sampling error (margin of error), e
   • The standard deviation, σ
$e = Z\frac{\sigma}{\sqrt{n}}$
Now solve for n to get
$n = \frac{Z^2 \sigma^2}{e^2}$
Determining Sample Size
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$
So the required sample size is n = 220 (always round up to the next whole number)
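The same calculation as a short function. This is a sketch with a hypothetical name, `sample_size_for_mean`; it uses `math.ceil` to make the round-up step explicit and takes Z from `statistics.NormalDist` instead of the rounded table value 1.645.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= e, i.e. n >= (Z * sigma / e)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # round up: n must be a whole number


n = sample_size_for_mean(sigma=45, e=5, confidence=0.90)
print(n)  # 220
```

Rounding down to 219 would leave the margin of error slightly larger than the requested ±5, which is why the result is always rounded up.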
Determining Sample Size
• If unknown, σ can be estimated when using the required sample size formula
   • Use a value for σ that is expected to be at least as large as the true σ
   • Select a pilot sample and estimate σ with the sample standard deviation, S
Determining Sample Size
To determine the required sample size for the proportion, you must know:
• The desired level of confidence (1 - α), which determines the critical Z value
• The acceptable sampling error (margin of error), e
• The true proportion of “successes”, π
   • π can be estimated with a pilot sample, if necessary (or conservatively use π = .50)
$e = Z\sqrt{\frac{\pi(1-\pi)}{n}}$
Now solve for n to get
$n = \frac{Z^2 \pi(1-\pi)}{e^2}$
Determining Sample Size
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
(Assume a pilot sample yields p = .12)
Determining Sample Size
Solution:
For 95% confidence, use Z = 1.96
e = .03
p = .12, so use this to estimate π
$n = \frac{Z^2 \pi(1-\pi)}{e^2} = \frac{(1.96)^2 (.12)(1-.12)}{(.03)^2} = 450.74$
So use n = 451
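A sketch of the proportion sample-size formula applied to this example. The function name `sample_size_for_proportion` is hypothetical, and Z is computed with `statistics.NormalDist` rather than taken as the rounded 1.96.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(pi, e, confidence):
    """Smallest n with Z * sqrt(pi(1-pi)/n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)  # round up to a whole item


# Pilot estimate p = .12 stands in for pi; margin of error 3 points.
n = sample_size_for_proportion(pi=0.12, e=0.03, confidence=0.95)
print(n)  # 451
```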
Example: We want to determine, with a margin of error of
four percentage points, the current percentage of U.S.
households using e-mail. Assuming that we want 90%
confidence in our results, how many households must we
survey? A 1997 study indicates 16.9% of U.S. households
used e-mail.
$n = \frac{[z_{\alpha/2}]^2\, p(1-p)}{ME^2} = \frac{[1.645]^2 (0.169)(0.831)}{0.04^2} = 237.51965$, which rounds up to 238 households
To be 90% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 238 households.
Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage.
With no prior estimate, replace p(1-p) with its largest possible value, 0.25 (reached at p = 0.5):
$n = \frac{[z_{\alpha/2}]^2 (0.25)}{ME^2} = \frac{(1.645)^2 (0.25)}{0.04^2} = 422.81641$, which rounds up to 423 households
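Using 0.25 is conservative because p(1 - p) is never larger than it is at p = 0.5, so no other value of p can demand a bigger sample. The sketch below illustrates this for the e-mail survey (90% confidence, margin of error 0.04); the function name `required_n` and the list of trial proportions are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

Z_90 = NormalDist().inv_cdf(0.95)  # two-tailed critical value for 90% confidence


def required_n(p, margin=0.04, z=Z_90):
    """Households needed to estimate a proportion near p within the margin."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# The 1997 estimate (0.169) next to several other guesses for p.
for p in (0.10, 0.169, 0.30, 0.50, 0.70, 0.90):
    print(f"p = {p:.3f}: n = {required_n(p)}")
```

The required n peaks at p = 0.5 and falls off symmetrically on either side, so a good prior estimate far from 0.5 can reduce the survey size considerably.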
Applications in Auditing
Six advantages of statistical sampling in auditing:
• Sample result is objective and defensible
   • Based on demonstrable statistical principles
• Provides sample size estimation in advance on an objective basis
• Provides an estimate of the sampling error
Applications in Auditing
• Can provide more accurate conclusions on the population
   • Examination of the population can be time consuming and subject to more nonsampling error
• Samples can be combined and evaluated by different auditors
   • Samples are based on scientific approach
   • Samples can be treated as if they have been done by a single auditor
• Objective evaluation of the results is possible
   • Based on known sampling error
Population Total Amount
• Point estimate:
$\text{Population total} = N\bar{X}$
• Confidence interval estimate:
$N\bar{X} \pm N(t_{n-1})\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
(This is sampling without replacement, so use the finite population correction in the confidence interval formula)
Population Total Amount
• A firm has a population of 1000 accounts and wishes to estimate the total population value.
• A sample of 80 accounts is selected with average balance of $87.60 and standard deviation of $22.30.
• Find the 95% confidence interval estimate of the total balance.
Population Total Amount
$N = 1000,\ n = 80,\ \bar{X} = 87.6,\ S = 22.3$
$N\bar{X} \pm N(t_{n-1})\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = (1000)(87.6) \pm (1000)(1.9905)\frac{22.3}{\sqrt{80}}\sqrt{\frac{1000-80}{1000-1}} = 87{,}600 \pm 4{,}762.48$
The 95% confidence interval for the population total balance is $82,837.52 to $92,362.48
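The account-balance example, including the finite population correction, can be reproduced as follows. This is a sketch: the function name `total_interval` is hypothetical, and the critical value 1.9905 (79 degrees of freedom) is taken from the slide and passed in by hand.

```python
from math import sqrt


def total_interval(N, n, x_bar, s, t_crit):
    """Interval for a population total, sampling without replacement."""
    fpc = sqrt((N - n) / (N - 1))  # finite population correction
    margin = N * t_crit * s / sqrt(n) * fpc
    point = N * x_bar  # point estimate of the total
    return point - margin, point + margin, margin


low, high, margin = total_interval(N=1000, n=80, x_bar=87.6, s=22.3, t_crit=1.9905)
print(f"87,600 +/- {margin:,.2f}")
print(f"${low:,.2f} to ${high:,.2f}")
```

The correction factor here is about 0.96, so ignoring it would widen the interval by roughly 4%.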
Confidence Interval for
Total Difference
• Point estimate:
$\text{Total Difference} = N\bar{D}$
• Where the mean difference, $\bar{D}$, is:
$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$
where $D_i$ = audited value - original value
Confidence Interval for
Total Difference
Confidence interval estimate:
$N\bar{D} \pm N(t_{n-1})\frac{S_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
where
$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n-1}}$
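The slides give no worked example for the total difference, so the sketch below uses made-up numbers purely to show how the pieces fit together. The differences, the population size N = 500, and the function name `total_difference_interval` are all hypothetical; the critical value 2.262 is the t table entry for 9 degrees of freedom at 95% confidence.

```python
from math import sqrt
from statistics import fmean, stdev


def total_difference_interval(differences, N, t_crit):
    """Interval for the total difference N * D_bar, with the finite population correction."""
    n = len(differences)
    d_bar = fmean(differences)  # mean difference, D_bar
    s_d = stdev(differences)    # sample standard deviation S_D (divisor n - 1)
    fpc = sqrt((N - n) / (N - 1))
    margin = N * t_crit * s_d / sqrt(n) * fpc
    point = N * d_bar
    return point - margin, point + margin


# Hypothetical audit sample: D_i = audited value - original value for 10 items.
# Items with no discrepancy contribute a difference of 0.
differences = [0.0, 0.0, 5.0, -2.5, 0.0, 10.0, 0.0, 0.0, -4.0, 1.5]
low, high = total_difference_interval(differences, N=500, t_crit=2.262)
print(f"total difference between {low:,.2f} and {high:,.2f}")
```

Note that every sampled item enters the calculation, including those whose audited and original values agree.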
One Sided Confidence
Intervals
Application: find the upper bound for the proportion of items that do not conform with internal controls
$\text{Upper bound} = p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}}$
where
• Z is the standardized normal value for the level of confidence desired
• p is the sample proportion of items that do not conform
• n is the sample size
• N is the population size
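A sketch of the upper-bound formula with hypothetical audit numbers (8 nonconforming items in a sample of 200 from 5,000, so p = 0.04). Because the interval is one-sided, the code places all of α in the upper tail, so 95% confidence corresponds to Z of about 1.645 rather than 1.96; the function name `upper_bound` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def upper_bound(p, n, N, confidence=0.95):
    """One-sided upper confidence bound for a proportion, with the finite population correction."""
    z = NormalDist().inv_cdf(confidence)  # one-sided: all of alpha in one tail
    fpc = sqrt((N - n) / (N - 1))
    return p + z * sqrt(p * (1 - p) / n) * fpc


# Hypothetical audit: p = 8/200 = 0.04, population of 5,000 items.
bound = upper_bound(p=0.04, n=200, N=5000)
print(f"upper bound on nonconformance rate: {bound:.4f}")
```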
Ethical Issues
• A confidence interval (reflecting sampling error) should always be reported along with a point estimate
• The level of confidence should always be reported
• The sample size should be reported
• An interpretation of the confidence interval estimate should also be provided
Chapter Summary
In this chapter, we have
• Introduced the concept of confidence intervals
• Discussed point estimates
• Developed confidence interval estimates
• Created confidence interval estimates for the mean (σ known)
• Determined confidence interval estimates for the mean (σ unknown)
• Created confidence interval estimates for the proportion
Chapter Summary
In this chapter, we have
• Determined required sample size for mean and proportion settings
• Developed applications of confidence interval estimation in auditing
   • Confidence interval estimation for population total
   • Confidence interval estimation for total difference in the population
   • One sided confidence intervals
• Addressed confidence interval estimation and ethical issues