Download Chapter 8 - The WA Franke College of Business

Chapter 8
Point and Interval Estimators
Up to this point we have been studying probability theory. We have not looked at
statistics at all. In probability theory we asked the following sort of question: "Suppose
we have a normal distribution with $\mu = 100$ and a known $\sigma_{\bar{x}}$. What is the
probability that a sample of $n = 100$ will produce an $\bar{x}$ in the interval $[99, 101]$?"
In statistics we ask a different kind of question: "Suppose we took a sample of $n = 100$ and
found that $\bar{x} = 99.5$. What can we say about $\mu$?" In one sense statistics is probability
theory stood on its head. What is considered as given in probability theory is what we
don't know in statistics. What we consider as known in statistics is the question asked in
probability theory.
8.1 Point Estimators
One of the tasks in statistics is to get estimates of population parameters such as $\mu$ or $\sigma$.
An estimator is a formula for producing an estimate. The estimate will serve as our best
guess for the value of the population parameter we can find using the sample data.
If we have the following data:
x = 1.0, 2.0, 3.0
then the sample mean
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
will be the estimator. The value of the mean, $\bar{x} = 2$, is called the estimate. Note that
the formula is the estimator and the value that the formula produces for a given set of data
is called the estimate. The best point estimators for the population parameters we have
considered thus far are shown in Table 8-1.
Population parameter    Best point estimator
$\mu$                   $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
$\sigma^2$              $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$p$                     $\hat{p} = \frac{X}{n}$
Table 8-1. Best point estimators for some population parameters.
Recall from your mathematics classes that a point on a line can represent a number. So
anything that can produce a number will suffice as a point estimator. Generally,
estimators will be formulas, but need not be. Some formulas that will produce a number
using sample data are
Point estimators of $\mu$    Formula
Sample mean                  $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Geometric mean               $GM = \sqrt[n]{x_1 x_2 x_3 \cdots x_n}$
Funny formula one            $ff_1 = \sum_{i=1}^{n} \cos(x_i)$
Funny formula two            $ff_2 = \sum_{i=1}^{n} \log(x_i)$
Max’s estimator              5
Table 8.2 Point estimators of the population mean.
Table 8.2 shows some potential point estimators of the population mean. There is an
infinity of possible point estimators of the population mean, but we show only 5 of them
here. Note that Max’s estimator has some distinct advantages: it requires no data, hence
it has zero sampling cost, and it can be computed very quickly. Is there a reason
then, save for professional jealousy, that other statisticians prefer the sample mean to
Max’s estimator?
The answer involves the word best. By best we mean the following.
----------------------[--- • ---]------------------------
The drawing above shows the population mean designated as a point on a line (the black
dot). Construct a small interval [ ] about this dot. Take a sample and compute the point
estimate from every formula, and repeat this many times. More $\bar{x}$’s will fall in this
interval than values of any other point estimator (unless the population mean just happens
to be $\mu = 5$, in which case Max’s estimator will win every time). Hence the term best: it
just means that, on average, it will be better than any competing estimator.
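The idea of "more estimates landing in the small interval" can be tried out with a short simulation. The sketch below is only an illustration under assumptions that are not in the text: the population is taken to be normal with $\mu = 100$ and $\sigma = 10$, samples have $n = 25$, and the interval is $\mu \pm 2$. The function name `hit_rates` is made up for this example.

```python
import math
import random
from statistics import fmean, geometric_mean


def hit_rates(mu=100.0, sigma=10.0, n=25, half_width=2.0, trials=4000, seed=0):
    """Fraction of samples whose estimate lands in [mu - half_width, mu + half_width].

    The population (normal, mu=100, sigma=10) is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    # The five estimators of Table 8.2.
    estimators = {
        "sample mean": fmean,
        "geometric mean": geometric_mean,
        "funny formula one": lambda xs: sum(math.cos(x) for x in xs),
        "funny formula two": lambda xs: sum(math.log(x) for x in xs),
        "Max's estimator": lambda xs: 5.0,
    }
    hits = dict.fromkeys(estimators, 0)
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        for name, estimator in estimators.items():
            if abs(estimator(sample) - mu) <= half_width:
                hits[name] += 1
    return {name: count / trials for name, count in hits.items()}


if __name__ == "__main__":
    for name, rate in hit_rates().items():
        print(f"{name:18s} {rate:.3f}")
```

With this population, the two funny formulas and Max’s estimator cannot land near 100 at all, so the comparison that matters is between the sample mean and the geometric mean. Changing `mu` to 5 shows the exception mentioned above, where Max’s estimator wins every time.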
Finally note that best does not necessarily mean good. Any particular value for the
sample mean might produce a perfectly rotten estimate of the mean. Interval estimators,
discussed next, will help determine if the estimate is a good one.
Example 8.1 A new manufacturing process has been developed to produce artificial
diamonds. A sample of $n = 10$ diamonds is taken from the process and weighed. The
results of this sample are $\bar{x} = 0.5$ carats and $s^2 = 0.1$ (carats squared). What is the
best estimate of the true average weight of diamonds produced by the process?
The true average weight of the process is designated by $\mu$, the average of the weights of
all diamonds that will ever be produced. Obviously, it would be impossible to compute this
value by weighing every diamond. We can only make a guess of the value of the
population mean based on the sample results. The best guess we can make using the
sample data is the sample mean, $\bar{x} = 0.5$ carats.
8.2 Interval Estimators when $\sigma$ is known
An estimator is a formula that produces a single value called an estimate. We know that
the best point estimator for a population mean is a sample mean. The actual value we get
from a sample is called an estimate. What we would like is some way of trying to
determine if a particular estimate is a good one. The only way of being sure is to actually
compare our estimate with the population mean. But since the reason we are taking the
sample in the first place is to try to get a guess for the unknown value of the population
mean we can never really be sure that a particular guess is a good one. However, we can
get some pretty good hints.
First let’s look at a probability problem. Consider taking samples of size n=100 from a
normal distribution with $\mu = 100$ and $\sigma = 51.02$. Note the CLT holds (why?). Yes, the
value for the standard deviation looks funny, but it will be useful in what follows. Now
construct a symmetric interval about the population mean, such that 95% of all sample
means of size n=100 will be in this interval. Consider the formula for the Z-score
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$$
which can be rearranged as
$$\bar{X} = \mu_{\bar{X}} + Z \sigma_{\bar{X}}.$$
We need to construct an interval $[\bar{X}_L, \bar{X}_U]$ such that there is a 95% probability
that a sample mean will be in this interval. Such an interval can be computed by choosing the
proper values of Z. The interval we need from the Z distribution is one that contains 95%
of the area symmetric about the mean. That interval is $[-1.96, 1.96]$. If we use these
values we get $\bar{X}_L = \mu_{\bar{X}} - 1.96\sigma_{\bar{X}}$ and $\bar{X}_U = \mu_{\bar{X}} + 1.96\sigma_{\bar{X}}$. So
$$[\bar{X}_L, \bar{X}_U] = [\mu_{\bar{X}} - 1.96\sigma_{\bar{X}}, \mu_{\bar{X}} + 1.96\sigma_{\bar{X}}]$$
Now for a sample of size 100
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{51.02}{\sqrt{100}} = 5.102$$
so
$$[\bar{X}_L, \bar{X}_U] = [\mu_{\bar{X}} - 1.96\sigma_{\bar{X}}, \mu_{\bar{X}} + 1.96\sigma_{\bar{X}}]$$
$$[\bar{X}_L, \bar{X}_U] = [100 - 1.96(5.102), 100 + 1.96(5.102)]$$
$$[\bar{X}_L, \bar{X}_U] = [90, 110]$$
So there is a 95% probability that a sample mean will be in the interval $[90, 110]$. This is
why the value of 51.02 for the standard deviation was used earlier: so the resulting
interval would have values that are easy to work with.
Important observation: Suppose we take 100 samples, each of size n=100, from this normal
distribution. Then approximately 95 of the 100 samples will have means in the interval
$[\mu_{\bar{X}} - 1.96\sigma_{\bar{X}}, \mu_{\bar{X}} + 1.96\sigma_{\bar{X}}] = [90, 110]$.
Now consider Table 8.3. The two leftmost columns give the interval $[90, 110]$ that
will contain 95% of the sample means. The column labeled $\bar{X}$ contains a sample
mean (we will pretend to take samples here). The next two columns contain the
interval $[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [\bar{X} - 10, \bar{X} + 10]$. The last
two columns are labeled A and B. Let these stand for the following questions:
A: Is $\bar{X}$ in $[\mu_{\bar{X}} - 1.96\sigma_{\bar{X}}, \mu_{\bar{X}} + 1.96\sigma_{\bar{X}}]$?
B: Is $\mu_{\bar{X}}$ in $[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}]$?
The first sample has a mean $\bar{X} = 101$. This mean lies in $[90, 110]$. The same size
interval about $\bar{X} = 101$ is $[91, 111]$. Note this interval contains $\mu_{\bar{X}}$. So the
answer to A is Y and the answer to B is Y.
$\mu_{\bar{X}} - 1.96\sigma_{\bar{X}}$   $\mu_{\bar{X}} + 1.96\sigma_{\bar{X}}$   $\bar{X}$    $\bar{X} - 1.96\sigma_{\bar{X}}$   $\bar{X} + 1.96\sigma_{\bar{X}}$   A   B
90      110     101       91        111       Y   Y
90      110     104       94        114       Y   Y
90      110     109       99        119       Y   Y
90      110     109.99    99.99     119.99    Y   Y
90      110     110.01    100.01    120.01    N   N
90      110     113       103       123       N   N
Table 8.3
Important observations:
1. Every time there is a Y in column A there will also be a Y in column B.
2. 95% of the time there will be a Y in column A.
3. Thus 95% of the time there will be a Y in column B.
4. So if we take a sample mean and construct an interval
$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}]$, then there is a 95% probability
that that interval will contain the population mean.
5. To rephrase 4., if you take a sample, compute the sample mean, $\bar{X}$, construct
an interval $[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}]$ about that sample mean, and
then make a bet that this interval contains $\mu$, then you will win 95% of your bets.
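The betting argument in these observations can be checked by simulation. The following is a minimal sketch, using the same $\mu = 100$, $\sigma = 51.02$ and $n = 100$ as the probability problem above; the function name, the number of trials and the random seed are arbitrary choices made for this illustration.

```python
import math
import random
from statistics import fmean


def coverage_experiment(mu=100.0, sigma=51.02, n=100, trials=2000, z=1.96, seed=1):
    """Repeat the sampling many times and answer questions A and B each time."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)  # 1.96 * 5.102, about 10
    a_count = 0
    b_count = 0
    always_agree = True
    for _ in range(trials):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        a = mu - half_width <= xbar <= mu + half_width    # A: is Xbar in the interval about mu?
        b = xbar - half_width <= mu <= xbar + half_width  # B: is mu in the interval about Xbar?
        a_count += a
        b_count += b
        always_agree = always_agree and (a == b)
    return a_count / trials, b_count / trials, always_agree


if __name__ == "__main__":
    frac_a, frac_b, agree = coverage_experiment()
    print(f"A is Y in {frac_a:.1%} of samples, B is Y in {frac_b:.1%}; A and B agree: {agree}")
```

Both fractions should come out close to 95%, and A and B should give the same answer for every sample, which is the content of observations 1 through 3.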
Example 8.2. Suppose that household incomes in Flagstaff are normally distributed with
standard deviation $\sigma = 2000$. A sample of n=1000 households is interviewed and data
are collected about the income of each household. The results of the sample are that the
average income of the households in the sample is $\bar{X} = 33,011$. Construct an interval
such that you can be 95% certain that the interval will contain the average income for all
Flagstaff households (the latter is the population mean). From the above discussion, we
know that if we find an interval
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}]$$
we can be 95% confident that it will contain $\mu$. So we have
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2000}{\sqrt{1000}} = 63.25$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [33011 - 1.96(63.25), 33011 + 1.96(63.25)]$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [32887.03, 33134.97]$$
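Example 8.3. The same as 8.2 except that the sample size is n=10.
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2000}{\sqrt{10}} = 632.46$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [33011 - 1.96(632.46), 33011 + 1.96(632.46)]$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [31771.38, 34250.62]$$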
Compare the two results. In Example 8.2 we were 95% confident that the population
mean lies within $\pm 123.97$ of the sample mean, while in Example 8.3 the interval is
$\pm 1239.62$. We should feel much better about the interval in 8.2 than the one in 8.3.
Example 8.4 To belabor the point, suppose the sample had been of size n=10000. In that
case
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2000}{\sqrt{10000}} = 20$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [33011 - 1.96(20), 33011 + 1.96(20)]$$
$$[\bar{X} - 1.96\sigma_{\bar{X}}, \bar{X} + 1.96\sigma_{\bar{X}}] = [32971.8, 33050.2]$$
In this case the interval is $\pm 39.20$, so we could say that we are 95% confident
we know where the population mean is to within about plus or minus 39.20 dollars. We can
make the interval as small as we want by making the sample size bigger. (Actually, if the
sample size gets close to the population size we would have to modify the formula for
$\sigma_{\bar{X}}$. We will ignore that issue here.)
A $(1 - \alpha)100\%$ confidence interval (CI) for the population mean $\mu$ is given by
$$\bar{X} \pm Z_{\alpha/2} \sigma_{\bar{x}}$$
where $\alpha$ determines the level of confidence and hence the value of Z to use. You would
choose a value of Z so that $\alpha/2$ of the area of the Z distribution lies in each tail.
Suppose you want a 90% CI for the population mean. Then $(1 - \alpha)100\% = 90\%$, so $\alpha = 0.1$,
and we want values from the Z distribution with $\alpha/2 = 0.1/2 = 0.05$, or 5% of the area, in
each tail of the Z-distribution as shown in Figure 8.1. For the 90% CI the Z values are
$\pm 1.65$.
Figure 8.1 A 90% CI for $\mu$
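For readers who prefer to compute the Z value rather than read it from a table, here is a small sketch using Python's standard library. The helper name `z_critical` is invented for this example; `NormalDist().inv_cdf` is the standard normal inverse CDF from the `statistics` module.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Z value that leaves alpha/2 of the standard normal area in each tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%} CI: Z = {z_critical(level):.3f}")
```

The computed values are about 1.645, 1.960 and 2.576. The 1.65 and 2.58 used in this chapter are the same values rounded to two decimals, as in a printed Z table.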
Excel has a command that computes the half-width $Z_{\alpha/2} \sigma_{\bar{X}}$ of a confidence
interval, so the interval is the sample mean plus or minus the value returned. The
command is
confidence($\alpha$, $\sigma$, n)
If we use this for Examples 8.2-8.4 above we find
For Example 8.2    123.9588    =CONFIDENCE(0.05,2000,1000)
For Example 8.3    1239.588    =CONFIDENCE(0.05,2000,10)
For Example 8.4    39.19922    =CONFIDENCE(0.05,2000,10000)
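The same quantity can be sketched outside Excel. The function below mirrors the argument order of the worksheet command but is only an illustration written for this chapter; the names `confidence_margin` and `z_interval` are assumptions, not part of any library.

```python
import math
from statistics import NormalDist


def confidence_margin(alpha, sigma, n):
    """Half-width Z_{alpha/2} * sigma / sqrt(n) of a CI when sigma is known."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / math.sqrt(n)


def z_interval(xbar, alpha, sigma, n):
    """Confidence interval (lower, upper) for the population mean."""
    margin = confidence_margin(alpha, sigma, n)
    return xbar - margin, xbar + margin


if __name__ == "__main__":
    # Examples 8.2, 8.3 and 8.4: same sigma and sample mean, different sample sizes.
    for n in (1000, 10, 10000):
        lower, upper = z_interval(33011, 0.05, 2000, n)
        print(f"n={n:6d}  margin={confidence_margin(0.05, 2000, n):9.3f}  CI=[{lower:.2f}, {upper:.2f}]")
```

Because the sample size enters through $\sqrt{n}$, multiplying n by 100 shrinks the half-width by a factor of 10, which is the pattern seen across the three examples.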
Example 8.5 A sample of n=300 is taken of Flagstaff household incomes with the
resulting sample mean of $28,750. It is known that the standard deviation of all Flagstaff
household incomes is $1000. Find a 99% CI for the average of all Flagstaff household
incomes. Note: because the distribution is continuous and the sample size is relatively
large (>30), we will assume the CLT holds for this problem.
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{1000}{\sqrt{300}} = 57.74$$
$$Z_{\alpha/2} = Z_{0.01/2} = Z_{0.005} = 2.58$$
$$\bar{X} \pm Z_{\alpha/2} \sigma_{\bar{X}} = 28,750 \pm 2.58(57.74) = 28,750 \pm 148.96$$
We are 99% confident that the average household income for Flagstaff is in the interval
$[28601.04, 28898.96]$.
Example 8.6 Suppose that we are interested in estimating the average gasoline usage of a
certain brand of automobile. A sample of n=150 cars of that brand is driven and the
average gasoline usage of the sample cars is $\bar{X} = 32.5$ mpg. Suppose that we know the
population standard deviation is 3 mpg. What is the best point estimate for the average
mpg for the entire fleet of cars of this brand? Find a 90% CI for the population mean
mpg.
The best point estimate is 32.5 mpg.
For the 90% CI
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{150}} = 0.245$$
$$Z_{\alpha/2} = Z_{0.10/2} = Z_{0.05} = 1.65$$
$$\bar{X} \pm Z_{\alpha/2} \sigma_{\bar{X}} = 32.5 \pm 1.65(0.245) = 32.5 \pm 0.404$$
8.3 Confidence Intervals for p
The method for finding the CI for a population proportion, p, is the same as finding the
CI for the population mean. The general form is
point estimator $\pm\ Z_{\alpha/2} \times$ standard deviation of the sampling distribution
A $(1 - \alpha)100\%$ CI for p is
$$\hat{p} \pm Z_{\alpha/2} \sigma_{\hat{p}}$$
where
$$\sigma_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$$
Note there is a slight difference between the formula for the standard deviation of the
sampling distribution here and the one in Chapter 7. There we used
$$\sigma_{p} = \sqrt{pq/n}$$
and here we use
$$\sigma_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$$
Chapter 7 was a chapter on probability theory. In probability theory we assume we know
the parameter values, so in that chapter we assumed we knew $p$. This chapter is a
chapter on statistics. We don’t know the parameter values; all we have are point
estimates, which we use as guesses for the parameter value. So we don’t know $p$ here,
we only know $\hat{p}$, which we will use as our best guess for $p$. So our test to see if the CLT
will hold here is if
$n\hat{p} \ge 5$ and
$n\hat{q} \ge 5$.
Because we are using guesses here, we might be wrong, but there is no better alternative.
Example 8.7 We have commissioned another election poll. A sample of n=1000 voters
is asked if they prefer candidate A or candidate B. Of these, 534 say they prefer A. We
will consider a response for A to be a success. Find a 95% CI for p, the proportion of
voters who prefer A.
$X = 534$
$n = 1000$
$$\hat{p} = \frac{X}{n} = \frac{534}{1000} = 0.534, \quad \hat{q} = 1 - \hat{p} = 0.466$$
$$\sigma_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(0.534)(0.466)/1000} = 0.0158$$
$\alpha = 0.05$ so $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$\hat{p} \pm Z_{\alpha/2} \sigma_{\hat{p}} = 0.534 \pm (1.96)(0.0158) = 0.534 \pm 0.031$$
$$[0.503, 0.565]$$
Example 8.8 Radio station KFUD claims that 40% of the listeners in its receiving area
listen to its 2:00PM music program. A sample of n=300 radio listeners in this area is
taken and 88 say they listen to KFUD at 2:00PM. Calculate a 99% CI for p. Does it
seem likely that KFUD’s claim is correct?
$X = 88$
$n = 300$
$$\hat{p} = \frac{X}{n} = \frac{88}{300} = 0.293, \quad \hat{q} = 1 - \hat{p} = 0.707$$
$$\sigma_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(0.293)(0.707)/300} = 0.0263$$
$\alpha = 0.01$ so $Z_{\alpha/2} = Z_{0.005} = 2.58$
$$\hat{p} \pm Z_{\alpha/2} \sigma_{\hat{p}} = 0.293 \pm (2.58)(0.0263) = 0.293 \pm 0.068$$
$$[0.225, 0.361]$$
Given this interval, which lies entirely below 0.40, it is most unlikely that KFUD’s claim
is correct. I would be prepared to bet fairly big money that this claim is not true.
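The two proportion examples follow the same recipe, which can be sketched as a small function. This is an illustration only: the name `proportion_interval` and the choice to raise an error when the $n\hat{p}$, $n\hat{q}$ check fails are assumptions made for the example, and the Z value is passed in as read from the table.

```python
import math


def proportion_interval(successes, n, z):
    """CI for a population proportion: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    # Rule of thumb from the text for using the normal approximation.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat should both be at least 5")
    sigma_p_hat = math.sqrt(p_hat * q_hat / n)
    margin = z * sigma_p_hat
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    print("Example 8.7:", proportion_interval(534, 1000, 1.96))  # 95% CI
    print("Example 8.8:", proportion_interval(88, 300, 2.58))    # 99% CI
```

For Example 8.8 the upper limit stays well below 0.40, which is the basis for doubting KFUD's claim.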
8.4 Confidence Intervals -- $\sigma$ unknown.
Thus far we have assumed that the population standard deviation, $\sigma$, is known. This is a
pretty silly assumption in practical work. Recall that the formula for the population
variance is
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2.$$
If we don’t know the population mean, then it is unlikely that we will know the
population standard deviation. Instead we will replace the population standard deviation
with the best guess we have for it based on the sample data; that is, we will use the value
produced by the best point estimator
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2.$$
We can’t just replace $\sigma$ with $s$ in the formula; we must also replace Z with a new
distribution, the t-distribution.
A $(1 - \alpha)100\%$ confidence interval for $\mu$ when $\sigma$ is not known is
$$\bar{X} \pm t_{\alpha/2} s_{\bar{X}}$$
where
$$s_{\bar{X}} = \frac{s}{\sqrt{n}}$$
So why do we do this? The value of $s_{\bar{X}}$ we get from a particular sample may be either
larger or smaller than the value of $\sigma_{\bar{X}}$. When we say, for example, that we are 95%
confident, we really ought to say that we are at least 95% confident that the population
mean lies within the interval. Consider a betting analogy. Suppose that we want to
construct an interval such that if we bet the population mean lies inside the interval,
we will win at least 95% of the bets. Now if $s_{\bar{X}} > \sigma_{\bar{X}}$ there would be no problem:
the resulting interval computed using $\pm Z s_{\bar{X}}$ would still work. The interval would be
bigger than that calculated using $\pm Z \sigma_{\bar{X}}$, but we could still be at least 95% confident
that the population mean would be in the interval. The problem arises when $s_{\bar{X}} < \sigma_{\bar{X}}$.
In that case $Z s_{\bar{X}} < Z \sigma_{\bar{X}}$, and we would win something less than 95% of our bets. In
statistics, the tendency is to make conservative statements, to err on the side of safety. So
what we want to do is take care of the worst-case scenario, the one where we win less than
95% of the bets. We want something that will cause the interval to get bigger than it would
be just using Z. We will replace Z with the t-distribution, which is designed just for that
purpose.
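A small simulation makes the "winning fewer than 95% of the bets" problem concrete. The sketch below is an illustration under stated assumptions: samples of size $n = 8$ are drawn from a standard normal population (an arbitrary choice), and the multiplier 2.365 is the $t_{0.025}$ table value for 7 degrees of freedom. The function name `coverage` is made up for this example.

```python
import math
import random
from statistics import fmean, stdev


def coverage(multiplier, n=8, mu=0.0, sigma=1.0, trials=5000, seed=2):
    """Fraction of intervals Xbar +/- multiplier * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        s_xbar = stdev(sample) / math.sqrt(n)  # stdev uses the n - 1 divisor
        if abs(xbar - mu) <= multiplier * s_xbar:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print(f"using Z = 1.96 with s : {coverage(1.96):.3f}")
    print(f"using t = 2.365 with s: {coverage(2.365):.3f}")
```

With these settings the Z-based interval should win noticeably fewer than 95% of the bets, while the t-based interval should come out close to 95%.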
df    t0.100    t0.050    t0.025    t0.010    t0.005
1     3.078     6.314     12.706    31.821    63.657
7     1.415     1.895     2.365     2.998     3.499
Table 8.4 A portion of the t-distribution.
Table 8.4 shows a part of the t-distribution. The symbol df stands for degrees of
freedom. The concept of a degree of freedom is difficult and requires a fair degree of
mathematical sophistication. We will not discuss what this concept is here. You must
know, however, that
For the t-distribution
df = n-1
That is, for the t-distribution the degrees of freedom is equal to the sample size minus
one. The t-distribution values change as the sample size changes.
The subscript indicates the area of the t-distribution that is to the right of the value in
the body of the table. A plot of the t-distribution is shown in Figure 8.2.
If, for example, $n = 8$ so that $df = n - 1 = 8 - 1 = 7$, then the column labeled $t_{0.100}$ gives
a value 1.415 where that column intersects the row with $df = 7$. This means that 10% of
the area under the curve for the t-distribution is to the right of 1.415 and 90% is to the
left. We could refer to this as $t_{0.100} = 1.415$.
If $df = 1$ and $t = t_{0.010} = 31.821$, then 1% of the area under the t curve will be to the
right of 31.821 and 99% to the left.
The t-distribution has mean zero and it is symmetric about zero just like the standard
normal. So if we needed to find values of t for which 98% of the distribution lies
between them when $df = 1$, these values are $t = \pm 31.821$. If 1% of the area is to the
right of 31.821, then 1% of the area will be to the left of -31.821 and 98% of the area
will be between them.
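The df = 1 row of the table can be checked without any special software, because the t-distribution with one degree of freedom is the standard Cauchy distribution, whose right-tail area has the closed form $\frac{1}{2} - \frac{1}{\pi}\arctan(t)$. The sketch below uses that fact; the function names are invented for this illustration and the formula applies only to df = 1.

```python
import math


def right_tail_area_df1(t):
    """Area to the right of t under the t-distribution with df = 1 (standard Cauchy)."""
    return 0.5 - math.atan(t) / math.pi


def area_between_df1(t):
    """Area between -t and +t, using the symmetry of the distribution about zero."""
    return 1.0 - 2.0 * right_tail_area_df1(t)


if __name__ == "__main__":
    for label, t in (("t0.100", 3.078), ("t0.050", 6.314), ("t0.010", 31.821)):
        print(f"{label}: right tail = {right_tail_area_df1(t):.4f}, "
              f"area between -t and +t = {area_between_df1(t):.4f}")
```

The subscript on each table heading matches the computed right-tail area: 0.100 for 3.078 and 0.010 for 31.821, so about 98% of the area lies between -31.821 and 31.821.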
Figure 8.2 The t-distribution (horizontal axis: t; shaded right-tail area: alpha). Note that
the area given in the t-table is the area to the right of a point, not to the left as with the
normal distribution.
Remark 1: If n is very large, the t and normal distributions are almost identical.
Remark 2: The t distribution assumes that the sample comes from a normal distribution.
In practice this is often assumed as a matter of convenience. We will use
that assumption in this class, except when we know that the binomial
distribution is being used.
Remark 3: We still need to check to see if the CLT holds. If the distribution the sample
comes from is normal, then the CLT will hold automatically. If it is not
normal and n>30 then we will say that the CLT holds (binomial excepted),
but that we might not be really justified in using the t distribution. We will
use it and hope for the best, however. The good news is that when n gets
very large the CLT will almost surely hold and t will be so close to the
normal that any errors we make will be small.
Example 8.9 A testing facility has a contract to provide an independent evaluation of the
gasoline usage of a particular kind of automobile. Because the test is not to be biased by
any relationship with the car manufacturer, the testing facility must purchase its own cars.
Because this is a very expensive proposition they decide to purchase only 25 cars (n=25).
They determine the sample average gasoline usage for these cars is 31.25 mpg ($\bar{X}$) and
the sample standard deviation is 2.65 mpg ($s$). Find a 90% CI for the gasoline usage for
these automobiles. Assume that gasoline usage is normally distributed.
$\bar{X} = 31.25$
$n = 25$
$s = 2.65$
$$s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{2.65}{\sqrt{25}} = 0.53$$
$df = n - 1 = 24$
$t_{0.050} = 1.711$ (for a 90% CI we need 5% in each tail)
$$\bar{X} \pm t_{0.050} s_{\bar{X}} = 31.25 \pm (1.711)(0.53) = 31.25 \pm 0.91$$
Excel has a worksheet function for the t—distribution. The description from the Excel
help facility is
Returns the t-value of the Student's t-distribution as a function of the probability and the
degrees of freedom.
Syntax
TINV(probability,degrees_freedom)
Probability is the probability associated with the two-tailed Student's t-distribution.
Degrees_freedom is the number of degrees of freedom to characterize the distribution.
Remarks

If either argument is nonnumeric, TINV returns the #VALUE! error value.

If probability < 0 or if probability > 1, TINV returns the #NUM! error value.

If degrees_freedom is not an integer, it is truncated.

If degrees_freedom < 1, TINV returns the #NUM! error value.

TINV is calculated as TINV = p( t<X ), where X is a random variable that follows
the t-distribution.

A one-tailed t-value can be returned by replacing probability with 2*probability.
For a probability of 0.05 and degrees of freedom of 10, the two-tailed value is
calculated with TINV(0.05,10), which returns 2.228139. The one-tailed value for
the same probability and degrees of freedom can be calculated with
TINV(2*0.05,10), which returns 1.812462.
Note In some tables, probability is described as (1-p).
TINV uses an iterative technique for calculating the function. Given a probability value,
TINV iterates until the result is accurate to within ± 3x10^-7. If TINV does not converge
after 100 iterations, the function returns the #N/A error value.
Example
TINV(0.054645,60)
equals 1.96
Here is the bad news. Excel uses the area in both tails, not just the rightmost tail. For the
problem we just worked we wanted a 90% CI, so we should have 10% in the tails (a
two-tailed value).
So we can find the t value that will give us 10% of the area in the tails by the command
1.711    =TINV(0.100,24)
The Excel solution for Example 8.9 is shown below.
X bar             31.25
s                 2.65
s xbar            0.53     =2.65/SQRT(25)
df                24
t-value           1.711    =TINV(0.1,24)
interval width    0.91     =1.711*0.53
CI
Upper limit       32.16    =31.25+0.91
Lower limit       30.34    =31.25-0.91
Excel solution for Example 8.9
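The same steps as the worksheet can be sketched in a few lines of Python. This is an illustration, not a replacement for the Excel solution: the function name `t_interval` is an assumption, and the t value is passed in as read from the t-table (or from TINV), since Python's standard library has no t-distribution function.

```python
import math


def t_interval(xbar, s, n, t_value):
    """CI for the mean when sigma is unknown: Xbar +/- t * s / sqrt(n).

    t_value must be looked up for df = n - 1 and the desired tail area.
    """
    s_xbar = s / math.sqrt(n)
    margin = t_value * s_xbar
    return xbar - margin, xbar + margin


if __name__ == "__main__":
    # Example 8.9: n = 25, so df = 24 and t0.050 = 1.711 for a 90% CI.
    lower, upper = t_interval(31.25, 2.65, 25, 1.711)
    print(f"90% CI: [{lower:.2f}, {upper:.2f}]")
```

Keeping the t value as an explicit argument mirrors the hand calculation, where the only step that changes from problem to problem is the table lookup.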
Problems
8.1 The Flagstaff Chamber of Commerce wants to attract a new retail business to
Flagstaff. They wish to impress on the retail establishment that Flagstaff is a prosperous
city. They conduct a survey of Flagstaff homes to get an estimate of household income.
Suppose that it is known that the standard deviation for the population of household
incomes in Flagstaff is $5,000. The sample is conducted for n=100 Flagstaff homes and
gives a sample mean of $21,555. Find a 90% confidence interval for mean Flagstaff
household income. Should the CLT hold? Compute by hand and also use Excel.
Given: $\bar{X} = 21555$, $\sigma = 5000$, and since we know the population standard deviation we
can use Z. Because the distribution is continuous and $n > 30$ we will assume the CLT
holds.
$$\bar{X} \pm Z_{\alpha/2} \sigma_{\bar{x}}$$
$\bar{X} = 21555$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{5000}{\sqrt{100}} = \frac{5000}{10} = 500$$
$$Z_{\alpha/2} = Z_{0.05} = 1.65$$
$$\bar{X} \pm Z_{\alpha/2} \sigma_{\bar{x}} = 21555 \pm 1.65(500) = 21555 \pm 825$$
$$[20730, 22380]$$
X bar          21555
sigma          5000
n              100
sigma x bar    500            =5000/SQRT(100)
confidence     822.4265002    =CONFIDENCE(0.1,5000,100)
UCL            22377.4        =21555 + 822.4
LCL            20732.6        =21555 - 822.4
8.2 Work problem 8.1 assuming that the population standard deviation is not known, but
that the sample standard deviation is s=4,500. Assume that incomes are normally
distributed. Then solve using Excel.
Because the data come from a normal distribution the CLT holds and use of the t
distribution is valid as well. We have to use t because we don’t know the population
standard deviation.
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}}$$
$\bar{X} = 21555$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{4500}{\sqrt{100}} = \frac{4500}{10} = 450$$
$$t_{\alpha/2} = t_{0.05} = 1.671 \text{ (for 99 df)}$$
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}} = 21555 \pm 1.671(450) = 21555 \pm 752$$
$$[20803, 22307]$$
xbar      21555
n         100
s         4500
s xbar    450            = 4500/SQRT(100)
T         1.660391717    = TINV(0.1,99)
UCL       22302          = 21555 + 1.66*(450)
LCL       20808          = 21555 - 1.66*(450)
Note that TINV(alpha, df) takes the two-tailed probability (so that half of alpha is in each
tail). The hand calculation uses the table value 1.671, while Excel returns 1.660 for 99 df,
which accounts for the small difference between the two intervals.
8.3 The College of Business Administration is conducting an economic impact study of
the effect the University has on the Flagstaff community. One of the items of the study is
student spending in the area. Suppose that a sample of 400 students is taken to determine
their spending habits in the town (exclusive of rent). The average spending of the
students in the sample is $250 per month with a standard deviation of $60 per month.
Find a 99% confidence interval for the mean town spending of all NAU students. Assume
spending is normally distributed.
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}}$$
$\bar{X} = 250$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{60}{\sqrt{400}} = \frac{60}{20} = 3$$
$$t_{\alpha/2} = t_{0.005} = 2.617 \text{ (for 399 df)}$$
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}} = 250 \pm 2.617(3) = 250 \pm 7.85$$
$$[242.15, 257.85]$$
xbar      250
n         400
s         60
s xbar    3           = 60 / SQRT(400)
t         2.588204    = TINV(0.01,399)
UCL       257.764     = 250 + 2.588 * (3)
LCL       242.236     = 250 - 2.588 * (3)
8.4 A computer chip manufacturer is interested in the proportion of defective central
processor chips being produced. A sample of n=100 chips is taken and 8 are found to be
defective. Find a 95% confidence interval for the true proportion of defective chips being
produced during operations.
$X = 8$
$n = 100$
$$\hat{p} = \frac{X}{n} = \frac{8}{100} = 0.08, \quad \hat{q} = 1 - \hat{p} = 0.92$$
$n\hat{p} = 100(0.08) = 8 \ge 5$
$n\hat{q} = 100(0.92) = 92 \ge 5$, so the CLT holds
$$\sigma_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(0.08)(0.92)/100} = 0.0271$$
$\alpha = 0.05$ so $Z_{\alpha/2} = Z_{0.025} = 1.96$
$$\hat{p} \pm Z_{\alpha/2} \sigma_{\hat{p}} = 0.08 \pm (1.96)(0.0271) = 0.08 \pm 0.053$$
$$[0.027, 0.133]$$
8.5 The specifications for a certain assembly call for bolts with a pitch of 950 mm. A
new shipment of bolts for this assembly arrives and n=100 of them are taken for
inspection. This sample gives a mean of 960 mm and a standard deviation of 10 mm.
Find a 90% confidence interval for the mean pitch of the bolts in this shipment. Does it
seem likely that the bolts in the shipment meet the specifications of the assembly? Assume
that the bolt pitches are normally distributed.
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}}$$
$\bar{X} = 960$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10}{\sqrt{100}} = \frac{10}{10} = 1$$
$$t_{\alpha/2} = t_{0.05} = 1.671 \text{ (for 99 df)}$$
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}} = 960 \pm 1.671(1) = 960 \pm 1.671$$
$$[958.33, 961.67]$$
xbar      960
n         100
s         10
s xbar    1           = 10 / SQRT(100)
t         1.660392    = TINV(0.1,99)
UCL       961.66      = 960 + 1.66 * (1)
LCL       958.34      = 960 - 1.66 * (1)
Because the population is normally distributed, we can assume that the CLT holds
and that we can use the t distribution. Note that the CI does not include the specified
mean of 950 mm. It is highly unlikely that the process is producing with the desired mean.
8.6 A sample of n=20 items is taken from a normal distribution. The sample results are
$\bar{X} = 80$ and $s = 10$. Find a 90% confidence interval for $\mu$.
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}}$$
$\bar{X} = 80$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{10}{\sqrt{20}} = 2.236$$
$$t_{\alpha/2} = t_{0.05} = 1.729 \text{ (for 19 df)}$$
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}} = 80 \pm 1.729(2.236) = 80 \pm 3.866$$
$$[76.134, 83.866]$$
xbar      80
n         20
s         10
s xbar    2.236067977    = 10/SQRT(20)
T         1.729131327    = TINV(0.1,19)
UCL       83.866044      = 80 + 1.729*(2.236)
LCL       76.133956      = 80 - 1.729*(2.236)
The distribution the sample was taken from is normal, hence the CLT holds and the use
of the t distribution is permissible.
8.7 In examining the credit accounts of a department store, an auditor selected a random
sample of 10 accounts and found that the average account error was $-\$37.00$ with a
standard deviation of $\$15.00$ (these are sample results). Construct a $90\%$
confidence interval for the population mean. Assume that the accounting errors are
normally distributed.
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}}$$
$\bar{X} = -37$
$$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{15}{\sqrt{10}} = 4.743$$
$$t_{\alpha/2} = t_{0.05} = 1.833 \text{ (for 9 df)}$$
$$\bar{X} \pm t_{\alpha/2} s_{\bar{x}} = -37 \pm 1.833(4.743) = -37 \pm 8.69$$
$$[-45.69, -28.31]$$
xbar      -37
n         10
s         15
s xbar    4.74341649     = 15/SQRT(10)
T         1.833113856    = TINV(0.1,9)
UCL       -28.306081     = -37 + 1.833*4.743
LCL       -45.693919     = -37 - 1.833*4.743