chpapter12-appendixA.qxd
3/18/03
2:21 PM
Page 1
Chapter Twelve Appendix A
Chapter 12
Appendix A
12A.1 Comparing Two Population Variances
In this section we develop a procedure for testing hypotheses about the relative values
of the variances of two normally distributed populations. The following examples
illustrate where such a test is useful:
· Economists commonly use the variance of the rate of return on a stock as a measure of
the stock's volatility. A stock market analyst may be interested in testing whether two
stocks are equally volatile, or whether one of the stocks is more volatile than the
other.
· Many procedures for testing hypotheses about two or more populations assume that
the population variances are equal. Recall that in Chapter 11 of the text we made this
assumption in a t test used to investigate whether the means of two populations differ.
We make the same assumption in all the tests developed in Chapter 12.
Let us go back to the example we considered in Chapter 11 of the textbook regarding
the difference in the mean prices of houses on sale in East Vancouver and Oshawa.
The management of a multinational company conducted a test to check whether
there is evidence in support of the employees' claim that the mean price of houses
on sale in East Vancouver is more than $60 000 higher than the mean price of houses
on sale in Oshawa. They used the information that the two population distributions
are approximately normal. They also assumed that the variances of the two populations
are not equal. The management would now like to test whether there is sufficient
evidence (at the 0.1 significance level) that this assumption of unequal variances is
true. Let us conduct the test using the same sample data as in the last chapter,
reproduced below. Recall that the two samples were selected independently from the
two populations.
East Vancouver: 345 290 279 259 410 174 252 455 228 369
Oshawa: 219 122 200 134 179 204 129 132 174 142 136 159 168 170 227
Again, we shall follow the five-step procedure.
Step 1
Let the variances of the prices of houses on sale in East Vancouver and Oshawa be
$\sigma_1^2$ and $\sigma_2^2$. The null and alternative hypotheses are:
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$
These can be stated as:
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1 \qquad H_1: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$$
Step 2
We have selected the significance level, $\alpha = 0.1$.
Step 3
We have chosen independent random samples from the two populations. A good
choice of test statistic is then:
$$F = \frac{S_1^2}{S_2^2} \qquad \text{(12A-1)}$$
Here, $S_1^2$ and $S_2^2$ are the sample variances of the samples drawn from populations 1 and 2,
respectively.
Step 4
If the population distributions are normal and the null hypothesis is true, the test
statistic follows the F distribution with $(n_1 - 1, n_2 - 1)$ degrees of freedom.[1] Here,
$n_1$ is the size of the sample selected from population 1 (= 10) and $n_2$ is the size of the
sample selected from population 2 (= 15). We call this the F statistic and the
corresponding test an F test.
Since in our example the population distributions are known to be approximately
normal, the test statistic is approximately an F statistic, and our test is approximately
a two-tailed F test. The critical values are $F_{\alpha/2} = F_{0.05}$ and $F_{1-\alpha/2} = F_{0.95}$.
Our decision rule is: reject $H_0$ if the computed F value (the value of the test
statistic) is less than $F_{0.95}$ or greater than $F_{0.05}$.
From Appendix D at the back of the text, we have $F_{0.05}(df = (9, 14)) = 2.65$. Recall that,
through Formula 12-1 on text page 501,
$$F_{0.95}(df = (9, 14)) = \frac{1}{F_{0.05}(df = (14, 9))} \approx \frac{1}{3.01}$$
(Since the table does not give the value corresponding to degrees of freedom (14, 9),
we approximate it by the value corresponding to degrees of freedom (15, 9).)
Note that checking whether $S_1^2/S_2^2$ is less than $1/3.01$ is equivalent to checking whether
$S_2^2/S_1^2$ is greater than 3.01.
The decision rule can thus be stated as: reject $H_0$ if the computed value of $S_1^2/S_2^2$
is greater than 2.65 or if the computed value of $S_2^2/S_1^2$ is greater than 3.01.
[1] For details, refer to the Chapter 12 Appendix B file on this CD-ROM.
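The reciprocal relation between lower-tail and upper-tail critical values used in Step 4
can be checked in a few lines. This is a sketch added for illustration, not a reproduction
of Formula 12-1 as printed in the text. The symbols $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$,
and the notation $F_p(\nu_1, \nu_2)$ for the value with upper-tail area $p$, are assumptions
made here.

```latex
% Under H_0 with normal populations, F = S_1^2 / S_2^2 follows F(\nu_1, \nu_2),
% so its reciprocal S_2^2 / S_1^2 follows F(\nu_2, \nu_1). For any c > 0:
\begin{align*}
P\bigl(F < c\bigr) &= P\!\left(\frac{1}{F} > \frac{1}{c}\right) \\
P\bigl(F < c\bigr) = \frac{\alpha}{2} &\iff c = F_{1-\alpha/2}(\nu_1, \nu_2) \\
P\!\left(\frac{1}{F} > \frac{1}{c}\right) = \frac{\alpha}{2} &\iff \frac{1}{c} = F_{\alpha/2}(\nu_2, \nu_1) \\
\Longrightarrow\quad F_{1-\alpha/2}(\nu_1, \nu_2) &= \frac{1}{F_{\alpha/2}(\nu_2, \nu_1)} \\
F_{0.95}(9, 14) &= \frac{1}{F_{0.05}(14, 9)} \approx \frac{1}{3.01} \approx 0.332
\end{align*}
```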
Step 5
From the sample data, we get
$$\bar{x}_1 = \frac{345 + \cdots + 369}{10} = 306.1, \qquad \bar{x}_2 = \frac{219 + \cdots + 227}{15} = 166.3$$
$$s_1^2 = \frac{(345 - 306.1)^2 + \cdots + (369 - 306.1)^2}{9} = 7569.43$$
$$s_2^2 = \frac{(219 - 166.3)^2 + \cdots + (227 - 166.3)^2}{14} = 1170.81$$
The value $s_1^2/s_2^2 = 6.47$ is greater than 2.65. Hence, we reject the null hypothesis.
We have sufficient evidence, at the 0.1 significance level, to conclude that the variances of
the two populations are not equal.
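The arithmetic in Step 5 can be reproduced with a short script. This is a minimal
sketch using only Python's standard library; the names `east_vancouver`, `oshawa`, and
`variance_ratio` are chosen here for illustration and do not come from the text.

```python
from statistics import mean, variance

# Sample data from the example (East Vancouver: n1 = 10, Oshawa: n2 = 15).
east_vancouver = [345, 290, 279, 259, 410, 174, 252, 455, 228, 369]
oshawa = [219, 122, 200, 134, 179, 204, 129, 132, 174,
          142, 136, 159, 168, 170, 227]


def variance_ratio(sample_1, sample_2):
    """Return (s1^2, s2^2, F) with F = s1^2 / s2^2 as in Formula 12A-1.

    statistics.variance divides by n - 1, i.e. it is the sample variance.
    """
    s1_sq = variance(sample_1)
    s2_sq = variance(sample_2)
    return s1_sq, s2_sq, s1_sq / s2_sq


s1_sq, s2_sq, f_value = variance_ratio(east_vancouver, oshawa)
print(f"x1-bar = {mean(east_vancouver):.1f}, x2-bar = {mean(oshawa):.1f}")
print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, F = {f_value:.2f}")
```

Run as is, the script should reproduce the sample means, the sample variances, and the
F value reported in Step 5.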
The above example illustrates a two-tailed F test for equality of the variances of two
populations when samples are selected independently from two normally distributed
populations. We summarize below the decision rules for a one-tailed and a two-tailed
F test. We consider the null hypothesis to be of the form
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1 \quad \text{or} \quad H_0: \frac{\sigma_1^2}{\sigma_2^2} \leq 1$$
Therefore, the alternative hypothesis, which is the complement of the null hypothesis, is of the form
$$H_1: \frac{\sigma_1^2}{\sigma_2^2} \neq 1 \quad \text{or} \quad H_1: \frac{\sigma_1^2}{\sigma_2^2} > 1$$
These are, respectively, a two-tailed test and an upper-tailed test. If the null
hypothesis is of the form $H_0: \sigma_1^2/\sigma_2^2 \geq 1$, we interchange the labels 1 and 2 of the
populations to convert it to the form $H_0: \sigma_1^2/\sigma_2^2 \leq 1$. Our choice of test statistic is given
by Formula 12A-1, that is,

Test Statistic for Comparing Two Variances
$$F = \frac{S_1^2}{S_2^2}$$

Case: Upper-Tailed Test
Decision Rule: Reject $H_0$ if the computed F value is greater than $F_{\alpha}$ (degrees of
freedom $(n_1 - 1, n_2 - 1)$).

Case: Two-Tailed Test
Decision Rule: Reject $H_0$ if the computed F value is greater than $F_{\alpha/2}$ (degrees of
freedom $(n_1 - 1, n_2 - 1)$), or if 1/(the computed F value) is greater
than $F_{\alpha/2}$ (degrees of freedom $(n_2 - 1, n_1 - 1)$).
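The two decision rules can be expressed as one small function. The sketch below is an
illustration rather than anything given in the text: the name `f_test_decision` and its
parameters are hypothetical, and the critical values are assumed to be looked up by the
reader (for example, in Appendix D) and passed in.

```python
def f_test_decision(s1_sq, s2_sq, crit_upper, crit_reciprocal=None):
    """Apply the decision rule for comparing two variances.

    s1_sq, s2_sq     -- sample variances of samples 1 and 2
    crit_upper       -- F_alpha (upper-tailed test) or F_{alpha/2} (two-tailed
                        test), with degrees of freedom (n1 - 1, n2 - 1)
    crit_reciprocal  -- F_{alpha/2} with degrees of freedom (n2 - 1, n1 - 1);
                        leave as None for an upper-tailed test
    Returns (F, reject_H0).
    """
    f_value = s1_sq / s2_sq
    reject = f_value > crit_upper
    if crit_reciprocal is not None:
        # Two-tailed test: also reject when 1/F is too large.
        reject = reject or (1.0 / f_value > crit_reciprocal)
    return f_value, reject


# House-price example: two-tailed test with table values 2.65 and 3.01.
f_value, reject = f_test_decision(7569.43, 1170.81,
                                  crit_upper=2.65, crit_reciprocal=3.01)
print(f"F = {f_value:.2f}, reject H0: {reject}")
```

Leaving `crit_reciprocal` as `None` gives the upper-tailed rule, so the same function
covers both rows of the summary.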
Appendix D in the text considers only the 0.05 and 0.01 significance levels.
For any other level of significance, the result can be obtained using
Minitab or Excel. We give the instructions for both below, together with their outputs
for the above example.
MINITAB INSTRUCTIONS
1. Input the first sample data in column C1 and the second in column C2.
2. Click Stat, Basic Statistics, 2 Variances ...
3. Choose Samples in different columns, input First = C1, and Second = C2. Click OK.

EXCEL INSTRUCTIONS
1. Input sample data 1 in column A and sample data 2 in column B.
2. Click Tools, Data Analysis ..., F-test Two-Sample for Variances, OK.
3. Input Variable 1 Range = range of sample data 1 in column A, Variable 2 Range =
range of sample data 2 in column B, and Alpha = 1/2 × (level of significance) for a
two-tailed test or Alpha = (level of significance) for a one-tailed test. Click OK.

The Minitab output gives the result of the F test, together with the result of another
test called Levene's test, which we shall not discuss. It also provides box plots for the two
samples. We give only the part of the Minitab output that we are interested in here.
As we see, the value of the test statistic in the Minitab output (6.465)
equals the value we computed above. Also, the Minitab output gives only the P-value
for a two-tailed test, which in this case is 0.002. Since the selected level of significance
(= 0.1) is greater than the P-value, the decision based on the P-value is the same as
above. (That is: reject H0 in favour of H1.)
The computed F value (= 6.46512...) in the Excel output is the same (up to rounding) as the one we obtained before.
Excel gives output only for an upper-tailed F test. Hence,
· In the case of a two-tailed test: input a value of Alpha equal to 1/2 × (significance
level). The final P-value is the smaller of 2 P(F <= f) and 2 (1 - P(F <= f)).
· In the case of an upper-tailed test: input Alpha = value of the significance level. In this
case, the true P-value is as given in the Excel output.
The F Critical value given in the Excel output applies only to an upper-tailed test.
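For readers without Minitab or Excel, the P-values discussed above can be approximated
in Python. The sketch below is not from the text. The function names `f_upper_tail` and
`two_tailed_p_value` are made up for illustration, and the upper-tail area is obtained by
integrating the equivalent beta density with Simpson's rule, which is assumed here to be
adequate only when both degrees of freedom exceed 2. Where SciPy is installed,
`scipy.stats.f.sf` is the more usual tool.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F distribution with (df1, df2) df.

    Uses the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1*f),
    where I_x is the regularized incomplete beta function, and evaluates the
    integral numerically with Simpson's rule.
    """
    if f_value <= 0 or df1 <= 2 or df2 <= 2:
        raise ValueError("this sketch assumes f_value > 0 and df1, df2 > 2")
    steps += steps % 2                      # Simpson's rule needs an even count
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)         # upper integration limit, 0 < x < 1
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0                      # beta density vanishes at 0 when a > 1
        return math.exp((a - 1) * math.log(t)
                        + (b - 1) * math.log1p(-t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0


def two_tailed_p_value(f_value, df1, df2):
    """Smaller of 2*P(F <= f) and 2*(1 - P(F <= f)), as described above."""
    upper = f_upper_tail(f_value, df1, df2)
    return 2.0 * min(upper, 1.0 - upper)


# House-price example: F = 6.465 with (9, 14) degrees of freedom.
p_value = two_tailed_p_value(6.465, 9, 14)
print(f"two-tailed P-value = {p_value:.3f}")
```

For the house-price example the result should come out close to the two-tailed P-value
of 0.002 reported in the Minitab output. For an upper-tailed test, `f_upper_tail` alone
gives the P-value.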
Self-Review 12A-1
Rajesh Nagy and Debbie Richmond work as quality control inspectors at Steele Electric
Products, Inc., which assembles electrical components for stereo equipment. For a random sample of 10 days, Rajesh averaged 9 rejects per day with a sample standard deviation of 2 rejects per day. For an independent random sample of 10 days, Debbie
averaged 8.5 rejects with a sample standard deviation of 1.5 rejects. Assuming that the
population distributions are approximately normal, can we conclude, at the 0.05 significance level, that the variance of the number of rejects per day attributed to Rajesh
is larger?
EXERCISES 12A-1 TO 12A-8
In each of the following, assume that the population distributions are approximately
normal.
12A-1. We wish to conduct a two-tailed F test, for equality of variances of two
populations, using a 0.10 significance level. What are the critical F values
for a sample of six observations on population 1 and an independent sample
of four observations on population 2?
12A-2. We wish to conduct an upper-tailed F test, for a comparison of variances of
two populations, using a 0.01 significance level. What is the critical F value
for a sample of four observations on population 1 and an independent sample
of seven observations on population 2?
12A-3. A random sample of eight observations from a population resulted in a sample
standard deviation of 10. A random sample of six observations from a second
population resulted in a sample standard deviation of 7. Can we conclude,
at the 0.02 significance level, that there is a difference in the variances of the
two populations?
12A-4. A random sample of five observations from a population resulted in a sample
standard deviation of 12. A random sample of seven observations from a second
population showed a sample standard deviation of 7. Can we conclude, at the
0.01 significance level, that the variance of the first population is greater?
12A-5. Stargell Research Associates conducted a study of the radio listening habits
of men and women. One facet of the study involved the listening time.
The mean listening time for a sample of 10 men was found to be 35 minutes
per day. The sample standard deviation was 10 minutes per day. The mean
listening time for the 12 women studied was also 35 minutes per day,
but the sample standard deviation was 12 minutes per day. Can we conclude,
at the 0.10 significance level, that there is a difference in the population
variances of the listening times for men and women?
12A-6. A stockbroker at Critical Securities reported that the mean rate of return on
a sample of 10 oil stocks was 12.6 percent with a sample standard deviation
of 3.9 percent. The mean rate of return on a sample of 8 utility stocks was
10.9 percent with a sample standard deviation of 3.5 percent. Can we
conclude, at the 0.05 significance level, that the population variance of rates
of return of oil stocks is greater?
12A-7. A real estate agent in the coastal area of Nova Scotia wants to compare the
variation in the selling prices of homes on the oceanfront with those one to
three blocks from the ocean. A sample of 21 oceanfront homes sold within
the last year, revealed that the sample standard deviation of the selling prices
was $45 600. A sample of 18 homes, also sold within the last year, that were
one to three blocks from the ocean revealed that the sample standard
deviation was $21 330. At the 0.01 significance level, can we conclude that
the population variance of the selling prices is larger for the oceanfront
homes?
12A-8. A computer manufacturer is about to unveil a new, faster personal computer.
The new machine clearly is faster, but initial tests indicate there is more
variation in the processing time. The processing time depends on
the particular program being run, the amount of input data, and the amount
of output. Random, independent samples of 16 computer runs each, covering
a range of production jobs, showed that the standard deviation of
the processing times was 22 (hundredths of a second) for the new machine
and 12 (hundredths of a second) for the current machine. At the 0.05
significance level can we conclude that the population variance of the
processing times is greater for the new machine?
Chapter 12 Appendix A
Answer to Self-Review
12A-1. Let Rajesh's assemblies be population 1. Then
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} \leq 1; \qquad H_1: \frac{\sigma_1^2}{\sigma_2^2} > 1$$
The rejection region is F > 3.18.
$$F = \frac{2.0^2}{1.5^2} = 1.78$$
Since 1.78 < 3.18, $H_0$ is not rejected. There is
insufficient evidence, at $\alpha = 0.05$, to conclude
that the variance of the number of rejects
attributed to Rajesh is greater.
Answers to Odd-Numbered Appendix Exercises
12A-1. 9.01 and 5.41. Reject $H_0$ if $F = s_1^2/s_2^2 > 9.01$ or if $1/F > 5.41$.
12A-3. Decision rule: reject $H_0$ if $F > 10.5$ or if $1/F > 7.46$; F = 2.04; do not reject $H_0$.
12A-5. $H_0: \sigma_1^2/\sigma_2^2 = 1$; $H_1: \sigma_1^2/\sigma_2^2 \neq 1$; decision rule: reject
$H_0$ if $F > 2.90$ or if $1/F > 3.1$; F = 0.694; do not reject $H_0$.
12A-7. $H_0: \sigma_1^2/\sigma_2^2 \leq 1$; $H_1: \sigma_1^2/\sigma_2^2 > 1$; decision rule: reject
$H_0$ if $F > 3.16$. F = 4.57; reject $H_0$.