D = sum of |O − E|
D is the total of the absolute deviations of the observed from the estimated
frequencies and is the sum of the |O − E| entries in Table 7.2.
We now have a way to help us find out whether a die is fair. We use
the six-step hypothesis testing method to repeatedly compute tables like
Table 7.2 using a fair die (the null hypothesis model) and thus repeatedly
find the value of D. If D = 24 (from the actual Key Problem data) is not
unusually large compared with the typical values obtained for a fair die, this
is an indication that the die is quite possibly fair. But if D = 24 is unusually
large, this suggests that the die is not fair and that we should reject its null
hypothesis of a fair die. We leave it to the next section to help us know what
a “small” or a “large” value of D is with respect to providing strong or weak
evidence of an unfair die—that is, with respect to providing strong or weak
evidence that the null hypothesis should be rejected.
7.2 HOW BIG A DIFFERENCE MAKES A DIFFERENCE?
We were left in Section 7.1 with the question of how to decide if a calculated
value of the statistic D is “small” or “large.” We need a way of
deciding whether an observed value of the D statistic is so large that it
would happen only seldom by chance for rolls of a fair die. In other words,
we seek to estimate p(D ≥ 24) for a fair die.
Suppose we have a six-sided die that we know is fair (for example, the
manufacturer has carefully constructed it to be a fair die). We roll it 60 times
and obtain the results shown in Table 7.3. Let’s look at the sum of |O − E|,
which is 20. Over the 60 rolls of this fair die, the observed frequencies
differed from the expected frequencies by a total of 20.
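As a check on the arithmetic, the D value for the fair-die rolls of Table 7.3 can be recomputed in a few lines of Python. This is only a sketch: the function name d_statistic is our own choice, and it assumes every outcome is equally likely, so that each expected frequency is the number of rolls divided by the number of faces.

```python
def d_statistic(observed):
    """Return D, the sum of |O - E|, assuming equally likely outcomes."""
    expected = sum(observed) / len(observed)   # 60 / 6 = 10 for Table 7.3
    return sum(abs(o - expected) for o in observed)


# Obtained frequencies for outcomes 1 through 6, copied from Table 7.3.
table_7_3 = [4, 6, 11, 10, 15, 14]

print(d_statistic(table_7_3))   # 20.0
```

The same function applies to any table of observed frequencies whose null model gives every outcome the same expected count.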
Table 7.3
Results of 60 Rolls of a Six-Sided Die Known to Be Fair

Outcome   Obtained frequency (O)   Expected frequency (E)   O − E   |O − E|
1                   4                        10               −6        6
2                   6                        10               −4        4
3                  11                        10                1        1
4                  10                        10                0        0
5                  15                        10                5        5
6                  14                        10                4        4
Total              60                        60                0       20

We will repeat these 60 rolls of the fair die many times, and each time
we will calculate a value of D. We will have as a result a frequency table
of D values obtainable from 60 rolls of a fair die, and from this frequency
table we will be able to determine the experimental probability of observing
a D value of 24 or greater in the case of a fair die. If a D value of 24 is
unusually high for a fair die, we would conclude that the die of Table 7.2,
which yielded a D value of 24, is not fair. In summary, we would like to find
P(D ≥ 24) for 60 rolls of a fair six-sided die, thus estimating the theoretical
probability p(D ≥ 24).
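The repeated-rolls idea just described can also be carried out by computer instead of by hand. The sketch below is one possible way to do it, not the text's own procedure: the function names, the choice of 1,000 trials, and the random seed are all assumptions made for illustration. Each trial rolls a simulated fair die 60 times, computes D, and adds it to a frequency table; the experimental probability P(D ≥ 24) is then the fraction of trials with D of 24 or more.

```python
import random
from collections import Counter


def d_statistic(observed):
    """Sum of |O - E| when every outcome is equally likely."""
    expected = sum(observed) / len(observed)
    return sum(abs(o - expected) for o in observed)


def one_trial(rng, n_rolls=60, sides=6):
    """Roll a fair die n_rolls times and return D for that trial."""
    counts = [0] * sides
    for _ in range(n_rolls):
        counts[rng.randrange(sides)] += 1   # each face equally likely
    return d_statistic(counts)


def simulate(n_trials, seed):
    """Frequency table of D values over n_trials independent trials."""
    rng = random.Random(seed)
    return Counter(one_trial(rng) for _ in range(n_trials))


# Assumed settings: 1,000 trials and seed 0 are arbitrary choices.
freq = simulate(n_trials=1000, seed=0)
p_hat = sum(f for d, f in freq.items() if d >= 24) / sum(freq.values())

for d in sorted(freq):
    print(f"D = {d:4.0f}   f = {freq[d]}")
print("experimental P(D >= 24) =", p_hat)
```

Changing the seed or the number of trials changes the estimate somewhat; more trials give an experimental probability that can be trusted more.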
We will take the six-step hypothesis testing approach of Chapter 6:
1. Choice of a Model: We take a six-sided die that we know to be fair. (A
box model using 1,2,3,4,5,6 is possible too.)
2. Definition of a Trial: We roll the die 60 times, and we record the
outcomes of the trial in a table like Table 7.3 (or sample with replacement
from a box model).
3. Definition of a Successful Trial: We calculate D for each trial. Count
as successful a trial in which D ≥ 24.
4. Repetition of Trials: We do a moderately large number of trials, say,
30 (100 would be better). The results of 30 trials are presented in Table 7.4.
5. Finding the Probability of a Successful Trial: We estimate the theoretical
probability of a successful trial (that is, of getting a D that is 24 or
greater) using the results of our experiment. According to Table 7.4, the
largest value of D obtained in the 30 trials was 20. In our 30 trials, then, we
did not get a value of D that was 24 or larger. Therefore,
P(D ≥ 24) = 0
6. Decision: We found that, on the basis of 30 trials, the experimental
probability of D being greater than or equal to 24 in 60 rolls of a fair die is
zero. Recall from Chapter 6 that the convention is to consider as unusual
any event whose probability is 0.05 or less. Since a probability of zero is
less than our criterion of 0.05 for an unusual event, we conclude that it is
unlikely to get a D that is greater than or equal to 24 if the die is fair. So
we conclude that the die of Table 7.2 is not fair. That is, we reject the null
hypothesis model of a fair die. Of course, if we are to be able to trust our
step 5 experimental probability estimate and hence our decision, we should
have done more trials of the 60-roll die experiment, say, 100 trials.

Table 7.4
Frequency Table for D Statistic

D        f
4        1
6        2
8        3
10       5
12       5
14       6
16       3
18       3
20       2
Total   30
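The arithmetic of steps 5 and 6 can be checked directly from the frequencies in Table 7.4. The following is a minimal sketch; the function name experimental_probability and the variable names are our own, while the 0.05 criterion and the observed value D = 24 come from the text.

```python
# D values and their frequencies, copied from Table 7.4 (30 trials).
table_7_4 = {4: 1, 6: 2, 8: 3, 10: 5, 12: 5, 14: 6, 16: 3, 18: 3, 20: 2}


def experimental_probability(freq_table, threshold):
    """Fraction of trials whose D is at least the threshold."""
    total = sum(freq_table.values())
    successes = sum(f for d, f in freq_table.items() if d >= threshold)
    return successes / total


p_hat = experimental_probability(table_7_4, 24)   # step 5
unusual = p_hat <= 0.05                           # step 6 criterion

print(p_hat, unusual)   # 0.0 True
```

With only 30 trials the estimate of zero is coarse, which is why the text recommends more trials before trusting the decision.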
SECTION 7.2 EXERCISES
Answer these questions using the D statistic and
the required table of trials.
1. Here are the results of rolling a six-sided die
60 times. Calculate D.
Outcome   f
1         4
2        17
3        14
4         6
5        18
6         1
Total    60
Using Table 7.4 or creating a new table by
using the six-step hypothesis-testing method
and doing many trials, decide whether the die
is fair.
2. Suppose we roll a six-sided die that we assume is fair. How many times would we
expect each side to occur if we roll the die
a. 150 times?
b. 300 times?
c. 600 times?
3. Nancy and Pete go through one page of a
telephone book and write down the last digit
of 50 telephone numbers. Here are their data:
Digit:   0   1   2   3   4   5   6   7   8   9
f:       1   6   3   2   5   8   2  10   8   5
Prepare a table of obtained and expected outcomes like Table 7.3, and find the value of
D. The following table gives the results of
30 simulations of 50 random digits and their
associated D’s.
D     f
4     1
6     0
8     2
10    1
12    2
14    7
16    7
18    2
20    4
22    3
24    1
Do you think the telephone book was a good
source of random numbers? Explain.
4. A breakfast cereal company features a special
offer by including one of four differently colored ballpoint pens in each box. In a shopping
trip that resulted in 20 boxes of cereal, the
following numbers of pens were obtained. Do
you think that the company is distributing the
pens in equal numbers of colors, or are some
colors more likely to be obtained than others?
What is the value of D?
Color     f
Blue      8
Yellow    4
Red       3
White     5
Total    20
5. Explain why large values of D suggest that a
die may not be fair.