D = sum of |O - E|

D is the total of the absolute deviations of the observed from the expected frequencies and is the sum of the |O - E| entries in Table 7.2.

We now have a way to help us find out whether a die is fair. We use the six-step hypothesis testing method to repeatedly compute tables like Table 7.2 using a fair die (the null hypothesis model) and thus repeatedly find the value of D. If D = 24 (from the actual Key Problem data) is not unusually large compared with the typical values obtained for a fair die, this is an indication that the die is quite possibly fair. But if D = 24 is unusually large, this suggests that the die is not fair and that we should reject the null hypothesis of a fair die.

We leave it to the next section to help us know what a "small" or a "large" value of D is with respect to providing strong or weak evidence of an unfair die, that is, strong or weak evidence that the null hypothesis should be rejected.

7.2 HOW BIG A DIFFERENCE MAKES A DIFFERENCE?

We were left in Section 7.1 with the question of how to decide if a calculated value of the statistic D is "small" or "large." We need a way of deciding whether an observed value of the D statistic is so large that it would happen only seldom by chance for rolls of a fair die. That is, we seek to estimate p(D ≥ 24) for a fair die.

Suppose we have a six-sided die that we know is fair (for example, the manufacturer has carefully constructed it to be a fair die). We roll it 60 times and obtain the results shown in Table 7.3. Let's look at the sum of |O - E|, which is 20. Over the 60 rolls of this fair die, the observed frequencies differed from the expected frequencies by a total of 20.

We will repeat these 60 rolls of the fair die many times, and each time we will calculate a value of D.
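As a small illustration of the arithmetic behind D, here is a minimal Python sketch that computes the sum of |O - E| for the counts reported in Table 7.3. The function name d_statistic and the variable names are hypothetical, chosen only for this example; the expected frequencies are taken to be equal for every face, as they are for a fair die.

```python
def d_statistic(observed, expected):
    """Return D, the sum of |O - E| over all outcomes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum(abs(o - e) for o, e in zip(observed, expected))


# Observed counts for faces 1..6 from Table 7.3 (60 rolls of a fair die).
observed = [4, 6, 11, 10, 15, 14]

# Assumption: a fair die, so every face is expected equally often (10 each).
expected = [sum(observed) / len(observed)] * len(observed)

print(d_statistic(observed, expected))  # prints 20.0
```

The result agrees with the total of 20 given in the text for Table 7.3.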
We will have as a result a frequency table of D values obtainable from 60 rolls of a fair die, and from this frequency table we will be able to determine the experimental probability of observing a D value of 24 or greater in the case of a fair die. If a D value of 24 is unusually high for a fair die, we would conclude that the die of Table 7.2, which yielded a D value of 24, is not fair.

Table 7.3  Results of 60 Rolls of a Six-Sided Die Known to Be Fair

  Outcome   Obtained frequency (O)   Expected frequency (E)   O - E   |O - E|
  1                  4                        10                -6        6
  2                  6                        10                -4        4
  3                 11                        10                 1        1
  4                 10                        10                 0        0
  5                 15                        10                 5        5
  6                 14                        10                 4        4
  Total             60                        60                 0       20

In summary, we would like to find P(D ≥ 24) for 60 rolls of a fair six-sided die, thus estimating the theoretical probability p(D ≥ 24). We will take the six-step hypothesis testing approach of Chapter 6:

1. Choice of a Model: We take a six-sided die that we know to be fair. (A box model using 1, 2, 3, 4, 5, 6 is possible too.)

2. Definition of a Trial: We roll the die 60 times, and we record the outcomes of the trial in a table like Table 7.3 (or sample with replacement from a box model).

3. Definition of a Successful Trial: We calculate D for each trial. Count as successful a trial in which D ≥ 24.

4. Repetition of Trials: We do a moderately large number of trials, say, 30 (100 would be better). The results of 30 trials are presented in Table 7.4.

5. Finding the Probability of a Successful Trial: We estimate the theoretical probability of a successful trial (that is, of getting a D that is 24 or greater) using the results of our experiment. According to Table 7.4, the largest value of D obtained in the 30 trials was 20. In our 30 trials, then, we did not get a value of D that was 24 or larger. Therefore, P(D ≥ 24) = 0.

6. Decision: We found that, on the basis of 30 trials, the experimental probability of D being greater than or equal to 24 in 60 rolls of a fair die is zero.
Recall from Chapter 6 that the convention is to consider as unusual any event whose probability is 0.05 or less. Since a probability of zero is less than our criterion of 0.05 for an unusual event, we conclude that it is unlikely to get a D that is greater than or equal to 24, if the die is fair. So we conclude that the die of Table 7.2 is not fair. That is, we reject the null hypothesis model of a fair die.

Table 7.4  Frequency Table for D Statistic

  D        f
  4        1
  6        2
  8        3
  10       5
  12       5
  14       6
  16       3
  18       3
  20       2
  Total   30

Of course, if we are to be able to trust our step 5 experimental probability estimate and hence our decision, we should have done more trials of the 60-roll die experiment, say, 100 trials.

SECTION 7.2 EXERCISES

Answer these questions using the D statistic and the required table of trials.

1. Here are the results of rolling a six-sided die 60 times. Calculate D.

  Outcome   f
  1         4
  2        17
  3        14
  4         6
  5        18
  6         1
  Total    60

Using Table 7.4 or creating a new table by using the six-step hypothesis testing method and doing many trials, decide whether the die is fair.

2. Suppose we roll a six-sided die that we assume is fair. How many times would we expect each side to occur if we roll the die
   a. 150 times?
   b. 300 times?
   c. 600 times?

3. Nancy and Pete go through one page of a telephone book and write down the last digit of 50 telephone numbers. Here are their data:

  Digit:   0   1   2   3   4   5   6   7   8   9
  f:       1   6   3   2   5   8   2  10   8   5

Prepare a table of obtained and expected outcomes like Table 7.3, and find the value of D. The following table gives the results of 30 simulations of 50 random digits and their associated D's.

  D:   4   6   8  10  12  14  16  18  20  22  24
  f:   1   0   2   1   2   7   7   2   4   3   1

Do you think the telephone book was a good source of random numbers? Explain.

4. A breakfast cereal company features a special offer by including one of four differently colored ballpoint pens in a box. In a shopping trip that resulted in 20 boxes of cereal, the following numbers of pens were obtained.
  Color     f
  Blue      8
  Yellow    4
  Red       3
  White     5
  Total    20

Do you think that the company is distributing the pens in equal numbers of colors, or are some colors more likely to be obtained than others? What is the value of D?

5. Explain why large values of D suggest that a die may not be fair.
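For readers who would rather build a new table of trials by computer than by hand, the six-step method of Section 7.2 can be sketched as a simulation. This is a minimal sketch, not the procedure used to produce Table 7.4. It assumes that Python's pseudorandom number generator is an acceptable stand-in for rolling a fair die (or for drawing with replacement from a box model). The function names, the default of 1000 trials, and the seed are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def d_statistic(observed, expected):
    """Return D, the sum of |O - E| over all outcomes."""
    return sum(abs(o - e) for o, e in zip(observed, expected))


def simulate_d(rolls=60, faces=6, rng=random):
    """Steps 1-3: one trial. Roll a fair die `rolls` times and return D."""
    tallies = Counter(rng.randint(1, faces) for _ in range(rolls))
    observed = [tallies[face] for face in range(1, faces + 1)]
    expected = [rolls / faces] * faces  # fair model: equal expected counts
    return d_statistic(observed, expected)


def estimate_p(threshold, trials=1000, rolls=60, faces=6, seed=None):
    """Steps 4-5: repeat the trial and estimate P(D >= threshold).

    Returns the experimental probability and a frequency table of D values.
    """
    rng = random.Random(seed)
    d_values = [simulate_d(rolls, faces, rng) for _ in range(trials)]
    successes = sum(1 for d in d_values if d >= threshold)
    return successes / trials, Counter(d_values)


# Assumed settings: 1000 trials of 60 rolls, testing the observed D = 24.
p, table = estimate_p(threshold=24, trials=1000, seed=0)
for d in sorted(table):
    print(f"D = {d:5.1f}   f = {table[d]}")
print("Experimental P(D >= 24) =", p)

# Step 6: compare with the 0.05 convention for an unusual event.
print("Unusual for a fair die" if p <= 0.05 else "Not unusual for a fair die")
```

The same sketch can be pointed at the other exercises by changing the arguments, for example rolls=50 and faces=10 for the telephone digits, or rolls=20 and faces=4 for the pens. The face labels 1 to faces simply stand in for the digits or colors.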