sample—the distribution of the number of heads in 10 fair coin tosses, for example. An example of such a distribution is that of chi-square, discussed in Chapter 7. One benefit of the chi-square distribution is that a probability involving a chi-square statistic (a statistic computed from a random sample) can be approximated by the area under a particular smooth curve, called a chi-square density. We can therefore use a table of probabilities for this chi-square distribution to bypass the six-step method when doing a chi-square test, obtaining from a chi-square table the probability (under the null hypothesis) of a chi-square statistic being larger than the observed chi-square value. Such a probability is determined by computing the area under the density lying to the right of the specified value.

The chi-square density is a continuous distribution (recall Figure 7.4), whereas the two distributions specified above are discrete. Both types are very important in statistics. In summary, by the distribution of a statistic (also called a random variable) we mean the set of the theoretical probabilities of its possible values in the discrete case, and a density in the continuous case.

In Section 6.1 we discussed the concepts of sample and population. As discussed above, the population distribution gives the theoretical probability of each possible value of the statistic resulting from randomly sampling once from a population. When the population is a large real physical population, such as the heights of adult males, we often use a continuous distribution whose density has a shape approximately that of the relative frequency histogram (also called the probability histogram) of the population. For example, the probability law for sampling one adult male’s height is given by the famous bell-shaped curve (called a normal density), because the relative frequency histogram of all the population’s heights is bell shaped.
The population distribution is selected to approximate the relative frequency histogram of the population. We will first discuss the discrete distribution associated with yes/no or success/failure situations, such as coin tossing or medical drug testing.

8.1 THE BINOMIAL DISTRIBUTION: SUCCESS COUNTS

The simplest random experiments focus on only two possible results. Buy a lottery ticket: you either win or lose. Flip a coin: it comes up heads or tails. Randomly choose a person from the United States: does the person have confidence in the public schools? Yes or no. Deal a 5-card hand from a typical 52-card deck: either all the cards are from the same suit (a flush), or at least two suits appear. A person is given a flu shot: the person either does or does not contract the flu. A couple has a baby: it is either a girl or a boy. On a given day in a given city, it either snows or does not snow.

The statistic in such an experiment is either 0 or 1. One has to decide which result is assigned 1 and which is assigned 0. For example, if your lottery ticket wins, you record 1, but if you lose, you record 0. Often the outcome associated with the 1 is called a “success,” especially if the context suggests that one outcome is good and the other is bad.

Suppose you buy one lottery ticket every day for a week. Each ticket has a 25% chance of winning (the amount won is usually fairly small). The population distribution is p(win) = p(1) = 1/4 and p(lose) = p(0) = 3/4. What might the data look like? Here is a typical week:

Day        Win or lose   Statistic
Monday     Win           1
Tuesday    Lose          0
Wednesday  Lose          0
Thursday   Lose          0
Friday     Lose          0
Saturday   Win           1
Sunday     Lose          0

Monday and Saturday happened to have winners. The other days lost. Instead of looking at the above table as representing seven small experiments, consider it to represent one larger experiment that consists of buying seven lottery tickets and counting the number of winners.
(How would you build and sample from a box model to simulate seven purchases of a lottery ticket with a 25% chance of winning for each purchased ticket?) The statistic of interest is the number of winners (you do not care on which days you won, just on how many days you won). Notice that the statistic for the number of winners for the week is just the sum of the statistics for the seven days:

1 + 0 + 0 + 0 + 0 + 1 + 0 = 2

That is, adding up the 1s counts the number of successes in the larger trial or experiment. For simplicity we will call the random purchase of each ticket a trial rather than a subtrial.

What is the chance of having no winning ticket in a week? Of having three winners? Or of having all seven winners? That is, what is the distribution of this sum? In other words, what is the distribution of the statistic of interest computed by summing the sample of 7? Recall that the term distribution simply means the specification of the probabilities for the various possible values of the statistic of interest, in this case 0, 1, . . . , 7. The answer is the binomial distribution. Bi means two, nom means name. The idea is that each day there are two possible results: win or lose.

For a random experiment in the real world to yield a statistic with a binomial distribution, several conditions have to be satisfied:

1. There is a fixed number of individual trials.
2. Each trial yields a statistic of either 0 or 1.
3. The chance of an individual trial yielding a 1 is the same for each trial.
4. The trials are independent. That is, the 1s and 0s that occurred on prior trials have no influence on whether the current trial produces a 1 or a 0. (It may be helpful to review the discussion on independence in Section 4.3.)

If the four conditions are satisfied, the sum of the individual statistics (the number of successes) has a binomial distribution. Later in the chapter we will learn how to compute binomial probabilities.
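One way to answer the box-model question above is to sketch it in Python. The sketch below is our own illustration (the name simulate_week and the use of random.choice are ours, not the text's): the box holds one 1 (win) and three 0s (lose), so each draw with replacement has a 1/4 chance of success, and summing seven draws gives the week's number of winners.

```python
import random

# Box model for one ticket: one 1 (win) and three 0s (lose),
# so each draw from the box has a 1/4 chance of being a win.
box = [1, 0, 0, 0]

def simulate_week():
    """Draw from the box seven times with replacement; return the number of wins."""
    return sum(random.choice(box) for _ in range(7))

print(simulate_week())  # an integer between 0 and 7
```

Running simulate_week repeatedly is exactly the repeated sampling the chapter discusses below.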
We refer to an experiment satisfying the four conditions as a binomial experiment. The lottery example appears to satisfy the conditions very well. Condition 1: there are exactly seven trials planned. Condition 2: each trial is either a winner, in which case the statistic is 1, or a loser, in which case the statistic is 0. Condition 3: the chance of winning (getting a 1) on any one day is 25%. Condition 4, independence, supposes that whether you win one day does not have any effect on whether you win on any of the other days. Theoretically that should be true.

Flipping a coin 25 times and counting the number of heads is also well modeled by a binomial distribution. (Ask yourself why the four required conditions hold.) Some other examples follow.

Polling: The Gallup Organization conducted a survey during May 28–29, 1996, to assess the public’s confidence in various institutions. It interviewed 1019 adults. These people were chosen randomly in such a way that everyone in the population had an equal chance of being chosen. For each person, a 1 was recorded if the person expressed confidence in the public schools, and otherwise a 0 was recorded. (More specifically, a 1 means the person had a “great deal” or “quite a lot” of confidence, while a 0 means the person had only “some” confidence, “very little” confidence, or “no” confidence or did not know.) The number of 1s was 387; that is, 387 of the 1019 people had confidence in the public schools. That is about 38%.

Think about the four conditions for a binomial experiment. First, the number of individual trials is 1019, the number of people interviewed. Did the Gallup Organization fix that number in advance, or actually interview a random number of people? Probably the number was not exactly fixed, but for practical purposes it can probably be considered so.
Its polls often have around 1000 respondents, but because of the vagaries of finding people at home, having working telephones, and so on, the actual number randomly fluctuates a bit, depending on how lucky the polling organization is in reaching the people it intends to sample. Second, each trial indeed yields a 1 or a 0.

Third, what is the chance that a given chosen person has confidence in the public schools? Here we have to be careful about where the randomness is. Imagine the entire population of adults in the United States. Some have confidence in the public schools; some do not. For argument, suppose 40% do have confidence. Then the chance that a randomly chosen person is one of those with confidence in the public schools is 40%, or 2/5. This chance is exactly the same for the first person chosen, the second chosen, and so on, so condition 3 is satisfied. This last statement is not intuitive for many students. It becomes clearer if one realizes that the 2/5 chance of the second person having confidence in the public schools is figured without knowledge of the first person’s response. To make this clear, consider a simple example of two people buying milk in the case in which one of the five bottles in the grocery cooler is spoiled. The chance that the first person buys the spoiled milk is of course 1/5. But the chance for the second person is also 1/5, because the result from the first person is not known. For example, do you care whether you are the first or the second purchaser if as the second purchaser you have no knowledge about the first person’s purchase?

Were people chosen independently? In a practical sense, yes. Thus the resulting 1s and 0s of the 1019 trials will be independent as required. There could be a slight problem of dependency between people selected because a person chosen once cannot be chosen again, which is to say that the population being drawn from changes slightly from trial to trial.
For example, if a lot of 1s have occurred, then the chance of getting a 1 in the current trial with so many people with 1s already selected may be less. But the population of the United States is so much larger than 1019 that the chance of getting a 1 on the current trial is for all practical purposes uninfluenced by the results of previous trials. For all practical purposes, then, independence holds. (Reviewing Section 5.7 may be useful here.) The result of this poll does appear to be reasonably well modeled by a binomial distribution, although there are a few subtleties to suggest caution about our conclusion.

Drawing cards: This example uses just the four aces and four kings from a regular deck. Shuffle those eight cards well, and choose five off the top. Count the number of aces you draw. Is this number binomial? The first condition holds: there are five individual trials. The second holds, too. Each draw is either 1, if it is an ace, or 0, if it is a king. The third condition is true, because the chance of an ace being drawn is 1/2 for any of the cards. (Remember that you are not considering the result of previous trials when you say the probability of an ace in the third trial, say, is 1/2.) The fourth condition is false. The chance that the second card is an ace is not independent of the first card. If the first card is an ace, the chance that the second is an ace is 3/7, because three aces and four kings are left. If the first card is a king, the chance that the second card is an ace is 4/7. What is the chance that the fifth card is an ace if the first four cards are aces? What is the chance that the fifth card is an ace if the first four cards are kings? Moreover, 3/7 is a lot less than 1/2. So, as was not the case in the above polling example, where the dependence was slight and hence ignorable, here it is substantial. The number of aces should not be modeled as a binomial.
Children: For a randomly selected family, is the number of girls among the children binomial? Certainly, each child is either a girl or a boy, and the chance that any given child is a girl is about 49% (generally slightly more boys are born than girls). The genders of the children in a family are probably reasonably independent, especially if we discount the occurrence of twins. But condition 1 may not be valid: in some cultures many parents want to have at least one son, so the number of children is not fixed, because they keep having children until they have at least one son. Therefore the number of girls may not be binomial. Of course, if we only considered three-child families, the binomial distribution would work well.

Snow: Each day of the year, it either does or does not snow in Detroit. Is the number of days it snows during a given year binomial? No. The first two conditions are valid, but the third is not. The chance of snow on winter days is obviously much higher than on summer days. What about condition 4? If it is snowing on one day, then clearly the chance of snow on the next day is higher. Thus condition 4 fails too!

Binomial Probabilities

By knowing the number of individual trials and the chance of a 1 on an individual trial, it is possible to find the chance of obtaining a given number of 1s, or successes, in a binomial experiment. For example, we will simulate the above lottery example. There are seven individual trials, and each has a 1/4 chance of yielding a 1. We generate seven individual trials 100 times, and each time we find the number of winners. Table 8.1 contains the number of times we obtained each possible sum (for example, 10 simulated weeks had no winner, and 37 had one winner), the proportion (experimental probability) of seven-trial experiments in which we obtained each sum, and the theoretical probabilities.
Table 8.1  p(number of winners) for success probability 1/4

Number of winners         0       1       2       3       4       5       6       7
Experimental frequency    10      37      32      17      3       1       0       0
Experimental probability  0.10    0.37    0.32    0.17    0.03    0.01    0.00    0.00
Theoretical probability   0.1335  0.3115  0.3115  0.1730  0.0577  0.0115  0.0013  0.0000

In the next section we will show how to figure out the theoretical probabilities that are given in the last row. Note, as expected, that the experimental probabilities are reasonably close to the theoretical ones. If instead of simulating 100 weeks we simulated 400 or 1000, the experimental probabilities would be even closer to the theoretical probabilities.

Although the estimate of the theoretical probability of getting seven winners (outcome 1111111) rounds to 0, it is possible. The theoretical probability of doing so is easy to figure out. Since each individual day has a 1/4 chance of winning, and the seven days are independent, the chance that all seven are winners is

(1/4)(1/4)(1/4)(1/4)(1/4)(1/4)(1/4) = 1/16,384 = 0.000061

(recalling from Section 4.3 the fact that we multiply probabilities to determine the probability of events occurring together when the events are independent). With this very small probability, which is about 1 in 16,384, we would expect to wait on average over 315 years (16,384 weeks) before having an entire week of winners.

Figuring Out Theoretical Binomial Probabilities

The approach to finding the chance of there being seven winners in a trial can be further developed to find the chance that any specified number of winners will occur. There are three steps:

1. List all the length-7 win/loss sequences (outcomes) having the specified number of winners.
2. Find the probability of each of those sequences of interest occurring.
3. Sum the probabilities found in step 2. This is legitimate because when several outcomes each make the event of interest occur, probability theory requires us to add these outcome probabilities to obtain the event’s probability.
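As an illustration of how a table like Table 8.1 could be produced, here is a sketch in Python (all names are ours) that simulates many weeks and compares the experimental proportions with theoretical probabilities. The theoretical values use the compact binomial formula that the text defers to Section 14.2, C(n, k) p^k (1 − p)^(n − k), with math.comb counting the sequences that have k wins.

```python
import random
from math import comb

N_WEEKS = 10_000   # many more simulated weeks than the text's 100
P_WIN = 0.25       # chance a single ticket wins

counts = [0] * 8
for _ in range(N_WEEKS):
    # One week: seven independent trials, each a win with probability 1/4.
    wins = sum(random.random() < P_WIN for _ in range(7))
    counts[wins] += 1

for k in range(8):
    experimental = counts[k] / N_WEEKS
    # Binomial formula previewed from Section 14.2: C(7, k) p^k (1 - p)^(7 - k)
    theoretical = comb(7, k) * P_WIN**k * (1 - P_WIN)**(7 - k)
    print(k, round(experimental, 4), round(theoretical, 4))
```

With 10,000 simulated weeks the experimental column should track the theoretical one more closely than the 100-week table in the text, in line with the remark about simulating 400 or 1000 weeks.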
Let’s start with the simple example of flipping a fair coin twice, which is a simple case of a binomial experiment. The number of heads could be 0, 1, or 2. Thus we want the probability law for p(0 heads), p(1 head), and p(2 heads). For each of these, we follow the above steps.

Number of heads = 0: Only one sequence yields 0 heads: both flips are tails. Hence step 1 yields

tails tails   (00)

The chance that both are tails is (1/2)(1/2), because the coin is fair and the flips are independent. So step 2 yields

p(0 heads) = Probability of (tails, tails)
           = (probability of tails on first flip)(probability of tails on second flip)
           = (1/2)(1/2) = 1/4

Step 3 is to sum the probabilities, and since there is only one, the answer is 1/4.

Number of heads = 1: For step 1 we need the ways to get one heads and one tails. There are two possible sequences, or outcomes:

heads tails   (10)
tails heads   (01)

The chance of each of those two outcomes can be figured out as in the previous case, so step 2 gives us

Probability of (heads, tails) = (probability of heads on first flip)(probability of tails on second flip)
                              = (1/2)(1/2) = 1/4

and

Probability of (tails, heads) = (probability of tails on first flip)(probability of heads on second flip)
                              = (1/2)(1/2) = 1/4

Now add the probabilities to complete step 3, noting that the event of 1 head occurs if either of the outcomes (heads, tails) = (10) or (tails, heads) = (01) occurs:

p(1 head) = 1/4 + 1/4 = 1/2

Number of heads = 2: Now both flips must be heads. Step 1 gives

heads heads   (11)

Step 2 gives

Probability of (heads, heads) = (probability of heads on first flip)(probability of heads on second flip)
                              = (1/2)(1/2) = 1/4

In step 3 we get a sum of 1/4. The distribution just derived is binomial with two individual trials in which the chance of a 1 (success) on any individual trial is 1/2.
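The three steps for the two-flip example can be checked by listing the sequences directly. This is a sketch with our own variable names; because the coin is fair, every length-2 sequence has the same probability (1/2)(1/2) = 1/4.

```python
# Step 1: list the sequences for each number of heads (1 = heads, 0 = tails).
sequences = {0: [(0, 0)], 1: [(1, 0), (0, 1)], 2: [(1, 1)]}

p_heads = 0.5  # fair coin, so tails also has probability 0.5

distribution = {}
for n_heads, seqs in sequences.items():
    # Step 2: each of the two flips contributes a factor of 1/2, so every
    # sequence has probability (1/2)(1/2) = 1/4 (heads and tails are equally likely).
    # Step 3: sum the probabilities of the sequences making up the event.
    distribution[n_heads] = sum(p_heads * p_heads for _ in seqs)

print(distribution)  # {0: 0.25, 1: 0.5, 2: 0.25}
```

The printed distribution matches the summary table that follows in the text.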
Here is the summary:

Number of successes   Theoretical probability
0                     1/4 = 0.25
1                     1/2 = 0.50
2                     1/4 = 0.25

Notice that the probabilities for all possible values of the statistic of interest add up to 1. This intuitive fact that the sum of the probabilities for all possible values of a random variable adds to one will be made rigorous in Chapter 14. But here we can say that when we list the probabilities of all the distinct possible outcomes of a probability experiment, these probabilities must sum to one. As a simple example, the sum of the six die probabilities of 1/6 is 1.

Now we tackle a slightly more complicated example. Imagine playing the Illinois Instant Lottery game three times. Recall that the chance of a single ticket being a winner is 1/4. The possible values of the statistic, the number of wins in three tries, are 0, 1, 2, and 3. We will go through the steps to find the probability of each.

Number of wins = 0: Step 1: The only sequence is

lose lose lose   (000)

The chance of a single ticket winning is 1/4, so the chance of losing is 1 − 1/4 = 3/4 (because the complement of winning is losing). Step 2: The chance of three losers in a row is

(3/4)(3/4)(3/4) = 27/64 = 0.421875

Step 3: The answer is 27/64.

Number of wins = 1: Of the three tickets, one is a winner and two are losers. The winning one could be either the first, second, or third ticket. So for step 1 we have the three possible sequences:

Win  Lose Lose   (100)
Lose Win  Lose   (010)
Lose Lose Win    (001)

Step 2 finds the chance of each sequence by multiplication:

Chance of (win, lose, lose) = (chance of win on first) × (chance of loss on second) × (chance of loss on third)
                            = (1/4)(3/4)(3/4) = 9/64 = 0.140625

Similarly,

Chance of (lose, lose, win) = (3/4)(3/4)(1/4) = 9/64 = 0.140625
Chance of (lose, win, lose) = (3/4)(1/4)(3/4) = 9/64 = 0.140625

Notice that each sequence has the same probability, 9/64.
Step 3 adds the three probabilities:

9/64 + 9/64 + 9/64 = 27/64 = 0.421875

Number of wins = 2: Now there are two wins and one loss. Step 1:

Win  Win  Lose   (110)
Win  Lose Win    (101)
Lose Win  Win    (011)

Step 2:

Chance of (win, win, lose) = (chance of win on first) × (chance of win on second) × (chance of loss on third)
                           = (1/4)(1/4)(3/4) = 3/64 = 0.046875

Continuing:

Chance of (win, lose, win) = (1/4)(3/4)(1/4) = 3/64 = 0.046875
Chance of (lose, win, win) = (3/4)(1/4)(1/4) = 3/64 = 0.046875

Again, each sequence has the same probability, in this case 3/64. Step 3 is to add them up:

3/64 + 3/64 + 3/64 = 9/64 = 0.140625

Number of wins = 3: There is only one sequence with three wins. Step 1:

Win Win Win   (111)

Step 2:

Chance of (win, win, win) = (1/4)(1/4)(1/4) = 1/64 = 0.015625

Step 3: The answer is then 1/64. There is not a very good chance that all three tickets will win.

Now we can summarize the binomial distribution with three trials and with the chance of a 1 on an individual trial being 1/4:

Number of successes   Theoretical probability
0                     27/64 = 0.421875
1                     27/64 = 0.421875
2                     9/64  = 0.140625
3                     1/64  = 0.015625

Add up the probabilities. Do you obtain 1? When the number of individual trials is large, writing out all the possibilities is tedious, but in principle it can be done. Table 8.1 showed the theoretical probabilities for seven trials for the case in which the probability of a 1 on a single trial is 0.25. Section 14.2 revisits the topic of the binomial distribution and gives a compact formula for theoretical binomial probabilities in general. For your convenience, an incomplete table of binomial probabilities based on this formula appears in Appendix G.
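The three-step enumeration above can be sketched mechanically for the three-ticket example: itertools.product lists every win/lose sequence (step 1), a loop multiplies the per-ticket chances (step 2), and sequences with the same number of wins are summed (step 3). This is our own illustration, using exact fractions so no rounding occurs.

```python
from itertools import product
from fractions import Fraction

P_WIN = Fraction(1, 4)   # chance a single ticket wins
P_LOSE = 1 - P_WIN       # 3/4, the complement

# Step 1: every win/lose sequence of length 3 (1 = win, 0 = lose).
# Step 2: the probability of a sequence multiplies the per-ticket chances.
# Step 3: accumulate by the number of wins in the sequence.
dist = {k: Fraction(0) for k in range(4)}
for seq in product([0, 1], repeat=3):
    prob = Fraction(1)
    for outcome in seq:
        prob *= P_WIN if outcome == 1 else P_LOSE
    dist[sum(seq)] += prob

for k, p in dist.items():
    print(k, p, float(p))
```

The output reproduces the summary table: 27/64, 27/64, 9/64, and 1/64 for 0, 1, 2, and 3 wins. For large numbers of trials this brute-force listing grows as 2^n, which is exactly why the compact formula of Section 14.2 is useful.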
Means and Standard Deviations

An advantage to recognizing that a random experiment (viewable as sampling from a population) yields a statistic (random variable) with a known probability distribution, such as the binomial, is that many important characteristics of the experiment can be predicted without use of five-step simulation. For example, theoretical probabilities of interest can often be obtained through tables or formulas (rather than the laborious five-step method). In addition, there is usually a simple expression for the theoretical mean and the theoretical standard deviation, the most widely used measures of the center and spread of a distribution. Concerning the binomial distribution, Chapter 11 develops confidence intervals that are useful for estimating the theoretical mean of a binomial distribution with a built-in measure of estimation accuracy, and Chapter 12 develops hypothesis tests for carrying out decision making concerning the theoretical mean of a binomial distribution. Below we discuss the theoretical mean and standard deviation of a binomial.

In advanced statistics books that utilize calculus, general formulas for the mean and standard deviation of a distribution (population parameters: recall the discussion in Section 6.1) are given. Perhaps more important than having the ability to compute such theoretical (population) means and standard deviations, however, is having a clear understanding of their meaning and usefulness. Our five-step method and the notion of sampling from a distribution provide this. Consider simulating over and over the experiment of seven lottery plays with the number of wins being the statistic of interest. This repeated five-step simulation of the number of wins consists of repeatedly sampling from a binomial distribution having seven trials and success probability 1/4.
Now imagine 100 trials of this experiment, conducted with the five-step method using seven draws with replacement from a box of three 0s and one 1. As we learned in Chapter 4, which discussed expected value, the average of these 100 sampled statistics (numbers of wins) from the distribution will be close to the theoretical expected value, which is the distribution mean we need. Indeed, if more and more trials were carried out, the experimental average, or estimated expected value, would come closer and closer to this theoretical mean. Thus, the real meaning of a theoretical mean of a distribution is that it is a parameter that indicates approximately where a random sample from a population having that distribution will center. “Center” here is measured by the sample mean of the random sample. Conversely, the sample mean of a large random sample from a distribution (population) indicates approximately where the unknown population’s mean lies.

Similarly, the theoretical standard deviation indicates approximately the spread likely for a large random sample, while conversely the sample standard deviation of a large random sample indicates approximately the size of the population’s unknown standard deviation and hence the unknown size of the population’s spread. In our example, the population could be thought of as the result of playing the lottery seven times over and over indefinitely, each time recording the number of wins. In Section 14.4 we will learn a general mathematical formula for finding the theoretical expected value of a statistic and in particular for applying it to finding the population mean and standard deviation of various theoretical probability distributions.

If you flip a fair coin 10 times, you would expect about five heads. In any binomial experiment, you would conjecture that the number of 1s will be close to the probability that any individual trial is a 1 times the number of trials.
Indeed, in a binomial experiment, the theoretical mean number of 1s, or successes, is

(number of trials) × (probability of a 1 on an individual trial)

In the lottery example above, suppose you buy a lottery ticket on each of the seven days of the week and the chance of winning on any given day is 1/4. Then

Theoretical mean number of wins = (number of days) × (probability of winning on a given day)
                                = 7 × (1/4) = 7/4 = 1.75

You expect 1.75 winning tickets in a week. What can this mean? You know you cannot win exactly one and three-quarters times. It is important to realize that expected means that after repeating the experiment over and over (thus repeatedly sampling from a binomial distribution), the average number of wins per week would be around 1.75. (Recall Chapter 4, where we found such expected values empirically by the five-step method. Here we are using the formula for the expected number of successes in a binomial experiment, which allows us to bypass the five-step method and find precisely the expected number of successes, which the five-step method only approximates.) In the 100 five-step method experiments summarized in Table 8.1, the actual average is 1.69:

[(10 × 0) + (37 × 1) + (32 × 2) + (17 × 3) + (3 × 4) + (1 × 5)] / 100 = 169/100 = 1.69

This experimental expected value or mean, 1.69, is fairly close to the theoretical mean, 1.75, that it estimates. How far do the actual values of the statistic of interest (the number of successes in 7 trials) from the individual experiments typically differ from 1.75? In most weeks the number of winners was 0, 1, 2, or 3, so you expect usually to be fairly close to the theoretical mean. How close can be measured by the theoretical standard deviation of the binomially distributed number of wins, because it is a measure of the typical spread one expects of an observed statistic of interest from its theoretical mean.
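A quick check of these two numbers can be sketched from the Table 8.1 counts (the dictionary below is transcribed from the table; the variable names are ours):

```python
# Table 8.1 counts: {number of winners: how many of the 100 weeks had it}
counts = {0: 10, 1: 37, 2: 32, 3: 17, 4: 3, 5: 1}

n_weeks = sum(counts.values())                       # 100 simulated weeks
sample_mean = sum(k * c for k, c in counts.items()) / n_weeks

theoretical_mean = 7 * (1 / 4)                       # trials x success probability

print(sample_mean)       # 1.69
print(theoretical_mean)  # 1.75
```

As the text says, the 100-week sample mean 1.69 estimates the theoretical mean 1.75 fairly well.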
That is, being off by up to one standard deviation from the theoretical mean is a common occurrence. Indeed, obtaining a sampled statistic that is very close to the theoretical mean (relative to a distance away of one standard deviation) is quite unusual: variation is always present in random experiments! Crudely, we expect to be off from the mean by about one standard deviation, on average. For the data in Table 8.1, we estimate the theoretical standard deviation by the sample standard deviation (the square root of the sample variance of the data):

√{(1/100)[10(0 − 1.69)² + 37(1 − 1.69)² + 32(2 − 1.69)² + 17(3 − 1.69)² + 3(4 − 1.69)² + 1(5 − 1.69)²]}
  = √(105.39/100) = 1.0266 ≈ 1

We express our conclusion about the typical variation in the sample mean by stating that the number of winners per week averaged 1.69, plus or minus the standard deviation, which is about 1. By “plus or minus about 1” we mean that the deviation from 1.69 is typically at most about 1 in magnitude and can be in either direction. The theoretical standard deviation of the number of wins is approximately the standard deviation for the number of wins you would obtain if you ran the experiment a large number of times. The theoretical standard deviation for a binomial has a convenient, if not particularly intuitive, formula:

In a binomial experiment, the theoretical standard deviation of the number of 1s is

√[(number of trials) × (probability of a 1 on an individual trial) × (probability of a 0 on an individual trial)]

In the lottery example,

Theoretical standard deviation of the number of wins
  = √[(number of days) × (probability of winning on a given day) × (probability of losing on a given day)]
  = √[7 × (1/4) × (3/4)] = √(21/16) = √1.3125 = 1.1456

This is close to the experimental value of 1.0266, which is not surprising, because with 100 trials we anticipate that the experimental or sample value will be close to its theoretical or population value.
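The two standard deviations above can be verified the same way as the means (a sketch using the Table 8.1 counts; note that, following the text's computation, the sample variance divides by n rather than n − 1):

```python
from math import sqrt

# Table 8.1 counts: {number of winners: how many of the 100 weeks had it}
counts = {0: 10, 1: 37, 2: 32, 3: 17, 4: 3, 5: 1}
n = sum(counts.values())                              # 100
mean = sum(k * c for k, c in counts.items()) / n      # 1.69

# Sample standard deviation as computed in the text (dividing by n)
sample_sd = sqrt(sum(c * (k - mean) ** 2 for k, c in counts.items()) / n)

# Theoretical SD: sqrt(trials x p x (1 - p))
theoretical_sd = sqrt(7 * (1 / 4) * (3 / 4))

print(round(sample_sd, 4))       # 1.0266
print(round(theoretical_sd, 4))  # 1.1456
```

The two values agree to within about a tenth, as the text anticipates for a 100-week sample.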
(This is of course our justification for using the five-step method with a reasonably large number of trials.) Flip a fair coin 100 times. You expect 50 heads (100 × 1/2), and the theoretical standard deviation of the number of heads is

√[100 × (1/2) × (1/2)] = √25 = 5

Thus it is not unusual to be off by up to 5 heads or so; that is, any number between 45 and 55 is reasonable. In fact, an important rule of thumb is that about 2/3 of the time we expect to be off from the theoretical mean by less than one standard deviation. In this precise sense, the size of the theoretical standard deviation gives us a quantitative interpretation of the “typical” variability in whatever is being observed, here the number of heads in 100 tosses. You would be surprised to get as few as 20 or 30, or as many as 70 or 80.

The law of averages discussed in Example 4.3 tells us to expect heads in about half the tosses. If you flip a fair coin 1000 times, you would therefore anticipate that the number of heads would be around 500. How far off is reasonable? More than or less than 5? The answer lies in the theoretical standard deviation of the number of heads:

√[1000 × (1/2) × (1/2)] = √250 = 15.81

Now from 484 to 516 is a likely range for the number of heads; to be precise, an occurrence that should happen about 2/3 of the time if we toss a coin 1000 times. So the typical spread with 1000 flips is about three times that with 100 flips. The more flips, the larger the standard deviation. However, an interesting fact is that as a percentage of the number of flips, the standard deviation is actually smaller with 1000 flips: 15.81 out of 1000 is 1.581%, while 5 out of 100 is 5%. Later this observation will be seen to be connected with the law of averages. The interested reader is encouraged to consult Section 14.2, where the formula for computing theoretical binomial probabilities for a specified number of trials and a specified probability of a success (getting a 1) is given.

SECTION 8.1 EXERCISES

1.
A baseball team has won 60% of its games to this point in the season. Over the weekend it will play three games. Go through the four conditions for a binomial experiment. Why is the binomial distribution not appropriate for the number of games won on the weekend?

2. On a multiple-choice quiz, Joan guesses on each of five questions. There are four possible answers for each question.
a. What is the probability that Joan guesses the first question correctly?
b. Go through the four conditions for a binomial experiment. Verify that the number of correct answers out of five follows the binomial distribution.
c. What is the probability that Joan guesses the correct answer on all five questions?

3. Return to the situation presented in Exercise 2.
a. Write out the possible quiz outcomes in which Joan correctly guesses exactly two of the five questions. There should be 10 total possible outcomes. To get you started, here is one of the outcomes:

Question number   1      2      3      4      5
Guess             Right  Right  Wrong  Wrong  Wrong

b. Now, next to each of the outcomes, write the probability of seeing that particular outcome.
c. Using the table you have created, find the probability that Joan gets exactly two of the five questions correct on the quiz.

4. Give an example of a situation in which the binomial distribution could be used.

5. On each day of a certain week, there is a 20% chance that the bus will reach the bus stop late. Assume that each day is independent of the next.
a. What is the probability that the bus reaches the bus stop on time on a certain day?
b. What is the probability that the bus reaches the bus stop on time every day in a five-day work week?
c. What is the probability that the bus reaches the bus stop on time exactly four days in the five-day work week?
d. What is the theoretical mean for the number of times the bus will reach the bus stop on time in a five-day work week? What is the standard deviation?

6.
For each of the following binomial distributions, give the theoretical mean and the theoretical standard deviation:
a. n = 10; probability of a 1 is 0.90.
b. n = 22; probability of a 1 is 0.35.
c. n = 14; probability of a 1 is 0.12.
d. n = 7; probability of a 1 is 0.47.

7. There are 100 poker chips in a bowl, 65 of which are blue and 35 red. You are to draw out a chip 50 times. After each draw, you replace the chip in the bowl (that is, there are 100 chips in the bowl prior to each draw). Hint: Does replacement make this a binomial probability problem?
a. What is the theoretical expected value of the number of blue chips you will draw in 50 draws?
b. What is the theoretical standard deviation of the number of blue chips you will draw in 50 draws?
c. For each of the following values, give a subjective answer as to whether it is likely or not to draw that many blue chips in 50 draws: 18 or fewer; 29 or fewer. Base your decisions solely on the theoretical expected value and standard deviation you found in parts (a) and (b). Justify your answers.

8. You roll a fair, six-sided die six times. Calculate the following probabilities by writing out all the possible outcomes that result in the stated event and summing those probabilities.
a. The probability that exactly one of the six tosses will come up a five
b. The probability that exactly five of the six tosses will come up an even number
c. The probability that all of the tosses will come up an odd number

9. A shopper goes to the grocery store and purchases a bag of 10 apples. Based on previous experience, he knows that 5% of the apples of this brand will be bruised and inedible. Assume that the apples are randomly assigned to the bag. What is the probability that in his bag he has either no bad apple or only one bad apple? (Hint: The probability that there is either one bad apple or none equals the sum of the probability that there is no bad apple and the probability that there is one bad apple.)
For additional exercises, see page 726.