STATISTICS: SPECIAL DISCRETE DISTRIBUTIONS UNIT PLAN

In the last unit we looked at the general form of a discrete probability distribution: what it is, how to find its measures, and so on. In this unit we will look at four special discrete distributions that show up in a number of applications. We are already quite familiar with one of these, the "C over C" or Hypergeometric distribution. The other three will be new to us.

The good news about these distributions is that they all have special formulas for their mean and variance. That is, we will no longer need the x·P(x) and d²·P(x) columns! This will save us a good deal of work. The main challenge in this unit will be to look at a situation and determine which of the four special distributions applies to it.

The four distributions we will be studying are:
1.) Binomial
2.) Hypergeometric
3.) Geometric
4.) Poisson

The first two are finite, and the last two are infinite. Infinite distributions sound scary, but we can tame them with the complement rule.

We are also going to replace our old notation P(E) for the probability of an event and P(E′) for the probability of its complement. Instead we will just use p and q, respectively, for these purposes; the formulas would get quite clunky otherwise.

I. Binomial Distributions

A distribution is called binomial if and only if:
1.) Given a sample space Ω and a random variable x constructed over that space, an outcome is in instance i if and only if it is an outcome where a result specified in the definition of x occurred exactly i times.
2.) The sample space in question was constructed by drawing with replacement from a single selection set (and thus all single-draw events in the sample space are independent of each other).

The probabilities associated with such a random variable have the following generating function:

    P(x) = p^x · q^(r−x) · rCx

where p is the probability of any given draw resulting in an outcome in the event in question, and q is the complement of p.
(q = 1 − p). This generating function is called the Binomial Theorem. The special formulas for mean and variance are:

    μx = r·p   and   σ² = r·p·q

[Recall: to get the standard deviation, you must take the square root of the variance.]

Warning: a lot of textbooks use n where I am using r, but I want to reserve n for the number of items in the selection set in this context.

Where does the "C" come from? Something that confuses a lot of people about the Binomial Theorem is that, even though it is constructed on a sample space with replacement, it somehow involves combinations, which have to do with drawing without replacement. We will examine this confusing fact in the context of our first example.

Example 1: Let's take the vacation-in-Paris example from the previous section. I said that since we didn't know the Binomial Theorem, we were limited to analyzing a two-day vacation. But now, let's rewrite the problem so our vacation lasts five days!

40% of all days in Paris have some sunshine, and each day's weather is independent of the weather on any other day. Suppose you are now taking a five-day vacation to Paris. Let x = the number of days during your vacation on which there is some sunshine. Do the following:
a.) Specify the probabilities of all instances of x.
b.) Find the expected number of days with sunshine during your vacation, and the associated standard deviation.

First of all, is x binomial? Remember, it has to meet our two criteria. Does it meet criterion #1? Yes, because a result is specified (a day with sunshine), and in the sample space of possible vacations, a vacation is in instance x only if it had sunshine on exactly that number of days. Does it meet criterion #2? Yes: the weather on any given day is independent of the weather on any other given day, which is equivalent to constructing a sample space with replacement.

Now we can answer part a.) of the example by making a table using the generating function:
 x   P(x) = 0.4^x · 0.6^(5−x) · 5Cx
 0   P(0) = 0.4^0 · 0.6^5 · 5C0 = 0.0778
 1   P(1) = 0.4^1 · 0.6^4 · 5C1 = 0.2592
 2   P(2) = 0.4^2 · 0.6^3 · 5C2 = 0.3456
 3   P(3) = 0.4^3 · 0.6^2 · 5C3 = 0.2304
 4   P(4) = 0.4^4 · 0.6^1 · 5C4 = 0.0768
 5   P(5) = 0.4^5 · 0.6^0 · 5C5 = 0.0102
 ∑                              = 1

To answer part b.), we're used to extending this table with two more columns, one to find the mean (expected value) and one to find the variance. However, since x is binomial, there are known formulas that do the work for us. No need for more columns!

    μx = r·p = 5 · 0.4 = 2
    σ² = r·p·q = 5 · 0.4 · 0.6 = 1.2

And the standard deviation then is √1.2 = 1.0954.

So now that we're done with that, on to the question everyone's asking: "Where does that C come from??"

Let's consider instance x = 1: what outcomes are in this instance? All of the outcomes where one day has sunshine and the rest do not. Let's call these results s and n. How many outcomes are in this instance? Let's list them:

{s1, n2, n3, n4, n5}, {n1, s2, n3, n4, n5}, {n1, n2, s3, n4, n5}, {n1, n2, n3, s4, n5}, and {n1, n2, n3, n4, s5}.

As you can see, there are five outcomes in instance x = 1. The probability of each one occurring is (0.4)^1 · (0.6)^4. And, not coincidentally, 5C1 = 5. Therefore, the total probability of instance x = 1 is 0.4^1 · 0.6^4 · 5C1 = 0.2592.

You can repeat this exercise for the remaining instances if you want. Each time, the number of outcomes in instance x will equal 5Cx. Why does this happen? Because of the way x was defined: I just wanted x days with sunshine. I didn't specify that I wanted the first x days to have sunshine and the remainder not to. So, in other words, even though the sample space was constructed with replacement, in my definition of x I don't care about order. (That is, the instances themselves were constructed without replacement: if I want exactly two days with sunshine, once those days have been given to specific draws, the remaining three draws must be days without sunshine.)
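The table and the special formulas are easy to verify by machine. Here is a short Python sketch (my addition, not part of the text; the function name is mine, and `math.comb` supplies rCx):

```python
from math import comb, sqrt

# Paris example: r = 5 independent days, p = 0.4 chance of sunshine each day
p, r = 0.4, 5
q = 1 - p

def binomial_pmf(x: int) -> float:
    """Binomial generating function: P(x) = p^x * q^(r-x) * rCx."""
    return p**x * q**(r - x) * comb(r, x)

for x in range(r + 1):
    print(f"P({x}) = {binomial_pmf(x):.4f}")   # matches the table above
print("sum =", round(sum(binomial_pmf(x) for x in range(r + 1)), 10))  # 1.0

# Special formulas: no extra table columns needed
mu = r * p          # 2.0 expected days of sunshine
var = r * p * q     # 1.2
sd = sqrt(var)      # 1.0954...
print(mu, var, round(sd, 4))
```

Running it reproduces every row of the table to four decimal places and confirms the probabilities total 1.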
Another point of confusion here: since nPr ≥ nCr, students think that if I specified that order does matter in my definition of x, then the probabilities of each instance would increase. This isn't true. Rather, if I specified that order did matter, there would no longer be just six instances of x, but rather 32 [see Appendix 1 for the list]. To see this, realize that each of the five outcomes that were in instance x = 1 above would be in separate instances. You also might see the logical contradiction in such a statement: you can't increase the probabilities of all instances without the sum exceeding 1, which isn't allowed.

Appendix 1

The 32 instances of x if order did matter, grouped by the number of days with sunshine:

Zero s (1):  {n1, n2, n3, n4, n5}
One s (5):   {s1, n2, n3, n4, n5}  {n1, s2, n3, n4, n5}  {n1, n2, s3, n4, n5}  {n1, n2, n3, s4, n5}  {n1, n2, n3, n4, s5}
Two s (10):  {s1, s2, n3, n4, n5}  {s1, n2, s3, n4, n5}  {s1, n2, n3, s4, n5}  {s1, n2, n3, n4, s5}  {n1, s2, s3, n4, n5}  {n1, s2, n3, s4, n5}  {n1, s2, n3, n4, s5}  {n1, n2, s3, s4, n5}  {n1, n2, s3, n4, s5}  {n1, n2, n3, s4, s5}
Three s (10): {s1, s2, s3, n4, n5}  {s1, s2, n3, s4, n5}  {s1, s2, n3, n4, s5}  {s1, n2, s3, s4, n5}  {s1, n2, s3, n4, s5}  {s1, n2, n3, s4, s5}  {n1, s2, s3, s4, n5}  {n1, s2, s3, n4, s5}  {n1, s2, n3, s4, s5}  {n1, n2, s3, s4, s5}
Four s (5):  {s1, s2, s3, s4, n5}  {s1, s2, s3, n4, s5}  {s1, s2, n3, s4, s5}  {s1, n2, s3, s4, s5}  {n1, s2, s3, s4, s5}
Five s (1):  {s1, s2, s3, s4, s5}

A further question arises: are the probabilities of these 32 instances found by a generating function that includes multiplying by a Permutation factor? The answer is no. The reason is that each of these 32 instances represents one particular Permutation, so there's no need to multiply by the number of Permutations in each instance, since each instance contains only one Permutation.
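The counting argument, including the 32 order-sensitive instances in Appendix 1, can be checked by brute-force enumeration. A Python sketch (my own; the names are hypothetical, not from the text):

```python
from itertools import product
from math import comb

p, q, r = 0.4, 0.6, 5

# All 2^5 = 32 ordered outcomes: 's' = sunshine that day, 'n' = none.
outcomes = list(product("sn", repeat=r))
print(len(outcomes))   # 32

def outcome_prob(outcome):
    """Each ordered outcome is one permutation with probability p^|s| * q^|n|."""
    s = outcome.count("s")
    return p**s * q**(r - s)

# Group by x = number of sunny days: each group's size is exactly 5Cx,
# which is where the C in the Binomial Theorem comes from.
for x in range(r + 1):
    group = [o for o in outcomes if o.count("s") == x]
    assert len(group) == comb(r, x)
    print(x, len(group), f"{len(group) * p**x * q**(r - x):.4f}")

# The 32 single-permutation probabilities already sum to 1, so no extra
# permutation factor belongs in the order-matters generating function.
print(round(sum(outcome_prob(o) for o in outcomes), 10))   # 1.0
```

The per-group probabilities printed here reproduce the binomial table from Example 1, and the final line confirms the "sum can't exceed 1" point.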
(You might also be able to see that there's no sensible value for the second input of the Permutation expression, because x is no longer a random variable as such, but rather a random set of variables!) Since one of the requirements for the Binomial Theorem is that x be a random variable, we can now conclude confidently that any formulation of x where order does matter disqualifies the distribution from being Binomial!

By the way, the generating function for such a distribution would be

    P({x1, x2, x3, x4, x5}) = p^|s| · q^|n|

where |s| and |n| are the number of times s and n occur in the input set, respectively. Obviously we are now very far afield from the Binomial Theorem, so hopefully you see why the Theorem has the form that it does and not one of the other forms that some students think it should have!

HW: p. 194-195 #9-17; #19-22, part a's only

2.) Hypergeometric Distributions

We are already familiar with these; they are the "C over C" problems we have been working with for some time. A distribution is considered Hypergeometric if and only if:
1.) Given a sample space Ω and a random variable x constructed over that space, an outcome is in instance i if and only if it is an outcome where a result specified in the definition of x occurred exactly i times.
2.) The sample space in question was constructed by drawing without replacement from a single selection set.

The probabilities associated with such a random variable have the following generating function:

    P(x) = [nkCx · (n−nk)C(r−x)] / nCr

where nk is the number of items in a given category in the selection set. And the formulas for mean and variance are:

    μ = r·nk / n   and   σ² = r·nk·(n−nk)·(n−r) / (n³ − n²)

Ex: Let's revisit the "Lobsters and Crabs" problem from the last section. In that problem, we had 13 crustaceans in a trap: 8 were lobsters and the rest (5) were giant crabs.
We defined a random variable x to be the number of lobsters selected when three crustaceans were pulled from this trap at random without replacement. That is, n = 13, nk = 8, and r = 3. Let's reconstruct the distribution using the formal generating function and see if we don't get the same result:

 x   P(x) = 8Cx · 5C(3−x) / 13C3
 0   0.0350
 1   0.2797
 2   0.4895
 3   0.1958
 ∑   1 (check)

Now, before when we did this problem, we laboriously constructed the mean and variance using separate columns. Let's see if the special formulas don't serve us better:

    μ = 3·8/13 = 1.8462
    σ² = 3·8·(13−8)·(13−3) / (13³ − 13²) = 0.5917

and then to get the standard deviation we take the square root of 0.5917 to get 0.7692. Check back to Worksheet #4 from the previous unit. The numbers match!

HW: p. 204 #24, plus find the mean and standard deviation of x, where x is the number of defective microchips.

Interlude: Finite Sums of Infinite Sequences

So far we have been dealing with finite sample spaces. They may have been very large, in the millions or even billions of outcomes, but they were finite. In this section we will examine non-finite sample spaces and random variables constructed on them. Even though these spaces are non-finite, their total probability still equals 1. How is this possible? Consider the following sum:

    S = 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + ⋯

S has an unbounded number of terms, with the nth term given by 1/2^n. We could therefore write

    S = Σ (n = 1 to ∞) 1/2^n

I will now claim something quite surprising: S = 1. That's right: the sum of an unbounded sequence of terms is a finite number. Indeed, it is 1, so we can consider each n to be the name of an outcome in a sample space, with the corresponding term of S being its probability!

How do I prove such a surprising claim? It is actually quite simple. Consider (1/2)S. By the good old distributive rule from algebra,

    (1/2)S = 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + ⋯

Now let's consider what happens when we subtract (1/2)S from S:

       S     =  1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + ⋯
    − (1/2)S =      − 1/4 − 1/8 − 1/16 − 1/32 − 1/64 − ⋯

As you can see, all but the first term cancel. Since we know that S − (1/2)S = (1/2)S, we can conclude from this that

    (1/2)S = 1/2

and therefore:

    S = 1

So now that you believe me that such sample spaces can exist, let's look at two of the more common distributions that can be constructed over such sets: the Geometric and Poisson distributions.

3.) Geometric Distributions

A random variable x is said to be Geometric if and only if:
1.) It is constructed over an unbounded sample space (this means an unbounded number of draws!), where all single-draw events are independent of each other.
2.) Each instance of x represents the set of outcomes in which the result specified in x occurred for the first time on the xth draw.

An important implication of 2.) is that the Geometric distribution has no zero row. This is because a result can't occur for the first time on the "0th" draw; that makes no sense.

The generating function for the Geometric is:

    P(x) = p · q^(x−1)

where p and q are defined as they were for the binomial distribution. Also like the binomial, Geometric distributions have known formulas for expected value and variance:

    E(x) = 1/p   and   σx² = q/p²

So there's no need for the "big tables" that started Chapter 4.

Examples: p. 198 Example #1; p. 199 Try It Yourself #1
HW: p. 202 #1-4; p. 203 #15-18; p. 204 #25, 26

4.) Poisson Distributions

A random variable x is said to be Poisson if and only if:
1.) Each instance of x represents the number of occurrences that take place within a given unit of measure (usually a unit of time and/or space), and
2.) These occurrences appear randomly, but with a known mean number of occurrences μ for the given unit.

Obviously Poisson distributions are very different from the kinds we have studied before.
Also, unlike the Binomial and Geometric distributions, whose generating functions were pretty obvious once you saw them derived, the generating function for the Poisson is impossible to derive without knowledge of some advanced calculus. Therefore, this next formula you will have to take "on faith" (or study calc for a few years, whichever you prefer).

The generating function is:

    P(x) = μ^x · e^(−μ) / x!

where e is the mathematical constant. You can find a button for e on most calculators; on the standard blue calculators it is accessed by pressing [2nd] and then [ln] on the upper left-hand side. Notice that this button automatically opens a power for you, which is nice.

Like the Binomial and Geometric, Poisson distributions also have known formulas for expected value and variance:

    E(x) = μ   and   σ² = μ

This odd result comes from the strange properties of e, which could fill a whole lesson in and of themselves!

Examples: p. 199-200 Example #2; p. 200 Try It Yourself #2, 3
HW: p. 202 #5-8; p. 203 #19-22; p. 204 #27-28

Distinguishing Between Distributions

If you ever hope to use statistics in a career, you have to learn how to distinguish between different kinds of distributions. In the following questions, try to determine whether the distribution in question is Binomial, Geometric, or Poisson.

Exercises: p. 202 #9-14

Follow with SPD Worksheets, SPD Test
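The special formulas for the two infinite distributions can be checked numerically with long partial sums, in the same spirit as the S = 1 argument from the interlude. Below is a Python sketch (my addition, not part of the lesson; the truncation point N = 150 and the sample values p = 0.4 and μ = 3 are arbitrary choices):

```python
from math import exp, factorial

# Geometric: P(x) = p * q^(x-1) for x = 1, 2, 3, ...
p = 0.4
q = 1 - p

def geometric_pmf(x):
    return p * q**(x - 1)

# Poisson: P(x) = mu^x * e^(-mu) / x! for x = 0, 1, 2, ...
mu = 3.0

def poisson_pmf(x):
    return mu**x * exp(-mu) / factorial(x)

N = 150  # truncation point; the neglected tails are astronomically small

# Total probability of each infinite distribution is (numerically) 1:
print(round(sum(geometric_pmf(x) for x in range(1, N)), 10))  # 1.0
print(round(sum(poisson_pmf(x) for x in range(0, N)), 10))    # 1.0

# Partial-sum means agree with the special formulas E(x) = 1/p and E(x) = mu:
geo_mean = sum(x * geometric_pmf(x) for x in range(1, N))
poi_mean = sum(x * poisson_pmf(x) for x in range(0, N))
print(round(geo_mean, 6), 1 / p)  # 2.5 2.5
print(round(poi_mean, 6), mu)     # 3.0 3.0

# Partial-sum variances agree with sigma^2 = q/p^2 and sigma^2 = mu:
geo_var = sum((x - geo_mean)**2 * geometric_pmf(x) for x in range(1, N))
poi_var = sum((x - poi_mean)**2 * poisson_pmf(x) for x in range(0, N))
print(round(geo_var, 6), q / p**2)  # 3.75 3.75
print(round(poi_var, 6), mu)        # 3.0 3.0
```

Note that the geometric sum starts at x = 1 (no zero row), while the Poisson sum starts at x = 0, exactly as the definitions above require.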