Chapter 5 The Binomial Distribution

The binomial distribution is a very useful distribution for studying situations where the information is available in a ``yes'' or ``no'' form. Such information is typical of market research studies: ``Do you prefer product A or product B?'' ``Would you buy this product if the price was $5.00?'' Another application is election polls: ``If the election were held tomorrow, would you vote for Smith or Jones?'' A response for Smith could be considered a yes and a response for Jones a no. Another business application that is becoming increasingly important is quality control. A machine is producing parts. We are interested in the proportion of defective parts the machine produces (a defective part could be considered a ``yes'') as opposed to the proportion of good parts (a good part considered a ``no'').

The binomial distribution arises from binomial experiments. These experiments have the following characteristics:

1. The experiment consists of n distinct trials.
2. Each trial must result in one of two possible outcomes. Designate one of these outcomes as a success, S, and the other as a failure, F.
3. Let the probability of a success be P(S) = p and the probability of a failure be P(F) = q. Successes and failures are complements, so p + q = 1. Also, p and q do not change from trial to trial.
4. The trials are independent. That is, the probability of a success on the third trial, for example, does not depend on the outcomes of the previous trials.
5. The random variable X is the number of S's in the n trials.

Let's conduct an experiment which consists of tossing a single coin $n=20$ times and see if it is a binomial experiment. First of all, you can easily distinguish the results of the trials and know how many of them there are, so (1) is met. If we call getting a head a success (S) and getting a tail a failure (F), then it is clear that only S's and F's can occur and that one and only one of them will result on any toss, so (2) is met.
Heads and tails are complements, and the probability of getting a head or getting a tail won't change from toss to toss, so (3) is met. The probability of getting a head on any toss should not depend on the results of any previous toss, so (4) is met. The final condition, (5), is met because we can count the number of heads (S's). So this is a binomial experiment.

Consider a second experiment. An urn is filled with 10 red balls and 10 black balls. Two balls are drawn from the urn. Call getting a red ball a success. Is this binomial? Conditions (1), (2), and (5) seem to be met. Conditions (3) and (4) depend on whether we sample with replacement (putting each ball back in the urn after it is drawn) or without replacement (the balls are not placed back after each draw). If sampling with replacement is used, the experiment is binomial. If sampling without replacement is used, the experiment is not binomial.

Condition (3) is not met because both p and q change from trial to trial. The probability of getting a red ball (S) on the first draw is P(S) = 10/20. Suppose you get an S on draw 1. Then on draw 2, P(S) = 9/19. On the other hand, if you get an F on draw 1, the probability of getting an S on draw 2 is P(S) = 10/19. So p and q change from trial to trial. Also, the probability of getting a success on the second trial depends on the result of the first trial. This means the trials are not independent, so condition (4) is violated. There is a probability distribution which can describe sampling without replacement: the hypergeometric distribution.

Suppose a market survey is being conducted. Is this binomial? In one sense it is like the urn problem without replacement, unless sampling with replacement is done (each person could be questioned more than once, if they will put up with it). But if a very small sample is taken from a very large population, the change in p and q would be so small that we could ignore it. The trials would likely be independent in this case also.
But consider the following: the researcher asks two friends standing side by side what they think of a certain product. Do you think one friend's response might affect the other's response?

The binomial formula is
\[ P(X = x) = C_x^n \, p^x q^{n-x} \]
where x is the number of successes in n trials, n-x is the number of failures, p is the probability of a success on any one trial, and q is the probability of a failure on any one trial. Further,
\[ C_x^n = \frac{n!}{x!\,(n-x)!} . \]

Example 1. Suppose you were to toss 2 coins ($n=2$). Call getting a head a success (S). Suppose the probability of getting a head on either coin is 50% ($p=q=0.5$). Find the probability that you will get no heads ($X=0$), the probability that you will get exactly one head ($X=1$), and the probability that you will get 2 heads ($X=2$).

For $P(X=0)$, with $p=0.5$, $q=0.5$, $n=2$, $x=0$:
\[ C_0^2 = \frac{n!}{x!\,(n-x)!} = \frac{2!}{0!\,(2-0)!} = \frac{1\cdot 2}{(1)(1\cdot 2)} = 1 \]
\[ P(X=0) = C_0^2 \, p^x q^{n-x} = 1\,(0.5)^0 (0.5)^2 = (1)(1)(0.25) = 0.25 \]

For $P(X=1)$, with $p=0.5$, $q=0.5$, $n=2$, $x=1$:
\[ C_1^2 = \frac{n!}{x!\,(n-x)!} = \frac{2!}{1!\,(2-1)!} = \frac{1\cdot 2}{(1)(1)} = 2 \]
\[ P(X=1) = C_1^2 \, p^x q^{n-x} = 2\,(0.5)^1 (0.5)^1 = (2)(0.5)(0.5) = 0.5 \]

And for $P(X=2)$, with $p=0.5$, $q=0.5$, $n=2$, $x=2$:
\[ C_2^2 = \frac{n!}{x!\,(n-x)!} = \frac{2!}{2!\,(2-2)!} = \frac{1\cdot 2}{(1\cdot 2)(1)} = 1 \]
\[ P(X=2) = C_2^2 \, p^x q^{n-x} = 1\,(0.5)^2 (0.5)^0 = (1)(0.25)(1) = 0.25 \]

Example 2. We could also consider tossing a coin 10 times and finding the probability that we would get exactly 3 heads out of the ten tosses. Here $p=0.5$, $q=0.5$, $n=10$, $x=3$:
\[ C_3^{10} = \frac{n!}{x!\,(n-x)!} = \frac{10!}{3!\,(10-3)!} = \frac{3628800}{(6)(5040)} = 120 \]
\[ P(X=3) = C_3^{10} \, p^x q^{n-x} = 120\,(0.5)^3 (0.5)^7 = (120)(0.125)(0.0078125) = 0.1172 \]

Example 3. If you were to go to Las Vegas and play a game like roulette with fairly even odds (I think you have about a 49% chance of winning), then the probability that you will win exactly 5 games in 10 trials is found with $p=0.49$, $q=0.51$, $n=10$, $x=5$:
\[ C_5^{10} = \frac{n!}{x!\,(n-x)!} = \frac{10!}{5!\,(10-5)!} = \frac{3628800}{(120)(120)} = 252 \]
\[ P(X=5) = C_5^{10} \, p^x q^{n-x} = 252\,(0.49)^5 (0.51)^5 = (252)(0.02825)(0.03450) = 0.2456 \]

It should be clear that it will be difficult to use the formula if n is very large. Excel can be used in that case.

Probability    Excel command              Name
P(X = 5)       BINOMDIST(5,10,0.5,FALSE)  Probability mass function
P(X <= 5)      BINOMDIST(5,10,0.5,TRUE)   Cumulative distribution function

Some statistical terminology: $P(X = x)$ is called the probability mass function, and
\[ P(X \le x) = P(X=0) + P(X=1) + \cdots + P(X=x) \]
is called the cumulative distribution function. That is, it accumulates (sums) a number of mass function values. In Excel:

P(X = x)                    BINOMDIST(x, number of trials, p, FALSE)
P(X <= x)                   BINOMDIST(x, number of trials, p, TRUE)
P(X > x) = 1 - P(X <= x)    1-BINOMDIST(x, number of trials, p, TRUE)

where p is the probability of success on any one trial.

Example 4. Suppose a fair coin is tossed 10 times (call getting a head a success S, so P(S) = p = 0.5).
(a) Find the probability that you will get exactly 5 heads in the ten tosses.
(b) Find the probability that you will get 5 or fewer heads in the ten tosses.
(c) Find the probability that you will get between 3 and 5 heads in the ten tosses.

Example 5. Work Example 3 using Excel. Find the probability that you would win 5 out of 10 games in Las Vegas if the probability of winning any one game is 49%.

Example 7. We can find the probability that you will lose money (if you win 4 or fewer times), break even (if you win 5 times), or win money (if you win 6 or more times).

Example 8. It is interesting to note what happens if you extend your play at the game. Suppose you play 100 times rather than 10. You will lose money if you win 49 or fewer games, break even if you win 50 games, and win money if you win 51 or more games.

Moral: if you go to Las Vegas, the longer you play the more you lose.
The Mean and Standard Deviation of the Binomial Distribution

The mean of the binomial distribution is given by
\[ \mu = E(x) = np \]
where E(x) denotes the expected value, which is just another word for the mean (some people like to say mean, others like to say expected value). Suppose you played a game where you tossed a coin 10 times and each toss had a 50% chance of coming up heads. Suppose you repeated this game a large number of times, recording X, the number of successes, for each game, and then averaged the results over all of the games. That average is the expected value. For this case
\[ \mu = E(x) = np = 10(0.5) = 5 \]

The variance and standard deviation of the binomial distribution are given by
\[ \sigma^2 = npq, \qquad \sigma = \sqrt{npq} \]
and for the game mentioned above
\[ \sigma^2 = npq = 10(0.5)(0.5) = 2.5, \qquad \sigma = \sqrt{2.5} = 1.58 \]

The use of the mean and standard deviation will become clearer in the next chapter. At this point we can note that we could use them in Chebychev's Theorem.

Example 9. Suppose we were to toss a coin n = 1000 times with p = 0.5, where we say a success occurs if a head shows. Find the mean and variance of the probability distribution. Then use Chebychev's Theorem with k = 2 to place limits on the distribution.
\[ \mu = np = 1000(0.5) = 500 \]
\[ \sigma^2 = npq = 1000(0.5)(0.5) = 250 \]
\[ \sigma = \sqrt{npq} = \sqrt{250} = 15.81 \]
Then using Chebychev's Theorem with k = 2,
\[ P(\mu - 2\sigma \le X \le \mu + 2\sigma) \ge 1 - \frac{1}{k^2} \]
\[ P(500 - 2(15.81) \le X \le 500 + 2(15.81)) \ge 1 - \frac{1}{2^2} \]
\[ P(468.38 \le X \le 531.62) \ge \frac{3}{4} \]
\[ P(468 \le X \le 532) \ge 75\% \]
Suppose we repeated this experiment a large number of times. Chebychev's Theorem says that at least 75% of the time the number of heads will be between 468 and 532.

Problems

1. Let the probability of a success on any one trial be p = 0.1. Suppose n = 3 trials are conducted.
Find
(a) the probability that there will be no successes
(b) the probability that there will be exactly 1 success
(c) the probability that there will be exactly 2 successes
(d) the probability that there will be exactly 3 successes
(e) the probability that there will be exactly 1 failure [P(X=2)]
(f) the probability that there will be exactly 2 failures [P(X=1)]

2. Suppose the probability of having a good blind date is 0.20 and that you have 5 blind dates during the semester. What is
(a) the probability that you will have exactly 4 good blind dates
(b) the probability that you will have exactly 3 bad blind dates

3. Suppose the probability that a student entering NAU as a freshman will graduate in four years is p = 0.3. Suppose 1000 new freshmen enter NAU this fall. Use Chebychev's Theorem with k = 2 to find limits on the number that will graduate four years hence. Repeat using k = 3.

4. Suppose that 10% of all market research surveys are returned. If a company sends out n = 10,000 surveys, set limits on the number they might expect to have returned. To do this, use Chebychev's Theorem with k = 2 and k = 3.

Answers

1. Use the Excel results to check your calculations.
2. Use the Excel results to check your calculations.
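As an alternative to a spreadsheet for checking work of this kind, the mean, standard deviation, and Chebychev limits worked out in Example 9 can be reproduced with a short Python sketch. The function names `binomial_mean_sd` and `chebychev_limits` are hypothetical helpers introduced only for this illustration.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mu, sigma) for a binomial distribution: mu = np, sigma = sqrt(npq)."""
    q = 1 - p
    return n * p, sqrt(n * p * q)

def chebychev_limits(n, p, k):
    """Return (lower, upper, bound) so that P(lower <= X <= upper) >= bound = 1 - 1/k^2."""
    mu, sigma = binomial_mean_sd(n, p)
    return mu - k * sigma, mu + k * sigma, 1 - 1 / k**2

# The 10-toss coin game from the text: mu = 5, sigma about 1.58
mu_game, sigma_game = binomial_mean_sd(10, 0.5)

# Example 9: n = 1000 tosses, p = 0.5, k = 2
low, high, bound = chebychev_limits(1000, 0.5, 2)
print(f"{low:.2f} <= X <= {high:.2f} with probability at least {bound:.0%}")
# prints: 468.38 <= X <= 531.62 with probability at least 75%
```

Calling `chebychev_limits` with other values of n, p, and k gives the corresponding limits in the same way.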
