Binomial Distributions
Chapter 5.3 – Probability Distributions and Predictions
Mathematics of Data Management (Nelson)
MDM 4U
Authors: Gary Greer (with K. Myers)

Our Problem…
suppose students either like math or they don't
suppose 5% of students like math
if you had 300 students, how likely would it be that 20 of them liked math?
this can be modeled as a binomial distribution
in statistics it is important in looking at how likely a situation is to have occurred randomly
if it is very unlikely to have occurred by chance, it lends support to the significance of a finding

Binomial Experiments
a binomial experiment is any experiment that has the following properties:
  there are n identical trials
  there are two possible outcomes for each trial, termed success and failure
  the probability of success is p and the probability of failure is 1 - p
  the probabilities remain constant from trial to trial
  the trials are independent
repeated trials which are independent and have 2 possible outcomes (success/failure) are called Bernoulli Trials

Bernoulli?
Jakob Bernoulli (Basel, December 27, 1654 – August 16, 1705)
Swiss mathematician
one of the great names in probability theory
one of a family of great minds in a variety of subjects

Binomial Distributions
in a binomial experiment the number of successes in n repeated Bernoulli Trials is a discrete random variable (usually called X)
X is termed a binomial random variable and its probability distribution is called a binomial distribution
the following formula provides a method of solving highly complex situations involving probability

Binomial Probability Distribution
consider a binomial experiment in which there are n Bernoulli trials, each with a probability of success of p
the probability of k successes in the n trials is given by:
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

Example 1
Consider a game where a coin is flipped 5 times. You win the game if you get exactly 3 heads. What is the probability of winning?
we will let heads be a success
n = 5, p = ½, k = 3
$$P(X = 3) = \binom{5}{3}\left(\frac{1}{2}\right)^{3}\left(1 - \frac{1}{2}\right)^{5-3} = 10\left(\frac{1}{2}\right)^{3}\left(\frac{1}{2}\right)^{2} = 10\left(\frac{1}{2}\right)^{5} = \frac{10}{32} = \frac{5}{16}$$

Example 1 continued
suppose the game is changed so that you win if you get at least 3 heads
what is the probability of winning now?
$$P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5)$$
$$= \frac{5}{16} + \binom{5}{4}\left(\frac{1}{2}\right)^{5} + \binom{5}{5}\left(\frac{1}{2}\right)^{5} = \frac{5}{16} + \frac{5}{32} + \frac{1}{32} = \frac{16}{32} = \frac{1}{2}$$

The Batting Example
the Expected Value of a binomial experiment that consists of n Bernoulli trials with a probability of success, p, on each trial is
$$E(X) = np$$
Example: Consider a baseball player who has a batting average of 0.292
this means that his probability of getting a hit each time he is at bat is 0.292
let a hit be a success where p = 0.292

a. What is the probability of no hits in the next 5 at bats?
p = 0.292, n = 5, k = 0
$$P(X = 0) = \binom{5}{0}(0.292)^{0}(0.708)^{5} = (1)(1)(0.178) = 0.178$$
so there is a 0.178 probability that there will be no hits in 5 times at bat

b. What is the probability of 2 hits in the next 8 at bats?
p = 0.292, n = 8, k = 2
$$P(X = 2) = \binom{8}{2}(0.292)^{2}(0.708)^{6} = (28)(0.085)(0.126) \approx 0.300$$
so there is a 0.300 probability that there will be 2 hits in 8 times at bat

c. What is the probability of at least 1 hit in the next 10 at bats?
p = 0.292, n = 10
$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - \binom{10}{0}(0.292)^{0}(0.708)^{10} = 1 - (1)(1)(0.032) = 0.968$$
so there is a 0.968 probability that there will be at least 1 hit in 10 times at bat

d. What is the expected number of hits in the next 10 at bats?
E(X) = np
E(X) = (10)(0.292) = 2.92 → 3
therefore the player can expect to get about 3 hits in the next 10 at bats

Exercises / Homework
Homework: page 299 #1, 3, 7, 8, 9, 10, 11, 12

Normal Approximation of the Binomial Distribution
Chapter 5.4 – Probability Distributions and Predictions
Mathematics of Data Management (Nelson)
MDM 4U
Authors: Gary Greer (with K. Myers)

Recall…
the probability of k successes in n trials (where p is the probability of success) is
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
this formula can only be used if we have a binomial distribution:
  each trial is identical
  the outcomes are either success or failure

This calculation is easy in simple cases…
find the probability of 30 heads in 50 trials
$$P(\text{30 heads in 50 trials}) = \binom{50}{30}(0.5)^{30}(1 - 0.5)^{50-30} \approx 0.042$$
so there is about a 4.2% chance
however, if we wanted to find the probability of tossing between 20 and 30 heads in 50 trials, we would need to perform at least 10 of these calculations
there is an easier way, however

Graphing the Binomial Distribution
if the distribution is normal, we can solve complex problems in the same way we did in the last chapter
the question is: is the binomial distribution a normal one?
it turns out that if the number of trials is relatively large, the binomial distribution approximates a normal curve

What does it look like?
when graphed, the distribution of the probabilities of heads looks like this
what will the mean be?
what will the standard deviation be?
[Figure: Binomial Distribution, line scatter plot of probability (0.00 to 0.12) against number of heads (0 to 55)]

So how do we work with all this?
it turns out that a binomial distribution can be approximated by a normal distribution if:
  np > 5 and n(1 - p) > 5
if this is the case, the distribution is approximated by the normal distribution $N(\bar{x}, \sigma^2)$, where
$$\bar{x} = np \quad \text{and} \quad \sigma = \sqrt{np(1 - p)}$$

But doesn't a normal curve represent continuous data and a binomial distribution represent discrete data?
Yes!
so to use a normal approximation we have to consider a range of values rather than specific discrete values
for example, the discrete value 5 is represented by the range of continuous values between 4.5 and 5.5

Example 1
Tossing a coin 50 times, what is the probability that you will get tails fewer than 20 times?
let success be tails, so n = 50 and p = 0.5
now we can find the mean and the standard deviation
$$\bar{x} = 50(0.5) = 25$$
$$\sigma = \sqrt{50(0.5)(1 - 0.5)} = \sqrt{12.5} \approx 3.54$$

Example 1 continued
we will consider the range 0 to 19.5 (values below 20) and use its upper bound to calculate a z-score
$$z = \frac{19.5 - 25}{3.54} = -1.55$$
therefore P(X < 19.5) = P(z < -1.55) = 0.0606
there is about a 6% chance of fewer than 20 tails in 50 attempts

In terms of the normal curve, it looks like this
all the values less than 19.5 are found in the shaded area
[Figure: normal curve with the area to the left of 19.5 shaded; 19.5 and the mean 25.0 are marked on the axis]

Example 2
Two dice are rolled and the sum recorded 40 times. What is the probability that a sum greater than 6 occurs in at least half of the trials?
let p be the probability of getting a sum greater than 6
p = 6/36 + 5/36 + 4/36 + 3/36 + 2/36 + 1/36
p = 21/36 = 7/12
now we can do some calculations

Example 2 continued
check that the normal approximation can be used:
$$np = 40\left(\frac{7}{12}\right) \approx 23.33 > 5 \qquad n(1 - p) = 40\left(\frac{5}{12}\right) \approx 16.67 > 5$$
$$\bar{x} = np \approx 23.33 \qquad \sigma = \sqrt{np(1 - p)} = \sqrt{9.72} \approx 3.118$$
"at least half of the trials" means X ≥ 20, so the continuous range starts at 19.5
$$P(X \ge 20) \approx P(x > 19.5), \qquad z = \frac{19.5 - 23.33}{3.118} \approx -1.23, \qquad P(z > -1.23) = 0.8907$$
the probability of getting a sum greater than 6 on at least half of the trials is about 89%

Example 3
you have a drawer with one blue mitten, one red mitten, one pink mitten and one green mitten
if you closed your eyes and picked a mitten at random 200 times (with replacement), what is the probability of choosing the pink mitten between 50 and 60 times?
so, success is considered to be drawing a pink mitten, with n = 200 and p = 0.25

Example 3 continued
check to see whether the normal approximation can be used
np = 200(0.25) = 50
n(1 - p) = 200(0.75) = 150
since both of these are greater than 5, the binomial distribution can be approximated by the normal curve
now find the mean and standard deviation

Example 3 continued
$$\bar{x} = np = 200(0.25) = 50$$
$$\sigma = \sqrt{np(1 - p)} = \sqrt{200(0.25)(0.75)} = \sqrt{37.5} \approx 6.124$$
First case:
$$z = \frac{49.5 - 50}{6.124} \approx -0.08, \qquad P(z < -0.08) = 0.4681$$
Second case:
$$z = \frac{60.5 - 50}{6.124} \approx 1.71, \qquad P(z < 1.71) = 0.9564$$
the probability of having between 50 and 60 pink mittens drawn is 0.9564 - 0.4681 = 0.4883, or about 49%

Exercises / Homework
Read the example on page 310
Do page 311 #4-10
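
The calculations in both chapters can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names (`binom_pmf`, `binom_range`, `normal_cdf`, `normal_approx_range`) are made up for illustration, and the normal probabilities are computed with `math.erf` instead of a z-table, so the results may differ from the table-based values above in the third decimal place.

```python
from math import comb, erf, sqrt


def binom_pmf(k, n, p):
    """Exact P(X = k) for n Bernoulli trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_range(lo, hi, n, p):
    """Exact P(lo <= X <= hi), adding one binomial term per value of k."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))


def normal_cdf(z):
    """Standard normal probability P(Z < z), computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_approx_range(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) using the normal curve.

    The discrete range lo..hi is widened to lo - 0.5 .. hi + 0.5,
    the same adjustment used in the worked examples.
    """
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("need np > 5 and n(1 - p) > 5 for the approximation")
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    upper = normal_cdf((hi + 0.5 - mean) / sd)
    lower = normal_cdf((lo - 0.5 - mean) / sd)
    return upper - lower


if __name__ == "__main__":
    # Coin game: exactly 3 heads in 5 flips, then at least 3 heads.
    print("P(X = 3), n = 5:", binom_pmf(3, 5, 0.5))
    print("P(X >= 3), n = 5:", binom_range(3, 5, 5, 0.5))

    # Mitten example: exact sum of 11 terms versus the normal approximation.
    print("exact  P(50 <= X <= 60):", round(binom_range(50, 60, 200, 0.25), 4))
    print("normal P(50 <= X <= 60):", round(normal_approx_range(50, 60, 200, 0.25), 4))
```

Printing the exact sum next to the approximation shows how close the normal curve comes when np and n(1 - p) are both well above 5. The explicit check inside `normal_approx_range` is a design choice in this sketch, so that the shortcut is not applied to a case such as n = 5 where it would not be justified.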