The Binomial Distribution

A motivating example…
• 35% of Canadian university students work more than 20 hours/week in jobs not related to their studies. This can have a serious impact on their grades. What is the probability that I have at least 5 such students in this class?
• Answer: There is better than a 99% chance!

What is a Binomial Distribution?
• Any random statistic that can be cast in a “yes/no” format where:
  • n successive choices are independent
  • “yes” has probability p and “no” has probability 1 − p
  fits a binomial distribution.
• Suggest 3 other examples of data sets that can be modeled as binomial distributions.

Looking a bit deeper…
• Suppose someone offered you the following “game”: Toss a coin 5 times. If you get 3 heads I pay you a dollar, otherwise you pay me 50 cents.
• Should you accept the bet?
• What is your expected return on this bet?
• How can we calculate the odds?

Pascal to the rescue!
• There are exactly 10 ways to get 3 heads in 5 tosses.
• What is the probability of flipping 6 tails in 8 trials?

How to generate Pascal’s Triangle
• Pascal’s triangle “unlocks” the mystery of binomial distributions.
• The cells in the triangle represent binomial coefficients, which also count all possible “yes/no” combinations.
• In “math-speak” we use the following notation to calculate the number of ways “k” events can occur in “n” choices: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

Factorial notation
• 5! = 5 × 4 × 3 × 2 × 1 = 120
• How many ways can 3 people be selected from a class of 39?

Math detail (FYI)
• The general binomial probability is: $P(k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}$
• Example: B(9, 0.4), what is P(5)?
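The questions posed on the slides above can be checked with a few lines of Python. This is only a sketch using the standard library: the helper names `binom_pmf` and `pascal_row` are made up for illustration, and the coin game is read as paying out on exactly 3 heads, which is an assumption about the wording.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pascal_row(n):
    """Row n of Pascal's triangle: the binomial coefficients C(n, 0) .. C(n, n)."""
    return [comb(n, k) for k in range(n + 1)]


# Row 5 of the triangle; the entry for k = 3 is the "10 ways to get 3 heads".
row5 = pascal_row(5)
ways_3_heads = comb(5, 3)

# Coin game (assumption: "3 heads" means exactly 3 heads in 5 fair tosses).
p_win = binom_pmf(3, 5, 0.5)
expected_return = 1.00 * p_win - 0.50 * (1 - p_win)  # dollars per game

# Other questions from the slides.
p_6_tails_in_8 = binom_pmf(6, 8, 0.5)   # 6 tails in 8 fair tosses
ways_3_of_39 = comb(39, 3)              # 3 people chosen from a class of 39
p5_b9 = binom_pmf(5, 9, 0.4)            # B(9, 0.4), P(5)

print("row 5:", row5)
print("ways to get 3 heads in 5 tosses:", ways_3_heads)
print("P(3 heads in 5):", p_win)
print("expected return per game ($):", expected_return)
print("P(6 tails in 8):", p_6_tails_in_8)
print("ways to pick 3 of 39:", ways_3_of_39)
print("B(9, 0.4), P(5):", round(p5_b9, 4))
```

Under the "exactly 3 heads" reading, the expected return comes out slightly negative (about 3 cents lost per game), so the bet is not worth taking.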
• The Binomial Table is built from these terms.

How to use the binomial distribution
• Assign “yes” and “no” and their respective probabilities to the instances in your problem.
• Assign “n” and “k” and either use the formula, look up the value in a table, or use a stats package (Excel works well).
• Example 5.5, 3 ways: look up in table, use formula, use Excel.
• $P(3) = \binom{15}{3} (0.3)^{3} (0.7)^{12} = 0.1700$

From Binomial to Normal Distributions
• Binomial is a discrete probability distribution.
• Normal is a continuous distribution.
• When n becomes very large we can often approximate by using a $N(\mu, \sigma)$ distribution with $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$.
• How large is “large”? Rule of thumb: when np ≥ 10 and n(1 − p) ≥ 10 we can use the Normal Distribution approximation.

Sample Proportions…
• We are often interested in knowing the proportion of a population that exhibits a specific property (statistic). We denote the sample proportion this way: $\hat{p} = \frac{\text{count of successes}}{\text{size of sample}} = \frac{X}{n}$
• $\hat{p}$ is a proportion (often interpreted as a probability) and is therefore a number between 0 and 1.

Mean and Standard Deviation of a Sample Proportion
• If p is the proportion of “successes” in the population and $\hat{p}$ is the sample proportion from a large SRS of size n, then: $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
• Look at Example 5.7.

Working through some examples…
• 5.19: ESP
  • A) ¼ = 0.25
  • B) p(10) + p(11) + … + p(20), or 1 − [p(0) + … + p(9)]; this can be read from Table C or done in Excel.
  • C) Use $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$. You would expect 5 correct choices with a standard deviation of 1.936.
  • D) Since the subject knows that all 5 of the shapes are on the card, the choices are no longer random and hence a binomial model is not appropriate; this was not the case in parts A to C.
• 5.21
  • A) Just use $\mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$.
  • B) Now use $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$.
  • C) For $\hat{p} = 0.24$: $z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.24 - 0.2}{0.01265} = 3.16$
  • D) p = 0.01 gives z = 2.33; use $z = \frac{X - \mu}{\sigma}$, so $X = \mu + \sigma z$.

Odds on the Oil!
• In order to make the play-offs, the Oilers must win 12 of their remaining 17 games. What is the probability that they will be successful? They have currently won 33 of the past 63 games.
• Step 1: re-word as a binomial distribution question, identify “n” and “k”.
• Decide on what probabilities you will need to calculate.
• Use either tables, Minitab or Excel.

Odds on the Oil! (normal approximation)
• Let’s look at using the normal approximation to solve this:
• In order to make the playoffs the Oilers must have a better winning average than 33/63!
• However, at their current rate, how many of the 17 games do you expect them to win? What’s the standard deviation of this?
• Determine a z-score from this and comment on the likelihood of the Oilers’ success.
• Look at the sub-section “continuity correction” on pg 379 to help answer this.
• Should we expect this to give a reasonable answer?

• 5.24
  • Identify relevant statistics: n = 1500, p = 0.7
  • A) $\mu_X = np = (1500)(0.70) = 1050$
  • B) z = (1000 − 1050)/17.748, better than 99% chance
  • C) z = (1200 − 1050)/17.748, NO CHANCE!
  • D) $\mu_X = np = 1190$ and $\sigma_X = 18.89$; the chance that more than 1200 accept is now pretty good (p = 0.2892)

In conclusion…
• Be sure that you understand what a binomial distribution is and when it can be applied.
• Be able to use the probability equation on page 382.
• Know how to read and apply a binomial probability table (Appendix C).
• Know what Pascal’s triangle is and how it relates to binomial distributions.
• Be able to relate the binomial distribution to the normal distribution, and know when you can approximate it with a normal-distribution z-score analysis.
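For the Oilers question, the exact binomial answer and the normal approximation can be compared side by side. The sketch below is illustrative only and rests on assumptions that are not stated in the slides: “win 12 of 17” is read as at least 12 wins, every game is treated as independent with the same win probability 33/63, and the helper names (`binom_tail`, `normal_cdf`, `normal_ok`) are invented for this example.

```python
from math import comb, erf, sqrt


def binom_tail(k_min, n, p):
    """Exact P(X >= k_min) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_ok(n, p):
    """Rule of thumb from the slides: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10


n, k_min = 17, 12
p = 33 / 63  # assumption: the past win rate carries over to every remaining game

mu = n * p                       # expected number of wins
sigma = sqrt(n * p * (1 - p))    # standard deviation of the number of wins

# Continuity correction: "at least 12 wins" starts at 11.5 on the continuous scale.
z = (k_min - 0.5 - mu) / sigma
approx = 1.0 - normal_cdf(z)
exact = binom_tail(k_min, n, p)

print("expected wins:", round(mu, 2), "std dev:", round(sigma, 2))
print("z with continuity correction:", round(z, 2))
print("normal approximation:", round(approx, 4))
print("exact binomial tail:", round(exact, 4))
print("rule of thumb satisfied for the Oilers?", normal_ok(n, p))
print("rule of thumb satisfied for 5.24?", normal_ok(1500, 0.7))
```

With n = 17 the rule of thumb is not met (np is about 8.9), so the exact tail is the safer number to quote; for exercise 5.24 (n = 1500, p = 0.7) the rule is comfortably satisfied.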