Math 210G.M02, Fall 2016
Lecture 6: Combinatorial aspects of probability

Part I: Binomial distributions

An unfair coin
• A fair coin is one that comes up heads with a probability equal to ½. How can we tell that a coin is fair?
• Suppose that a coin is slightly unfair. For example, the probability it comes up heads is 49/100. How can we tell that the coin is slightly unfair?
• These are not meaningless questions: they apply to a lot of questions in which precise likelihoods are important to know.

Roulette
Odds of all black or red
• All numbers 0-36: 18 of them red, 18 black and one green
• Odds of all black: 18/37
• Casino's odds: 19/37
• Casino's advantage: 19:18. On average, for every 18 times it lands on black, it will not land on black 19 times. Better than a 5 percent advantage.
• You would need to toss a coin a lot of times to determine it is unfair, if the odds of heads are 19/37.

• Binomial distributions refer to distributions of probabilities of outcomes for a sum of random variables each having one of two possible values or outcomes.
• Example: flipping one coin: H or T
• Count 1 for H and 0 for T.
• Example: flipping N coins and adding up the number of H.
• This sum can take any value between 0 and N. We are interested in the probability for each possible value.
• Likelihood: how likely is it that a coin, flipped N times, and coming up heads M times, is a fair coin?
• Similar questions arise in real life:
• How likely is it that a coin is fair…
• … or that a defendant is guilty, given the evidence presented in the trial
• or that a drug is safe and effective, given results of clinical trials
• or that some law discriminates unfairly against some segment of the population, given pertinent data

Probability versus Likelihood
• Cardano's classical definition of probability: If the total number of possible outcomes, all equally likely, associated with some action is n, and if m of those n result in the occurrence of some given event, then the probability of that event is m/n.
• How do we know if certain events are equally likely?
• In the case of a coin this can lead to circular reasoning.

Is the coin fair?
• A fair coin should land on heads 50% of the time…in the long run.
• A fair coin tossed once will either land on heads or on tails.
• From one trial it is impossible to determine if the coin is fair.
• A fair coin, tossed 100 times, might land on heads 60 times, but how likely is that to happen?

Likelihood
• Occam's razor says that we should use the simplest model that fits the data.
• If a coin comes up heads 60 times in 100 trials, then the simplest hypothesis is that the coin is biased towards landing on heads.
• However, suppose the coin looks exactly like other coins that, in our experience, are fair coins.
• Now we have two sets of data: the results of the coin flips, and how the coin looks.
• What is the simplest model that fits the data?

What's the lesson here?
• In games of chance (poker, craps, etc.) the problem is to apply combinatorial calculations to determine probabilities of given events.
• In other parts of life, the bigger problem is to try to find the most likely model that explains the data.
• Often personal experience misleads us: I got a C in logic. My friend got a C in logic. It's impossible to get an A in logic.
• When do we have enough information?
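The question of how likely a fair coin is to land on heads 60 times out of 100 can be answered directly from the binomial formula. The following is a minimal sketch, not part of the original slides; the function names `binomial_pmf` and `tail_probability` are illustrative assumptions.

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent flips with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail_probability(n: int, k_min: int, p: float = 0.5) -> float:
    """Probability of k_min or more heads in n independent flips."""
    return sum(binomial_pmf(n, k, p) for k in range(k_min, n + 1))


# A fair coin tossed 100 times: exactly 60 heads, and 60 or more heads.
exactly_60 = binomial_pmf(100, 60)
at_least_60 = tail_probability(100, 60)
print(f"P(exactly 60 heads)  = {exactly_60:.4f}")
print(f"P(60 or more heads)  = {at_least_60:.4f}")
```

Both numbers are small, which is why 60 heads in 100 flips raises doubts about fairness without settling the matter.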
If you flip a penny 100 times, how many heads and tails do you expect?

Coin flipping sites
• Random.org: Click on games and coin flipper. Does not tally heads.
• BTWaters: effective for looking at results of large numbers of coin flips. Posts historical data.
• Interactive coin tosser. Shows sequence of T/H.

What are reasonable numbers of heads and tails to expect?
• Example: In the 2000 general election in the state of Florida, Gore obtained 2,912,253 votes and Bush obtained 2,912,790 votes. Total B&G votes: 5,825,043.
• Difference: 537
• Exercise: flip a fair coin 5,825,043 times. How many times does it come up heads?

Binomial distribution
• Independent events: the outcome (H,T) of the second coin does not depend on the outcome of the first.
• Typical sequence of results of 10 flips: HTTHTTTHTH
• Given N fair coins, the probability of any given outcome sequence is (1/2)^N.
• The probability of HTTHTTTHTH is (1/2)^10 = 1/1024.
• This is the same as the probability of HHHHHHHHHH.

What does typical mean?
Order matters

What if order doesn't matter?
• Two coins: the possible outcomes are TT or TH or HT or HH.
• Each has probability ¼.
• The probability of one head and one tail is equal to ½ since it can happen two different ways.

Clicker question 1
• If you flip three coins, what is the probability that they all come up heads?
A) ½ B) ¼ C) 1/8 D) 3/8

Clicker question 2
• If you flip three coins, what is the probability that exactly TWO of them come up heads?
A) ½ B) ¼ C) 3/8 D) ¾ E) None of these

Clicker question 3
• If you flip three coins, what is the probability that exactly ONE of them comes up heads?
A) ½ B) ¼ C) 3/8 D) ¾ E) None of these

Choosing subsets
• A set of N elements has 2^N subsets if we include the empty set and the whole set.
• Think of the set as a set of N coins and the "chosen" subset as the ones that will be heads.
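For small N, the subset picture can be checked by brute force: list all 2^N equally likely H/T sequences and group them by number of heads. This is a sketch under that assumption of equally likely sequences; the helper name `heads_distribution` is hypothetical and not from the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins: int) -> dict:
    """Map each possible head count to its exact probability for n_coins fair coins."""
    # Every ordered sequence of H/T is one equally likely outcome.
    outcomes = list(product("HT", repeat=n_coins))
    counts = Counter(seq.count("H") for seq in outcomes)
    total = len(outcomes)  # equals 2**n_coins
    return {k: Fraction(counts[k], total) for k in range(n_coins + 1)}


# Two coins: one head and one tail can happen two ways (HT or TH).
for heads, prob in heads_distribution(2).items():
    print(f"{heads} heads: {prob}")
```

Calling `heads_distribution(3)` gives the distribution needed for the three-coin clicker questions.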
Binomial coefficients

Factorials
• N! = N x (N-1) x … x 2 x 1

Stirling's approximation
• N! is approximately sqrt(2 pi N) x (N/e)^N, where e is Euler's number.

N choose k
• N choose K equals…
• N-1 choose K plus N-1 choose K-1
• N choose K is the number of distinct ways in which to choose K elements from a set of N elements.
• Fix one element. If it is not chosen, all K must be from the remaining N-1. If it is chosen, the remaining K-1 must come out of the remaining N-1.

N choose k
Math 210, March 1, 2016
Fill in the next row of Pascal's triangle
Name_____________ Name_____________ Name_____________ Name______________

Group assignment
• Please compute the next row of the table above. Turn in a sheet of paper with the title "The number of ways of choosing N elements from a set of 11 elements", the numbers from left to right, the names of those in your group, and today's date. You will be given 5 minutes for this. You are allowed to use your cell phone, but must turn off your phone when you are finished.

Clicker question
• How many ways are there to choose one element from a set of 11 elements?
• A) 1
• B) 5
• C) 6
• D) 11

Clicker question
• How many ways are there to choose 5 distinct elements from a set of 11 distinct elements?

N Choose K revisited
• The number on the left is the same as "n choose k".
• This formula is useful for computing the binomial coefficient n choose k when n is large.

Example
• 52 choose 5: the number of ways to choose a 5 card poker hand from a set of 52 poker cards.
• 52 choose 5 = 52x51x50x49x48/(5x4x3x2x1) = 2,598,960

Quetelet's graph

Plotting Pascal's triangle
• The web page http://www.ams.org/samplings/featurecolumn/fcarc-normal shows plots of the numbers in several rows of Pascal's triangle.
• For large row numbers, the row plots look like a bell-shaped curve.

Part II: Likelihood revisited

Histograms

Normal approximation to binomial

Normal approximations, N=10, 100, 1000
Red curves give the idealized normal approximation for a fair coin flipped N times. Blue histograms give probabilities of outcomes for N flips of a biased coin that has a 70% chance of landing on heads.
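The bell shape of large Pascal's triangle rows can be checked numerically. The sketch below builds row N with the recurrence "N choose K equals N-1 choose K plus N-1 choose K-1", divides by 2^N to get fair-coin probabilities, and compares them with a normal curve. The mean N/2 and standard deviation sqrt(N)/2 are the standard values for a fair coin, stated here as an assumption since the slides do not give them; the function names are illustrative.

```python
from math import exp, pi, sqrt


def pascal_row(n: int) -> list:
    """Row n of Pascal's triangle, built with the N choose K recurrence."""
    row = [1]
    for _ in range(n):
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row


def normal_density(x: float, mean: float, sd: float) -> float:
    """Value of the normal (bell) curve with the given mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))


def max_gap(n: int) -> float:
    """Largest difference between fair-coin probabilities and the normal curve."""
    row = pascal_row(n)
    total = 2**n
    mean, sd = n / 2, sqrt(n) / 2
    return max(abs(c / total - normal_density(k, mean, sd)) for k, c in enumerate(row))


print(pascal_row(11))
for n in (10, 100, 1000):
    print(f"N = {n:4d}: largest gap between row and normal curve = {max_gap(n):.6f}")
```

The gap shrinks as N grows, which is the numerical counterpart of the row plots approaching a bell-shaped curve.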
Overlapping distributions
• The distributions illustrate different probabilities. When two distributions have a lot of overlap, it is not clear whether an event should be associated with one distribution as opposed to the other.
• Conversely, when two distributions are separated, the chance that an event will mistakenly be associated with the wrong one is very small.
• For the fair coin problem, the distributions become more separated as the number of trials is increased.
• We will see later that normal curves give a means to calculate overlaps and associate probabilities to them.

Likelihood
• In statistics, a likelihood function is a function of the parameters of a statistical model, defined as follows: the likelihood of a set of parameter values given some observed outcomes is equal to the probability of those observed outcomes given those parameter values. Likelihood is a function of the parameters, with the observed data held fixed.

Law of large numbers
• The law of large numbers states that if X1, X2, …, Xn are independent samples of a random variable, then the average value approaches the expected value as the number of trials tends to infinity.
• In the case of a fair coin, counting 1 for heads and 0 for tails, the average value after a large number of trials should approach ½.

Binomial/Normal approximation
• Flipping a coin n^2 times is the same as flipping n coins n times.
• Let X_n be the proportion of heads: the number of heads divided by n.
• The expected value of X_n is the same as the expected value of X_{n^2}.
• But how does the variance of X_n depend on n?
• Answer: for a fair coin the variance of X_n is 1/(4n), so it shrinks as n grows.

Gambler's fallacy
• The law of large numbers is sometimes misinterpreted as suggesting that if a coin comes up tails (or a similar unfavorable event occurs) several consecutive times, then the coin is more likely to come up heads the next time. This contradicts the hypothesis of independence.
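Returning to the variance question on the binomial/normal approximation slide: for a fair coin the variance of the proportion of heads X_n can be computed exactly from the binomial probabilities. The following is a minimal sketch using exact fractions; `proportion_variance` is a hypothetical helper name, and the fair coin (mean ½) is an assumption of the example.

```python
from fractions import Fraction
from math import comb


def proportion_variance(n: int) -> Fraction:
    """Exact variance of X_n = (number of heads) / n for n flips of a fair coin."""
    mean = Fraction(1, 2)  # expected proportion of heads for a fair coin
    total = 2**n           # number of equally likely H/T sequences
    return sum(
        Fraction(comb(n, k), total) * (Fraction(k, n) - mean) ** 2
        for k in range(n + 1)
    )


for n in (10, 100, 1000):
    print(f"n = {n:4d}: variance of X_n = {proportion_variance(n)}")
```

The printed fractions follow the pattern 1/(4n): the expected proportion stays at ½ while the spread around it narrows, which is what the law of large numbers describes.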
• The full central limit theorem indicates that as the sample size N increases, the distribution of the sample average of these binomial “random variables” approaches the normal distribution. • The central limit theorem was postulated by Abraham de Moivre who, in a remarkable article published in 1733, used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. • This finding was ahead of its time, and nearly forgotten until Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie Analytique des Probabilités, published in 1812. Laplace expanded De Moivre's finding by approximating the binomial distribution with the normal distribution. • As with De Moivre, Laplace's finding received little attention in his own time. It was not until the nineteenth century was at an end that the importance of the central limit theorem was discerned, when, in 1901, Russian mathematician Aleksandr Lyapunov defined it in general terms and proved precisely how it worked mathematically. Nowadays, the central limit theorem is considered to be the unofficial sovereign of probability theory. 
Galton board illustrated

Part III: More fun and games

Second application: card games
• 5 card poker hands
• The number of ways of choosing 5 cards from a set of 52 cards is "52 choose 5"
• = 2,598,960

Probabilities as proportions
• Number of favorable outcomes divided by total number of possible outcomes
• Chance of 4 of a kind: 13x48 = 624 out of 2,598,960
• 0.00024
• 240 out of a million

Possible poker hands
Hand                      Number of hands
Straight flush            40
Four of a kind            624
Full house                3,744
Flush (nonconsecutive)    5,108
Straight (mixed)          10,200
Three of a kind           54,912
Two pairs                 123,552
One pair                  1,098,240
No pairs                  1,302,540
Total                     2,598,960

How to figure…
• The number of ways to get a straight…
• Starting rank: 10 possible: A, K, Q, J, 10, 9, 8, 7, 6, 5
• Number of ways from a given starting rank: 4x4x4x4x4 = 1024
• Total: 10,240
• Subtract straight flushes: 10,200

How to figure…
• The number of ways to get 3 of a kind…
• Rank: 13 possible
• Number of a given rank: "4 choose 3" = 4
• Number of possibilities of remaining two cards that do not give a pair: 48x44/2
• Total: 13x4x48x22 = 54,912

Problem
• Show how to determine the number of ways in which to get a poker hand containing exactly a pair.

Clicker question
• Which 5 card poker hand has the greatest odds?
A) Full house B) Straight C) Flush D) Two pair

Clicker question
• The number of distinct poker hands that have two pair but not three of a kind or higher:
A) 127,920 B) 123,552 C) 1,098,240 D) 247,105

Math 210, March 3, 2016
Computing three card guts hands
Name_____________ Name_____________ Name_____________ Name______________
• Three card guts is a poker game that involves three cards. Straights and flushes are not counted. The best possible hand is three of a kind.
• To do: figure out with your mates how many three card guts hands there are, and how many of them have a pair or better.
• Turn in: your solutions, today's date and names of those in your group.
• Number of possible hands is 52 choose 3. Compute this.
• Number of possible pairs: first compute the number of different ways to get two different ranks, thirteen choose two. Then compute the number of ways to get a pair in one rank and to get a card of the other rank. Multiply these. Then multiply by two since the pair can be at the higher rank or at the lower rank.

Exercise 0
• A fair coin is flipped 10 times.
• What is the probability that it will come up heads 5 times?
• What is the probability that it will come up heads 6 times?
• What is the probability that it will come up heads 7 times?
• What is the probability that it will come up heads 8 times?

Exercise 0: solution
• A fair coin is flipped 10 times. There are 2^10 = 1024 possible outcomes.
• The number of outcomes that result in 5 heads is the same as the number of ways of choosing 5 objects (the coins that come up heads) from 10 objects. 10 choose 5 is 252, so the probability is p = 252/1024 = 0.2461
• 10 choose 6 is 210, so p = 210/1024 = 0.2051
• 10 choose 7 = 120, so p = 120/1024 = 0.1172
• 10 choose 8 = 45, so p = 45/1024 = 0.0439

Exercise 1
• A fair coin is flipped 100 times. What is the probability that the coin lands on heads exactly 50 times? What is the probability that the coin lands on heads 51 or more times? What is the probability that the coin lands on heads 49 or fewer times?

Exercise 1 solution
• A fair coin is flipped 100 times. What is the probability that the coin lands on heads exactly 50 times? The probability is (100 choose 50)/2^100, which is approximately 0.0796.
• What is the probability that the coin lands on heads 51 or more times? What is the probability that the coin lands on heads 49 or fewer times?
• The two cases are symmetric: landing on heads 51 or more times is the same as landing on tails 49 or fewer times, and so has the same probability as landing on heads 49 or fewer times. Therefore, the probability of landing on heads 51 or more times is 1 minus the probability of landing on heads exactly 50 times, divided by two, or (1 - (100 choose 50)/2^100)/2, which is approximately 0.4602.

Exercise 2
• Compute the numbers 10 choose 2 and 10 choose 8.
• Compute the numbers 10 choose 5, 10 choose 4 and 10 choose 6.
• Compute the numbers 52 choose 47 and 52 choose 5.

Exercise 2 solution
• Compute the numbers 10 choose 2 and 10 choose 8. 10 choose 2 = 10x9/(2x1) = 45.
• Similarly, 10 choose 8 = 45, since choosing 8 elements is the same as choosing the 2 elements to leave out.
• Compute the numbers 10 choose 5, 10 choose 4 and 10 choose 6. 10 choose 5 = 252, and 10 choose 4 = 10 choose 6 = 210.
• Compute the numbers 52 choose 47 and 52 choose 5. 52 choose 47 = 52 choose 5 = 2,598,960.

Exercise 3
• 4 kids want to play a game of two on two basketball. How many ways are there to divide the four players into two teams of two players each?
• 10 kids want to play a game of 5 on 5. How many different ways are there of dividing the 10 players into two teams of 5 each?

Exercise 3 solution
• The number of ways to choose a team with two players is 4 choose 2, which is 6. However, once one team of two is chosen, the other team is also set, so half of these 4 choose 2 choices are the other team. So there are actually 6/2 = 3 ways.
• The number of ways to choose a team of 5 players from 10 kids is 10 choose 5, which is 252. However, once one team is chosen, the other team is automatically set, so half of these choices correspond to the other team. Hence there are actually 252/2 = 126 ways.

Exercise 4
• How many distinct three card guts hands are there?
• How many three card guts hands contain three of a kind?
• How many three card guts hands contain a pair but not three of a kind?
• How many three card guts hands do not contain any pairs?
Exercise 4 solution
• The number of distinct three card guts hands is 52 choose 3, which is (52x51x50)/(3x2x1) = 22,100
• In a three card hand, there are 4 choose 3 = 4 ways to get three of a given rank, and there are 13 ranks, so there are 4x13 = 52 ways to get three of a kind.
• There are 13 choose 2 = 78 ways to get two ranks. There are 4 choose 2 = 6 ways to get a pair of a given rank and there are 4 ways to get a card of a second rank. We have to consider that the pair can be at the higher or lower rank, so there are
• 78x6x4x2 = 3,744 such hands
• How many three card guts hands do not contain any pairs? Subtract: 22,100 - 52 - 3,744 = 18,304

Exercise 5
• Explain the difference between a likelihood and a probability.

Exercise 5 solution
• Explain the difference between a likelihood and a probability.
• Probability refers to the probability of an event and assumes that certain outcomes of an event are equally likely. For example, if a coin is assumed to be fair then its probability of landing on heads or tails is one half.
• The term likelihood refers technically to values of parameters. For example, suppose that a coin a priori has an unknown probability of landing on heads. One then estimates the likelihood that it is a fair coin (or has another probability of landing on heads) from data such as tossing it a given number of times and counting the frequency with which it lands on heads.
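The distinction in Exercise 5 can be made concrete with the lecture's running example of 60 heads in 100 flips. The sketch below holds the data fixed and varies the parameter p, the unknown probability of heads. The function name `likelihood` and the list of candidate values are illustrative assumptions, not part of the original exercise.

```python
from math import comb


def likelihood(p: float, heads: int, flips: int) -> float:
    """Likelihood of heads-probability p given the observed number of heads.

    Numerically this is the binomial probability of the data, but here the
    data are held fixed and p is the quantity being varied.
    """
    return comb(flips, heads) * p**heads * (1 - p) ** (flips - heads)


heads, flips = 60, 100                # observed data (the lecture's example)
candidates = [0.4, 0.5, 0.6, 0.7]     # hypothetical parameter values to compare
values = {p: likelihood(p, heads, flips) for p in candidates}
best = max(values, key=values.get)

for p, value in values.items():
    print(f"p = {p}: likelihood = {value:.4f}")
print(f"most likely candidate: p = {best}")
```

Among these candidates the data favor p = 0.6 over the fair coin p = 0.5. As the Likelihood slide points out, other evidence, such as the coin looking like ordinary fair coins, can still weigh against that conclusion.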