Probability
FPP 13-15

Probability
• What statisticians hang their hat on
• Provides a formal framework from which uncertainty can be quantified
• Why study probability in an intro stat course?
  • Lay foundations for statistical inference
  • Train your brain to think in a way that it is not hardwired to do
  • It's quite enjoyable and relaxing

Types of probability
• What exactly is probability?
• There are any number of notions of probability, indicating that probability isn't a thing but a concept
• We could spend a semester philosophizing about probability. If you are interested, I can direct you to some books.
• A non-exhaustive list
  • Laplacian probability
  • Hypothetical limiting relative frequency probability
  • Nomic probability
  • Fiducial probability
  • Epistemic probability
• In this class we will focus on two of these.

Terminology
• Sample space: the set (collection) of all possible outcomes that can happen
• Event: a single outcome or set of outcomes from a sample space
• Probability model: a consistent assignment of a probability to each event in the sample space
• Disjoint events: two events that have no outcomes in common and, thus, cannot both occur simultaneously
• Venn diagrams help visualize the above

Heads up
• Mathematical notation will become a little more prevalent here. You'll need to put forth effort wrapping your brain around it.

Limiting relative frequency
• Most folks call this the frequentist approach
  1. Operations: observation, measurement, or selection that can at least hypothetically be repeated an infinite number of times
  2. Sample space: set of possible outcomes of an operation
  3. Events: subsets of elements in the sample space
• Elements of the sample space (basic outcomes) are equally likely
• Calculation
  1. Let S denote the sample space, E ⊂ S denote an event, and |A| denote the size of any set A
  2. Pr(E) ≡ |E|/|S|
• Upshot
  • Percentage of times an event occurs in repeated realizations of random processes

Epistemic probability
• Oftentimes called "subjective" probability
  • This term is a bit loaded, as it can be argued that "objective" probability doesn't really exist
• Here probability is the degree of belief in the likelihood of an event
• Belief is updated or modified in the light of observed information

Probability
• Why consider two probabilities?
  • Each allows different approaches to incorporating probability in an analysis
  • Each one leads to different types of inference statements
• Is one preferable to the other?
  • This really depends on who you ask
  • There have been (heated) discussions on the appropriateness of both

Notation we will regularly use

Frequency probability
• We focus first on how to use frequency probability in an analysis and will cover epistemic probability later
• Simple motivating example
  • There are 3 red balls and 9 white balls in a hat
  • Pick one ball at random out of the hat
  • Once picked, the ball is not replaced
  • Then pick another ball at random out of the hat

Shorthand for probability
• Define R1 = pick a red ball on the 1st try
• Define R2 = pick a red ball on the 2nd try
• Define W1 = pick a white ball on the 1st try
• Define W2 = pick a white ball on the 2nd try
• Probability of picking a red ball on the 1st try is
  • Pr(R1) = 3/12
• Probability of picking two red balls in two picks without replacing the 1st ball is
  • Pr(R1 and R2) = 6/132

Marginal and joint probability
• The probability of a single event is called a marginal probability
  • Example: Pr(R1)
• The probability of the intersection of two events (both events happening) is called a joint probability
  • Example: Pr(R1 and R2)

Conditional probability
• Say we pick a red ball on the 1st try. The chance we pick a red ball on the 2nd try equals 2/11.
• The probability that an event occurs given that another event occurs is called a conditional probability
• Shorthand: Pr(R2|R1) = 2/11
  • "Probability that R2 occurs given that R1 occurs."

Relating these probabilities
• Pr(R1 and R2) = Pr(R1)Pr(R2|R1)
  • 6/132 = (3/12)(2/11)
• Joint prob. = marginal prob. times conditional prob.
• This is always true

Independent events
• Replace the 1st ball before picking the 2nd. Then
  • Pr(R1) = 3/12
  • Pr(R2|R1) = Pr(R2) = 3/12
• R1 and R2 are called independent events: the occurrence of R1 does not affect the probability of R2.

Independent events
• When events are independent, calculating joint probabilities is fairly easy
• Let events A, B, C, … etc. be independent
  • Pr(A and B and C and … etc.) = Pr(A)Pr(B)Pr(C)Pr(etc.)
• To get joint probabilities you can simply multiply the marginal probabilities
• Why does this work?

Dependent events
• Notice that when sampling without replacement
  • Pr(R2|R1) = 2/11 ≠ 3/12 = Pr(R2)
• When the conditional prob. is not equal to the marginal prob., the events are said to be dependent
  • The occurrence of R1 affects the probability of R2
• Here R1 and R2 are dependent events

Dependent events
• When events are dependent, joint probabilities are harder to compute
• Let A, B, C, …, etc. be dependent events
  • Pr(A and B and C and … etc.) = Pr(A|B,C,etc.)Pr(B|C,etc.)Pr(C|etc.)Pr(etc.)
• To get joint probabilities, you multiply all the conditional probabilities

Independence in sports
• Baseball announcers sometimes say, "The batter has not gotten a base hit in the last four times he's batted. He's due for a hit now."
• What is this statement assuming?

"or" rule
• This is an inclusive "or"
  • That is, A or B = A or B or both
• Pr(A or B) = Pr(A) + Pr(B) - Pr(A and B)

"or" rule
• Pr(R1 or R2) = Pr(R1) + Pr(R2) - Pr(R1 and R2)
               = Pr(R1) + Pr(R2) - Pr(R2|R1)Pr(R1)
               = 3/12 + Pr(R2) - (2/11)(3/12)
• We will come back to Pr(R2) in a couple of slides

"or" rule
• If events are disjoint (i.e. they cannot happen simultaneously), then we can split "or" probabilities into sums of individual probabilities
  • These are also referred to as "mutually exclusive" events
• Pr(W1 or R1) = Pr(W1) + Pr(R1) - Pr(W1 and R1)
               = Pr(W1) + Pr(R1) - 0

Law of total probability
• Law of total probability: for any event B
  • P(A) = P(A and B) + P(A and not B)
  • P(Brown eyes) = P(Brown eyes and Male) + P(Brown eyes and Female)
• R2 occurs in two ways
  1. Red picked 1st and red picked 2nd, OR
  2. White picked 1st and red picked 2nd
• Pr(R2) = Pr(R2 and R1) + Pr(R2 and W1)
         = Pr(R2|R1)Pr(R1) + Pr(R2|W1)Pr(W1)
         = (2/11)(3/12) + (3/11)(9/12) = 33/132 = 3/12
• Notice that W1 implies not R1 (if a white is drawn on the first draw, then you can't get a red on the first draw)

"or"
• Can we compute Pr(drawing at least one red ball)?
  • Pr((R1 and W2) or (W1 and R2) or (R1 and R2))
    = Pr(R1 and W2) + Pr(W1 and R2) + Pr(R1 and R2)
    = (3/12)(9/11) + (9/12)(3/11) + (3/12)(2/11) = 60/132
• Sometimes it is easier to compute the probability of "complements" of events
  • Pr(drawing at least one red ball) = 1 - Pr(no red balls)
  • 1 - Pr(no red balls) = 1 - Pr(W1 and W2) = 1 - (9/12)(8/11) = 60/132

A common confusion
• What's the difference between mutually exclusive and independent?
  • When do I add and when do I multiply?
• Two events are mutually exclusive if the occurrence of one prevents the other from happening
• Under mutual exclusivity
  • Pr(A or B) = Pr(A) + Pr(B)
• Two events are independent if the occurrence of one does not change the chances of the other
• Under independence
  • Pr(A and B) = Pr(A)Pr(B)
  • Pr(A|B) = Pr(A)

Coin toss
• A coin is tossed six times
• Two possible sequences are
  • Sequence 1: H T T H T H
  • Sequence 2: H H H H H H
• Which of the following is correct?
  • Sequence 1 is more likely
  • Sequence 2 is more likely
  • Both sequences are equally likely

Example
• Box A has 30 red and 20 blue marbles
• Box B has 3 red and 2 blue marbles
• Which box, if either, offers the better chance of winning in each of the three scenarios below?
  1. Pick one marble. You win if it is red.
  2. Pick two marbles (without replacement). You win if at least one is red.
  3. Pick three marbles (without replacement). You win if at least one is red.

Example
• One ticket will be drawn at random from each of the two boxes shown below
  • Left box: 1 2 3
  • Right box: 1 2 3 4
• Find the chance that
  • The number drawn from the left box is larger than the one drawn from the right
  • The number drawn from the left box equals the one drawn from the right

Example
• Three cards are dealt from a standard 52-card deck.
  • What is the chance that the first card is a King?
  • What is the chance that the second card is a Queen?
  • What is the chance that the third card is a Jack?
  • What is the chance that the first card is a King and the second card is a Queen and the third a Jack?
• Five cards are dealt from a standard deck. What is the chance that the first four cards are aces and the fifth card is a king?

Example
• A 10-sided die is rolled three times. What is the chance of getting at least one roll with a number bigger than seven?

Example
• True or False
  • A fair 6-sided die is rolled three times. The chance of getting at least one ace equals 1/6 + 1/6 + 1/6 = 1/2
  • If a coin is tossed twice, the chance of getting at least one head is 50%

Let's make a deal
• Game show host Monty Hall presents 3 doors. Behind one door is a fabulous prize; behind the other two doors are a sweet pig and a goat
• Monty knows what is behind each door
• You pick a door
• Monty opens the door with the goat or the pig (but not the door you picked)
• Monty then asks if you want to switch
• Should you switch?

Let's make a deal revisited
• In general, one can't answer the question "what is the probability of winning if I switch, given that I have been shown a goat behind door three?"
• One must be very explicit about the assumptions being made on what Monty Hall's strategies are
  • Will he ever reveal the door hiding a car?
• Two solutions
  • When interested in the unconditional probability
    • Here you always want to switch
  • When interested in the conditional probability
    • Here you can do no better than switching, but the chance of winning depends on Monty's strategy

Birthday problem
• What is the chance that at least two people in your stats 101 lab section share the same birthday?

Birthday problem
• Case study: There have been 44 U.S. Presidents
• Common birth dates
  • Nov. 2: Harding and Polk
• Common death dates
  • July 4: Adams, Jefferson, and Monroe
  • March 8: Fillmore and Taft
  • Dec. 26: Truman and Ford

Pistols at dawn
• Tom Cruise, Nicole Kidman, and Penelope Cruz have gotten into a disagreement about who has the best hair and decide to settle their dispute the only way hair disputes really can be settled: with a three-cornered pistol duel. Of the three, Tom is the worst shot, hitting his target only 30% of the time. Nicole is a markswoman; she never misses her target. Penelope spent a lot of time near the gun shop on Hillsborough Road, so she's had some practice and can hit targets 50% of the time.
• The rules of the duel are simple: they are to fire at targets of their choice in succession, and cyclically, in the order Tom, Nicole, Penelope, Tom, Nicole, Penelope, and so on until only one of them is left standing. On each turn, they get only one shot. If a combatant is hit, he or she no longer participates, either as a shooter or as a target. For example, one possible outcome of the duel is for Tom to shoot at Nicole and hit; then for Penelope to shoot at Tom and miss; then for Tom to shoot at Penelope and miss; then for Penelope to shoot at Tom and hit. Then, Penelope wins.
• Assume that each person is trying to maximize his or her chance of survival. For example, if Nicole has to choose between shooting at Tom and shooting at Penelope, Nicole will shoot at Penelope because Penelope is the more accurate shooter.
• Put yourself in Tom's shoes. You have three strategies to choose from: 1) shoot at Nicole with the intention of hitting her; 2) shoot at Penelope with the intention of hitting her; and 3) shoot at no one so that you have no chance of hitting anyone. Which of these three strategies maximizes Tom's probability of survival?

Sports playoffs
• In playoffs in many sports, the first of two teams to win four games is the winner
  • Wins do not have to be consecutive
  • The series ends after one team wins four games
• How likely is it that a series will last
  • Four games?
  • Five games?
  • Six games?
  • Seven games?

Sports playoffs
• For context, say the two teams are the Celtics and the Lakers in the 2009 NBA finals
• Assume that the outcome of each game is independent of prior games (is this reasonable?)
• For all games, Pr(C) = 0.52 and Pr(L) = 0.48

Bayes rule
• Not in book
• Recall that P(B and A) = P(B)P(A|B) = P(A)P(B|A) = P(A and B)
• After some rearranging we get
  • P(A|B) = P(A)P(B|A) / P(B)

Bayes rule
• A common blood test for AIDS is the EIA. When AIDS antibodies are present, the EIA reports AIDS 99.85% of the time. When AIDS antibodies are not present, the EIA reports no AIDS 99.4% of the time.
• It is estimated that about 900,000 out of the 280,000,000 people in the U.S. have AIDS.
• A person takes the EIA test and it reports that he has AIDS. How likely is he to have AIDS?
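As a rough check on the EIA question, Bayes rule can be applied directly to the three numbers given. The sketch below is illustrative only: the function name `posterior_given_positive` and its parameter names are made up for this example, with A standing for "has AIDS" and P for "tests positive". Pr(P) is obtained by adding the true-positive and false-positive pieces (the law of total probability).

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """Return Pr(A | P) using Bayes rule.

    prevalence  = Pr(A), the marginal probability of having the condition
    sensitivity = Pr(P | A), chance the test is positive when A is true
    specificity = Pr(not P | not A), chance the test is negative when A is false
    (Hypothetical helper for illustration; names are not from the slides.)
    """
    true_positive = prevalence * sensitivity                # Pr(A and P)
    false_positive = (1 - prevalence) * (1 - specificity)   # Pr(not A and P)
    return true_positive / (true_positive + false_positive) # Pr(A and P) / Pr(P)


# Numbers quoted in the EIA example
prevalence = 900_000 / 280_000_000
eia_posterior = posterior_given_positive(prevalence, 0.9985, 0.994)
print(round(eia_posterior, 4))
```

Even with a very accurate test, the answer comes out well under one half because the condition is rare, so false positives from the large unaffected group outnumber true positives.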
EIA example
• A convenient way to find Pr(A|P)
  • Make a table of a hypothetical population, and fill in the cells of the table using the given information
• Assume 10,000 people in the population
  • Number with AIDS: (900,000/280,000,000) × 10,000 = 32.143
  • Number without AIDS: 10,000 - 32.143 = 9,967.857

Hypothetical table for EIA

                  AIDS        Not AIDS
  Positive
  Not positive

EIA test: Final table

                  AIDS        Not AIDS      Total
  Positive
  Not positive
  Total           32.143      9,967.857     10,000

EIA test: Final table

                  AIDS        Not AIDS      Total
  Positive        32.095
  Not positive     0.048
  Total           32.143      9,967.857     10,000

EIA test: Final table

                  AIDS        Not AIDS      Total
  Positive        32.095         59.807        91.902
  Not positive     0.048      9,908.050     9,908.098
  Total           32.143      9,967.857    10,000

EIA test: Final answer
• Pr(A and P) = 32.095/10,000
• Pr(P) = 91.902/10,000
• Pr(A|P) = Pr(A and P)/Pr(P) = 32.095/91.902 = 0.3492
• There is a 34.92% chance the person has AIDS given he tested positive on the EIA

Sensitivity to initial marginal probability
• What if 1% of people have AIDS?
  • Pr(A|P) = 0.627
• What if 10% of people have AIDS?
  • Pr(A|P) = 0.949
• The probability is very sensitive to the incidence rate of AIDS

Binomial distribution
• Consider a sequence of n trials, each with a binary outcome (e.g. success/fail)
• Let p be the probability of success
• What is the probability of getting x successes out of n trials?
• This is a binomial probability if
  • p remains constant for all trials
  • n (the number of trials) is fixed
• Pr(x) = choose(n,x) p^x (1-p)^(n-x)
  • choose(n,x) = n!/(x!(n-x)!)

Binomial distribution
• Roll a standard die ten times and count the number of sixes
• Consider the outcome of getting a six as a success, and a failure otherwise
• The number of trials is fixed at 10
• The probability of success (1/6) is the same for all trials
• Let X denote the number of sixes
  • Pr(X=1) = choose(10,1)(1/6)^1 (5/6)^(10-1) = 0.32
  • Pr(X=5) = choose(10,5)(1/6)^5 (5/6)^(10-5) = 0.013
• The binomial distribution is useful for computing probabilities under the "sampling with replacement" scheme
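The dice calculation can be reproduced with a few lines of Python. This is a minimal sketch of the binomial formula Pr(x) = choose(n,x) p^x (1-p)^(n-x); the function name `binomial_pmf` is chosen here for illustration, and `math.comb` from the standard library plays the role of choose(n,x).

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success chance p.

    Assumes a fixed number of trials n and the same p on every trial.
    (Illustrative helper; the name is not from the slides.)
    """
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 1 / 6  # ten rolls of a die, "success" = rolling a six

pr_one_six = binomial_pmf(1, n, p)
pr_five_sixes = binomial_pmf(5, n, p)
print(round(pr_one_six, 3))     # 0.323
print(round(pr_five_sixes, 3))  # 0.013

# The probabilities over x = 0, 1, ..., n should add up to 1
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```

Summing the formula over every possible count, as in the last line, is a quick sanity check that the exponents and the choose term have been entered correctly.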