Fundamentals of Discrete Probability
Chapter 2
Summer 2003, IS 601, Ö. S. Benli

This chapter introduces
• the fundamental tools to model an uncertain event and calculate its probability,
• the concepts of an outcome, an event, and the probability of an event,
• the laws of probability and use of these laws to compute probabilities,
• the use of probability tables to facilitate the calculation of probabilities,
• the concept of random variable and the probability distribution of a random variable,
• the concepts of mean, variance, and standard deviation, as summary measures of the probability distribution,
• the joint probability distribution of a collection of random variables, and
• the summary measures of covariance and correlation, which measure the interdependence between two random variables.

Outcomes, probabilities, and events
• Outcomes of an experiment are the events that might possibly happen.
• Outcomes are
  – Mutually exclusive: no two outcomes can occur together
  – Collectively exhaustive: at least one must occur
• The probability of an outcome is the likelihood that the outcome will occur when the uncertainty is resolved.
• An event is a collection of outcomes.

Use of probability to model
• situations where we simply lack information, or
• a naturally random process.

Are probabilities frequencies?
• P(coin toss yields heads) = ½
• P(The Iliad was written by Homer) = 0.95
• P(a piece of equipment aboard the space shuttle fails) = 0.00000001

A patient is admitted to the hospital and a potentially life-saving drug is administered. The following dialog takes place between the nurse and a concerned relative.

RELATIVE: Nurse, what is the probability that the drug will work?
NURSE: I hope it works, we’ll know tomorrow.
RELATIVE: Yes, but what is the probability that it will?
NURSE: Each case is different, we have to wait.
RELATIVE: But let’s see, out of a hundred patients that are treated under similar conditions, how many times would you expect it to work?
NURSE (somewhat annoyed): I told you, every person is different, for some it works, for some it doesn’t.
RELATIVE (insisting): Then tell me, if you had to bet whether it will work or not, which side of the bet would you take?
NURSE (cheering up for a moment): I’d bet it will work.
RELATIVE (somewhat relieved): OK, now, would you be willing to lose two dollars if it doesn’t work, and gain one dollar if it does?
NURSE (exasperated): What a sick thought! You are wasting my time!
(from Bertsekas)

The goal is to develop methods that will lead to consistent and systematic ways to measure randomness by calculating probabilities of uncertain events based on sound intuition.

[Figure: the 52 cards of a standard deck: ranks 2 through 10, J, Q, K, A in each of the suits ♦, ♣, ♥, ♠. The same layout accompanies each event definition below.]

U: “Universe”; entire sample space
A: all “nine”s; B: face cards
A: diamonds; B: face cards
A: “red”; B: “black”
A: King of Diamonds; B: all others

The first law of probability
The probability of any event is a number between zero and one. A larger probability corresponds to the intuitive notion of greater likelihood. An event whose associated probability is 1.0 is virtually certain to occur; an event whose associated probability is 0.0 is virtually certain not to occur.

P(U: “Universe”) = 1

The second law of probability
If A and B are mutually exclusive events, then
P(A or B) = P(A) + P(B)

P([A: “nine”s] OR [B: “face”s]) = P(A) + P(B) = 4/52 + 12/52 = 16/52
P(A: “red”) + P(B: “black”) = 1
P(A: King of Diamonds) + P(B: all others) = 1
P([A: diamonds] OR [B: “face”s])

Addition Rule: If A and B are not mutually exclusive events, then
P(A or B) = P(A) + P(B) - P(A and B)

P(Diamonds OR Faces) = P(Diamonds) + P(Faces) - P(Diamonds AND Faces) = 13/52 + 12/52 - 3/52 = 22/52

The third law of probability
If A and B are two events, then the conditional probability of A given B is
P(A|B) = P(A and B) / P(B).
Equivalently, P(A and B) = P(A|B) × P(B).

P(K|Diamond) = P(K & Diamond) / P(Diamond) = (1/52) / (13/52) = 1/13

Distribution of students
            American   International   Total
Men            25           15            40
Women          45           15            60
Total          70           30           100

Probability Tables
            American   International   Total
Men           .25          .15           .40
Women         .45          .15           .60
Total         .70          .30          1.00

Probability Tables
            American   International   Total
Men          P(A&M)       P(I&M)        P(M)
Women        P(A&W)       P(I&W)        P(W)
Total         P(A)         P(I)         P(U)

Distribution of 100 People
            Heavy Wt.   Medium Wt.   Total
Short          20           20         40
Tall           50           10         60
Total          70           30        100

Probability Tables (Dependent Events)
            Heavy Wt.   Medium Wt.   Total
Short         .20          .20        .40
Tall          .50          .10        .60
Total         .70          .30       1.00

Probability Tables (Dependent Events)
            Heavy Wt.   Medium Wt.   Total
Short        P(S&H)       P(S&M)      P(S)
Tall         P(T&H)       P(T&M)      P(T)
Total         P(H)         P(M)       1.0

The fourth law of probability
If A and B are independent events, then P(A|B) = P(A).
Its implication: If A and B are independent events, then P(A and B) = P(A) × P(B).

A Distribution of 100 Students
            High GPA   Low GPA   Total
Men            28         12       40
Women          42         18       60
Total          70         30      100

Probability Tables (Independent Events)
            High GPA   Low GPA   Total
Men           .28        .12      .40
Women         .42        .18      .60
Total         .70        .30     1.00

Probability Tables (Independent Events)
            High GPA   Low GPA   Total
Men          P(M&Hi)    P(M&Lo)   P(M)
Women        P(W&Hi)    P(W&Lo)   P(W)
Total         P(Hi)      P(Lo)    1.0

Table for Jane's Problem
                        Market is Strong (S)   Market is Weak (W)   Total
Test is Positive (+)          P(S&+)                 P(W&+)          P(+)
Test is Negative (-)          P(S&-)                 P(W&-)          P(-)
Total                          P(S)                   P(W)           P(U)

Known probabilities are
• P(S) = 0.30, ∴ P(W) = 0.70
• P(+|W) = 0.10, ∴ P(-|W) = 0.90
• P(-|S) = 0.20, ∴ P(+|S) = 0.80

The table is filled in step by step, starting from the column totals P(S) = 0.30, P(W) = 0.70, and P(U) = 1.00.

From the 3rd Law,
• P(+&W) = P(+|W) × P(W) = 0.07
• But P(+&W) + P(-&W) = P(W), i.e. 0.07 + P(-&W) = 0.70, ∴ P(-&W) = 0.63
• Equivalently, P(-&W) = P(-|W) × P(W) = 0.63
• P(+&S) = P(+|S) × P(S) = 0.24
• P(-&S) = P(-|S) × P(S) = 0.06

Summing each row gives P(+) = 0.24 + 0.07 = 0.31 and P(-) = 0.06 + 0.63 = 0.69.

Table for Jane's Problem
                        Market is Strong   Market is Weak   Total
Test is Positive              0.24              0.07         0.31
Test is Negative              0.06              0.63         0.69
Total                         0.30              0.70         1.00

A Posteriori Probabilities
Already computed: P(+) = 0.31 and P(-) = 0.69. From the 3rd Law,
• P(S|+) = P(S&+) / P(+) = .24/.31 = .774
• P(W|+) = P(W&+) / P(+) = .07/.31 = .226
• P(S|-) = P(S&-) / P(-) = .06/.69 = .087
• P(W|-) = P(W&-) / P(-) = .63/.69 = .913

Approach for performing probability calculations
1. Clearly and unambiguously define the various events that characterize the uncertainties in the problem.
2. Formalize how these events interact
   • Conditional probability: “A|B”
   • Conjunctions: “A and B”
   • Disjunctions: “A or B”
3. When appropriate, organize all information regarding the probabilities of the various events in a table.
4. Using the laws of probability, calculate the probability of events that characterize the uncertainties in question.
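The four-step approach can be sketched in code using the numbers from Jane's problem. This is a minimal illustration, not part of the original slides: the function names `build_joint_table` and `posterior` and the dictionary layout are assumptions chosen for this example. It applies the third law to fill in the joint table and then to compute the a posteriori probabilities.

```python
def build_joint_table(p_s, p_pos_given_w, p_neg_given_s):
    """Fill in the joint probability table with the third law.

    Keys are (test, market) pairs, e.g. ("+", "S") for P(+ & S).
    (Hypothetical helper, named for this sketch only.)
    """
    p_w = 1.0 - p_s                        # markets are exhaustive
    p_pos_given_s = 1.0 - p_neg_given_s    # complement of P(-|S)
    p_neg_given_w = 1.0 - p_pos_given_w    # complement of P(+|W)
    return {
        ("+", "S"): p_pos_given_s * p_s,   # P(+&S) = P(+|S) * P(S)
        ("+", "W"): p_pos_given_w * p_w,   # P(+&W) = P(+|W) * P(W)
        ("-", "S"): p_neg_given_s * p_s,   # P(-&S) = P(-|S) * P(S)
        ("-", "W"): p_neg_given_w * p_w,   # P(-&W) = P(-|W) * P(W)
    }


def posterior(joint, market, test):
    """P(market | test) = P(market & test) / P(test)."""
    # Row total: sum the joint probabilities that share this test result.
    p_test = sum(p for (t, _), p in joint.items() if t == test)
    return joint[(test, market)] / p_test


joint = build_joint_table(p_s=0.30, p_pos_given_w=0.10, p_neg_given_s=0.20)

for test in "+-":
    for market in "SW":
        print(f"P({market}|{test}) = {posterior(joint, market, test):.3f}")
```

Rounded to three decimals, the printed values agree with the a posteriori probabilities listed above (.774, .226, .087, .913).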
Random variable
• The uncertain quantity that is the numerical outcome in a probability model.
  – Discrete: assumes values that are distinct and separate
  – Continuous: can take on any value within some interval of numbers

A RANDOM VARIABLE is a function that assigns a real number to each element of the sample space.

Random experiment: selecting one student at random from the student body.
Random variables: the student’s
  – height,
  – weight,
  – family income,
  – SAT score,
  – GPA
These are numerical variables that describe the properties of a randomly selected student.

NOTATION: The “variable” is written with a capital “X”. The lowercase “x” represents a single observed value of X. For example, x = 2 if heads comes up twice.

This table is called the PROBABILITY DISTRIBUTION of random variable X. The PROBABILITY FUNCTION is the rule that assigns a fraction to each distinct value of a random variable.
[Figure: probability distribution of X, shown as a table and as a histogram]

Experiment: toss of two dice
Y = sum of the dots on the two dice. Y takes values in {2, 3, … , 12}.
[Figure: probability distribution of Y, shown as a table and as a histogram]

Discrete probability distributions
• Binomial Distribution
  – X is a binomial random variable: the number of successes in a sample of n independent trials, each with probability of success p.

X = # of heads in the toss of 4 coins, P(Head) = p = 0.3

Event                                   X    P(X)
TTTT                                    0    1 × p^0 × (1-p)^4 = 0.2401
HTTT, THTT, TTHT, TTTH                  1    4 × p^1 × (1-p)^3 = 0.4116
HHTT, HTHT, HTTH, THTH, THHT, TTHH      2    6 × p^2 × (1-p)^2 = 0.2646
HHHT, HHTH, HTHH, THHH                  3    4 × p^3 × (1-p)^1 = 0.0756
HHHH                                    4    1 × p^4 × (1-p)^0 = 0.0081

Summary Measures of Probability Distributions
• Mean or expected value
• Variance
• Standard deviation

Expected value of an “uncertain event” (a “random variable”): weighted average of all possible numerical outcomes, with probabilities of each of the possible outcomes used as the weights.

• Linear functions of a random variable
• Covariance
• Correlation
• Joint probability distributions
• Independence of random variables
• Sums of two random variables
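As an illustration of the binomial table and the expected value definition above, the following sketch rebuilds the distribution for 4 coins with P(Head) = 0.3 and computes its summary measures. The helper names (`binomial_pmf`, `mean`, `variance`) are assumptions made for this example. The slides define only the expected value; the variance here uses the standard definition (probability-weighted average of squared deviations from the mean), which is not spelled out in the slides. `math.comb` requires Python 3.8 or later.

```python
from math import comb, sqrt


def binomial_pmf(n, p):
    """Return {k: P(X = k)} for a binomial random variable X."""
    # comb(n, k) counts the outcomes with exactly k successes,
    # e.g. 6 ways to get 2 heads in 4 tosses.
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def mean(dist):
    """Expected value: outcomes weighted by their probabilities."""
    return sum(x * px for x, px in dist.items())


def variance(dist):
    """Probability-weighted average squared deviation from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * px for x, px in dist.items())


dist = binomial_pmf(n=4, p=0.3)

for k, pk in dist.items():
    print(f"P(X = {k}) = {pk:.4f}")

print(f"mean = {mean(dist):.4f}")
print(f"variance = {variance(dist):.4f}")
print(f"standard deviation = {sqrt(variance(dist)):.4f}")
```

The five probabilities printed should match the P(X) column of the coin table, and they sum to 1 as the first law requires for the whole sample space.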