Lecture 1. Basics of probability theory
Mathematical Statistics and Discrete Mathematics
November 2nd, 2015

Sample space

A sample space S is the set of all possible outcomes ω of an experiment.
• Toss a coin: S = {heads, tails}.
• Toss a die: S = {1, 2, 3, 4, 5, 6}.
• Pick a card from a deck: S = {all possible cards}.
• Lottery, choose six numbers from 1 to 49: S = {all six-element subsets of {1, 2, . . . , 49}}.

Events

An event A ⊂ S is any subset of the sample space, that is, any collection of possible outcomes.
• The full set S is called the certain event.
• The empty set ∅ is called the impossible event.
• An event containing exactly one outcome is called an elementary event.

Pick a card. We can have
• A1 = {hearts},
• A2 = {even numbers},
• A3 = {queen of hearts}, an elementary event.

Usually, an event will take the general form A = {all outcomes (not) satisfying a specific condition}.

Event operations (Boolean operations)

The intersection of events A and B is
A ∩ B = {all outcomes that are both in A and B} = {ω : ω ∈ A and ω ∈ B}.
If A ∩ B = ∅, then we say that A and B are disjoint.

Figure: events with a non-empty intersection (left) and disjoint events (right).

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then A ∩ B = {2, 4}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then A ∩ B = ∅.
• Let A = [0, 7), B = [5, 10). Then A ∩ B = [5, 7).

The union of events A and B is
A ∪ B = {all outcomes that are in A or B} = {ω : ω ∈ A or ω ∈ B}.

[Figure: the union A ∪ B inside S.]

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then A ∪ B = {1, 2, 3, 4, 5, 6}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then A ∪ B = {1, 2, 3, 4, 5, 6}.
• Let A = [0, 7), B = [5, 10). Then A ∪ B = [0, 10).

The difference between events A and B is
A \ B = {all outcomes that are in A but not in B} = {ω : ω ∈ A and ω ∉ B}.

[Figure: the difference A \ B inside S.]

• Let A = {1, 2, 3, 4}, B = {2, 4, 5, 6}. Then A \ B = {1, 3}.
• Let A = {1, 2, 3}, B = {4, 5, 6}. Then A \ B = A = {1, 2, 3}.
• Let A = [0, 7), B = [5, 10). Then A \ B = [0, 5) and B \ A = [7, 10).

The complement of event A is
A^c = S \ A = {all outcomes that are not in A} = {ω : ω ∉ A}.

[Figure: the complement A^c inside S.]

Note that to know the complement, we have to know the sample space S.
• Let A = {1, 2, 3, 4}, S = {1, 2, 3, 4, 5, 6}. Then A^c = {5, 6}.
• Let A = [0, 7), S = [0, 10). Then A^c = [7, 10).

Classical definition of probability

We say that an event A occurs if ω ∈ A. By P(A) we denote the probability that A occurs. By P(ω) we denote the probability that the elementary event {ω} occurs, or in other words, that the outcome of the experiment is equal to ω.

Main motivating question: How to compute probabilities of events?

We can agree that
• the certain event occurs with probability 1, and we write P(S) = 1,
• the impossible event occurs with probability 0, and we write P(∅) = 0.
But what about other events?

The classical probability is defined by
$$P(A) = \frac{|A|}{|S|} = \frac{\text{number of outcomes in } A}{\text{number of all outcomes}}.$$
In particular, for any outcome ω, P(ω) = 1/|S|. This is also called the uniform probability, since the probability does not depend on the particular outcome ω.

Pick a random card from a deck. Let A1 = {hearts}, A2 = {even numbers}, A3 = {queen of hearts}. We have |A1| = 13, |A2| = 20, |A3| = 1, and |S| = 52. Then
$$P(A_1) = \frac{13}{52} = \frac{1}{4}, \qquad P(A_2) = \frac{20}{52} = \frac{5}{13}, \qquad P(A_3) = \frac{1}{52}.$$

Counting configurations

Suppose we run n experiments and we record all the outcomes.
• If we care about the order of the experiments, we record the combined outcome as an ordered sequence of n outcomes.
• If we do not care about the order of the experiments, we record the combined outcome as an unordered collection of n outcomes.
Depending on the type of the experiments, we may or may not get repeated outcomes. Combined outcomes are called configurations.
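As a rough illustration of the four ways of recording combined outcomes (ordered or unordered, with or without repeated outcomes), the following Python sketch enumerates them with the standard `itertools` module. The function name `configurations` and its parameters are hypothetical, chosen only for this example; two tosses of a coin serve as the experiment.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)


def configurations(outcomes, n, ordered, repetitions):
    """Enumerate all configurations of n outcomes drawn from `outcomes`.

    `configurations`, `ordered` and `repetitions` are illustrative names,
    not notation from the lecture.
    """
    if ordered and repetitions:
        return list(product(outcomes, repeat=n))
    if ordered:
        return list(permutations(outcomes, n))
    if repetitions:
        return list(combinations_with_replacement(outcomes, n))
    return list(combinations(outcomes, n))


# Two coin tosses, recorded in each of the four possible ways.
coin = ["H", "T"]
for ordered in (True, False):
    for repetitions in (True, False):
        configs = configurations(coin, 2, ordered, repetitions)
        print(f"ordered={ordered}, repetitions={repetitions}: "
              f"{len(configs)} configurations {configs}")
```

Unordered configurations are returned here as tuples in a fixed order, which is just one convenient way to represent a collection whose order does not matter.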
Choose two letters out of a, b, c. The combined sample space S of all configurations is
• S = {aa, ab, ac, ba, bb, bc, ca, cb, cc}, |S| = 9, if we care about the order and allow repetitions,
• S = {ab, ac, ba, bc, ca, cb}, |S| = 6, if we care about the order and don't allow repetitions,
• S = {{aa}, {ab}, {ac}, {bb}, {bc}, {cc}}, |S| = 6, if we don't care about the order and allow repetitions,
• S = {{ab}, {ac}, {bc}}, |S| = 3, if we don't care about the order and don't allow repetitions.

Configurations
• where the order matters and there are no repetitions are called permutations,
• where the order does not matter and there are no repetitions are called combinations.

What is the size of the combined sample space? The multiplication rule says that if we run k experiments and the sample space of the i-th experiment contains n_i outcomes, then the combined sample space contains
$$\prod_{i=1}^{k} n_i = n_1 \cdot n_2 \cdots n_k$$
configurations.

Recall that n factorial is
$$n! = 1 \cdot 2 \cdot 3 \cdots n = n \cdot (n-1) \cdot (n-2) \cdots 1.$$

We have an urn with n balls of different colours. We choose one ball in each of a total of k turns.
• If n ≥ k and we do not put the ball back into the urn, then the first time we have n balls to choose from, the second time we have n − 1 balls, and so on, until the k-th turn, when we have n − k + 1 balls to choose from (and n − k balls remain afterwards). Hence, the number of all possible configurations is
$$n \cdot (n-1) \cdots (n-k+1) = \frac{n!}{(n-k)!}.$$
If n = k, we get n! configurations.
• If we put the ball back into the urn, then each time we have n balls to choose from. Hence, the number of all possible configurations is
$$\underbrace{n \cdot n \cdots n}_{k \text{ times}} = n^k.$$

We have an urn with n balls of different colours. We choose one ball in each of a total of k turns.
• If k ≤ n, we do not put the ball back into the urn, and we do not care about the order, the number of all possible configurations is
$$\binom{n}{k} := \frac{n!}{k!\,(n-k)!}$$
(read this notation as "n choose k").
• The above number is just the number of all k-element subsets of an n-element set. Hence, the probability of winning the lottery is
$$1 \Big/ \binom{49}{6} \approx \frac{1}{14 \cdot 10^{6}}.$$
• If we put the ball back into the urn, and we do not care about the order, the number of all possible configurations is
$$\binom{n+k-1}{k}.$$
Exercise for the willing: prove the last formula.

Axioms of general probability

Toss a coin. In classical probability, P(heads) = 1/2. But what about asymmetric coins?

Axioms of a probability (measure): A probability (measure) P is an assignment of numbers to events such that
(i) P(A) ≥ 0 for all events A (probabilities are non-negative),
(ii) P(S) = 1 (the probability of the certain event is 1),
(iii) P(A ∪ B) = P(A) + P(B) for disjoint events A and B (probabilities add up on disjoint events).

• Classical probability satisfies all the axioms of a probability measure.
• For a toss of a coin, let P(heads) = 2/3, P(tails) = 1/3, P(∅) = 0, and P({heads, tails}) = 1. Then P satisfies the axioms above.

Important consequences of the axioms:
(a) 0 ≤ P(A) ≤ 1 for any event A.
(b) P(∅) = 0.
(c) P(A) = 1 − P(A^c) for any event A.
(d) P(A) ≤ P(B) if A ⊂ B.
(e) If A1, A2, . . . , An are pairwise disjoint, that is, Ai ∩ Aj = ∅ for i ≠ j, then
$$P(A_1 \cup A_2 \cup \dots \cup A_n) = P(A_1) + P(A_2) + \dots + P(A_n) = \sum_{i=1}^{n} P(A_i).$$
(f) P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for all events A and B.
(g) P(A ∪ B) ≤ P(A) + P(B) for all events A and B.

Proof of:
(a) The lower bound P(A) ≥ 0 is axiom (i). Taking B = A^c in axiom (iii) and using axiom (ii) gives P(A) + P(A^c) = P(S) = 1. The upper bound P(A) ≤ 1 follows from this equality since, by axiom (i), P(A^c) ≥ 0.
(b) By axiom (iii), P(∅) = P(∅) + P(∅), since the empty event is disjoint from itself. Hence P(∅) = 0.
(c) It follows from axioms (ii) and (iii) by taking B = A^c.
(d) By axioms (iii) and (i), and since A ⊂ B, P(B) = P(A) + P(B \ A) ≥ P(A).
(e) Apply axiom (iii) repeatedly (n − 1 times).
(f) By (e), P(A ∪ B) = P(A \ B) + P(B \ A) + P(A ∩ B), and P(A \ B) = P(A) − P(A ∩ B) and P(B \ A) = P(B) − P(A ∩ B).
(g) It follows from (f) and the fact that P(A ∩ B) ≥ 0.

Conditional probability

Main motivating question: Suppose that we know that an event occurs. What information does that give us about other events?

Suppose B has positive probability of occurrence, that is, P(B) > 0. For any event A, the probability of A conditioned on B (or the probability of A given B) is
$$P(A|B) := \frac{P(A \cap B)}{P(B)}.$$

Throw a die. Let A = {1, 2, 3}. Then P(A) = 1/2 in classical probability, and
• P(A|{even}) = P(A ∩ {even})/P({even}) = (1/6)/(1/2) = 1/3. Since P(A|{even}) < P(A), the occurrence of the event {even} has a negative influence on the probability of occurrence of A.
• P(A|{odd}) = P(A ∩ {odd})/P({odd}) = (2/6)/(1/2) = 2/3. Since P(A|{odd}) > P(A), the occurrence of the event {odd} has a positive influence on the probability of occurrence of A.

P(A|B) and P(B|A) have different interpretations and usually take different values.

Independent events

Two events A and B are called independent if and only if
P(A ∩ B) = P(A)P(B).
If A and B are independent and P(B) > 0, then
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A).$$
Hence, the occurrence of B does not influence the (probability of) occurrence of A.
• ∅ and S are always independent of all other events.
• Throw a die twice and call the outcomes ω1 and ω2. The events A1 = {ω1 = 1} and A2 = {ω2 = 6} are independent.
• The events A = {it is raining somewhere in New Zealand} and B = {it is raining in Gothenburg} are independent.

Total probability formula

Let B be any event, and let A1, A2, . . . , An be pairwise disjoint events, that is, Ai ∩ Aj = ∅ for i ≠ j, such that A1 ∪ A2 ∪ . . . ∪ An = S.

[Figure: the sample space S partitioned into A1, A2, . . . , An, with the event B overlapping the parts.]

Then
$$P(B) = \sum_{i=1}^{n} P(B|A_i)P(A_i).$$

Proof.
$$\begin{aligned}
P(B) &= P(B \cap S) \\
&= P(B \cap (A_1 \cup A_2 \cup \dots \cup A_n)) \\
&= P((B \cap A_1) \cup (B \cap A_2) \cup \dots \cup (B \cap A_n)) \\
&= \sum_{i=1}^{n} P(B \cap A_i) \\
&= \sum_{i=1}^{n} P(B|A_i)P(A_i).
\end{aligned}$$

Bayes' theorem

With B such that P(B) > 0, and A1, A2, . . . , An as before, we have for any i ∈ {1, 2, . . . , n},
$$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}.$$

The probability of having a certain disease is 10%. Patients with the disease test positive with probability 95%. Patients without the disease test positive with probability 2%. Given a positive test result, what is the probability that the patient has the disease?

Let
A1 = {patient has the disease},
A2 = A1^c = {patient does not have the disease},
B = {patient tests positive}.
We have
$$P(A_1|B) = \frac{P(B|A_1)P(A_1)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2)} = \frac{0.95 \cdot 0.10}{0.95 \cdot 0.10 + 0.02\,(1 - 0.10)} \approx 0.84.$$
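The disease example can be checked numerically. The sketch below is one possible way to do so, assuming the partition is given as two parallel lists of priors P(A_i) and likelihoods P(B|A_i); the helper names `total_probability` and `posterior` are hypothetical and not part of the lecture. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(B) = sum over i of P(B|A_i) * P(A_i) for a partition A_1, ..., A_n."""
    return sum(l * p for l, p in zip(likelihoods, priors))


def posterior(priors, likelihoods, i):
    """P(A_i | B) by Bayes' theorem; i is a 0-based index into the partition."""
    evidence = total_probability(priors, likelihoods)
    if evidence == 0:
        raise ValueError("P(B) must be positive")
    return likelihoods[i] * priors[i] / evidence


# A_1 = patient has the disease, A_2 = patient does not; B = positive test.
priors = [Fraction(10, 100), Fraction(90, 100)]
likelihoods = [Fraction(95, 100), Fraction(2, 100)]

p = posterior(priors, likelihoods, 0)
print(p, float(p))  # 95/113, roughly 0.84
```

Although the test is fairly accurate, the posterior stays well below 95% because most patients do not have the disease, so false positives make up a noticeable share of all positive results.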