Lecture 16, Probability

An experiment is a procedure that yields an outcome. The sample space is the set of all possible outcomes. The sample space is usually denoted by $\Omega$ and an outcome by $x$. An event $A$ is a subset of the sample space. To each outcome $x$ we assign a number $p(x) \in [0, 1]$. In other words, $p$ is a function from the sample space to the interval $[0, 1]$, and the probability of an event is

\[ P(A) = \sum_{x \in A} p(x), \qquad A \subset \Omega. \]

If $p$ satisfies
(1) $p(x) \ge 0$, $\forall x \in \Omega$
(2) If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$
(3) $P(\Omega) = 1$,
then $p$ is called a probability distribution. If $|\Omega| = n$ and $p(x) = 1/n$ for each outcome $x \in \Omega$, then $p$ is called the uniform distribution.

Properties of the probability function:
(1) $P(A^c) = 1 - P(A)$
(2) For any events $A$ and $B$, not necessarily disjoint, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

The last result can be extended as follows:

Theorem 1.
\[ P\left(\cup_{i=1}^{n} A_i\right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) \]

Proof. First, we take $n = 3$ to get a concrete expression of the theorem. In this case, it states

\[ P(A_1 \cup A_2 \cup A_3) = (-1)^{1+1} [P(A_1) + P(A_2) + P(A_3)] + (-1)^{2+1} [P(A_1 \cap A_2) + P(A_1 \cap A_3) + P(A_2 \cap A_3)] + (-1)^{3+1} P(A_1 \cap A_2 \cap A_3). \]

Now we prove the theorem. We consider the case where $\Omega$ has finitely many elements. In this case, we assume that $\cup_{i=1}^{n} A_i = \{x_1, x_2, \dots, x_m\}$. In other words, the union has $m$ elements, denoted by $x_1, x_2, \dots, x_m$. Then we have

\[ P\left(\cup_{i=1}^{n} A_i\right) = p(x_1) + p(x_2) + \cdots + p(x_m). \]

We see that each $x_j$ contributes $p(x_j)$ to the left-hand side of the theorem. We now examine what $x_j$ contributes to the right-hand side. Consider a nonempty subset $H$ of $\{1, 2, 3, \dots, n\}$ such that $x_j \in A_\ell$ for every $\ell \in H$. Using the integers in $H$ as indices, we can form a set of the form $A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}$, and $x_j$ contributes $p(x_j)$ to its probability. Notice that if $H$ has an even number of elements, the probability of the set generated by $H$ is multiplied by $-1$; otherwise it is multiplied by $1$.
This means that, among all the intersections on the right-hand side that contain $x_j$, $x_j$ contributes $-p(x_j)$ when the intersection is formed from an even number of events and $p(x_j)$ when it is formed from an odd number of events. Let $r \ge 1$ be the number of events among $A_1, \dots, A_n$ that contain $x_j$; the intersections containing $x_j$ correspond exactly to the nonempty subsets of these $r$ indices. From the binomial theorem, we know that a set of $r$ elements has as many subsets with an even number of elements as subsets with an odd number of elements:

\[ 0 = (1 - 1)^r = \sum_{k=0}^{r} \binom{r}{k} (-1)^k \;\Rightarrow\; \binom{r}{0} + \binom{r}{2} + \cdots = \binom{r}{1} + \binom{r}{3} + \cdots \]

Note that the empty set has 0 (an even number of) elements, but it does not correspond to any term on the right-hand side, so $x_j$ contributes nothing through it. That means, on the right-hand side of the theorem, among all the intersections containing $x_j$, the number formed from an odd number of events is exactly 1 more than the number formed from an even number of events. Therefore, the net contribution of $x_j$ to the right-hand side is $p(x_j)$. Applying this argument to each element $x_j$, $j = 1, \dots, m$, proves the theorem.
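
As a numerical sanity check of Theorem 1, the following minimal Python sketch evaluates both sides on a small sample space. The die example, the events A1, A2, A3, and the helper names prob, union_by_inclusion_exclusion, and net_coefficient are illustrative assumptions, not part of the lecture. The last helper mirrors the counting step of the proof: an outcome that lies in r of the events should have net coefficient 1 on the right-hand side.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def prob(event, p):
    """P(A): sum of p(x) over the outcomes x in the event A."""
    return sum((p[x] for x in event), Fraction(0))


def union_by_inclusion_exclusion(events, p):
    """Right-hand side of Theorem 1 for a list of events (Python sets)."""
    n = len(events)
    total = Fraction(0)
    for k in range(1, n + 1):
        sign = (-1) ** (k + 1)
        # Every choice of k distinct events gives one k-fold intersection.
        for chosen in combinations(events, k):
            total += sign * prob(set.intersection(*chosen), p)
    return total


def net_coefficient(r):
    """Net signed count of nonempty index subsets for an outcome in r events."""
    return sum((-1) ** (k + 1) * comb(r, k) for k in range(1, r + 1))


# Hypothetical example: one roll of a fair die (uniform distribution).
omega = range(1, 7)
p = {x: Fraction(1, 6) for x in omega}
A1 = {2, 4, 6}  # the roll is even
A2 = {4, 5, 6}  # the roll is greater than 3
A3 = {1, 2}     # the roll is at most 2
events = [A1, A2, A3]

lhs = prob(set.union(*events), p)            # P(A1 ∪ A2 ∪ A3)
rhs = union_by_inclusion_exclusion(events, p)
print(lhs, rhs)
```

Exact fractions are used instead of floating-point numbers so that the two sides can be compared with ==. Here both sides equal 5/6, since the union is {1, 2, 4, 5, 6}.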