MTH 202 : Probability and Statistics
Lecture 1 & 2
6, 7 January, 2013

Probability

1.1 : Introduction

In probability theory we deal with random experiments and analyze their outcomes. The word "random" means that:
1. the particular outcome of an experiment is unknown in advance,
2. all possible outcomes of the experiment are known in advance,
3. the experiment can be repeated under identical conditions.

Examples 1.1.1 :

A. Car-Goat Problem : This famous problem was originally posed in 1975 in the American Statistician by Steve Selvin. (It is also known as the Monty Hall Problem, after the American television game show "Let's Make a Deal" and its original host, Monty Hall.)

Figure 1. Monty Hall Problem

The problem is stated as follows. Suppose you are in a game show and are asked to choose one of three given doors; behind one of the doors is a brand new shiny car of your dreams. The other two each hide a goat. You pick a door, say no. 1, and the host, who knows what is behind the doors, opens another door, say no. 3, which has a goat. He then says to you, "Do you want to pick door no. 2?" Is it to your advantage to switch your choice?

Let us model the problem. Each outcome of the experiment can be described by a quadruple (x, y, z, w), where x is the number of the door you choose, y is the number of the door the host opens, z is the number of the door you switch to, and w is one of W or L, depending upon whether you win or lose. Assuming that door no. 1 hides the car, the sample space Ω would look like:

Ω = {(1, 2, 3, L), (1, 3, 2, L), (2, 3, 1, W), (3, 2, 1, W)}

The probability of first choosing door no. 1 is 1/3. It may be expressed via the event A = {(1, 2, 3, L), (1, 3, 2, L)}, which represents choosing door no. 1, with P A = 1/3. But if you choose door no. 1 (and then switch), you are going to lose.
However, if you first choose door no. 2 or door no. 3 (and then switch), you are going to win; each of these choices has probability 1/3, and together they give a winning probability of 2/3.

Now consider the other scenario, where you stick to your choice. Assuming again that door no. 1 hides the car, the sample space Ω would look like:

Ω = {(1, 2, 1, W), (1, 3, 1, W), (2, 3, 2, L), (3, 2, 3, L)}

In this case, with A being the event of choosing door no. 1, we have P A = 1/3, which is now the probability of winning; this is half of what it is if you decide to switch.

B. Tossing coin(s) : Let us toss a coin; the sample space is Ω = {H, T}. There is a 50% chance of either winning or losing the toss, so you would assign P(winning) = 1/2 and P(losing) = 1/2. Now suppose you toss a coin three times, hoping to get heads at least twice. Let us model the space Ω first:

Ω = {(H, H, H), (H, H, T), (H, T, H), (H, T, T), (T, H, H), (T, H, T), (T, T, H), (T, T, T)}

Out of the eight possibilities, there are four outcomes with at least two heads. So you would say there is a 50% chance of getting at least two heads; mathematically, P(at least two heads) = 1/2. Similarly, you can find that P(at least two tails) = 1/2, and these two add up to 1 (Why?).

Figure 2. Tossing a coin

C. Waiting for the DBCity Bus : (Continuous sample space) Suppose Brahadeesh wishes to catch one of the buses to DBCity Mall, which start at 2:15 P.M. and 3:15 P.M. from the stop near the students' hostel. He, however, decides to arrive at the bus stop at a random moment between 2 P.M. and 3:30 P.M., and he intends to calculate the probability that he will have to wait more than ten minutes for a bus. The sample space here is an interval rather than a finite or discrete set. Taking 2 P.M. as time zero and counting every minute as a unit, the sample space is the interval Ω = [0, 90].
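The switch-versus-stick analysis of the car-goat problem above can also be checked empirically. The following sketch (not part of the original notes; the function names are ours) estimates both winning probabilities by Monte Carlo simulation.

```python
import random

def monty_hall_trial(switch, rng):
    """Play one round of the Monty Hall game; return True on a win."""
    car = rng.randrange(3)       # door hiding the car
    choice = rng.randrange(3)    # contestant's first pick
    # Host opens a door that is neither the pick nor the car.
    opened = next(d for d in range(3) if d != choice and d != car)
    if switch:
        # Switch to the remaining unopened door.
        choice = next(d for d in range(3) if d != choice and d != opened)
    return choice == car

def estimate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)
    wins = sum(monty_hall_trial(switch, rng) for _ in range(trials))
    return wins / trials

print(estimate(switch=True))   # close to 2/3
print(estimate(switch=False))  # close to 1/3
```

With 100,000 trials the two estimates land within about one percentage point of 2/3 and 1/3, matching the sample-space computation above.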
He will have to wait more than ten minutes for a bus if he arrives at the stop at any time:
(a) between 2 and 2:05 P.M.,
(b) between 2:15 and 3:05 P.M.,
(c) between 3:15 and 3:30 P.M.
(Here we take the probability of Brahadeesh arriving at the bus stop exactly at 2:15 P.M. or 3:15 P.M. to be zero.)

Figure 3. Waiting for the DBCity Bus

Now the three events A, B, C as in (a), (b), (c) are "disjoint" events, and their probabilities are intuitively calculated as P A = 5/90, P B = 50/90, P C = 15/90. Thus the probability of Brahadeesh waiting more than ten minutes for a bus is P(A ∪ B ∪ C) = P A + P B + P C = 70/90 = 7/9.

Finally, note that the events discussed here are certain sub-intervals of [0, 90]. In a bit more mathematical language, calculating P B can be thought of as (∫_{15}^{65} dt)/90 = 50/90 = 5/9, modeling B by the interval [15, 65]. There are, however, certain subsets of the closed interval [0, 90] over which integration cannot be carried out. Upon this realization, we need to restrict attention to certain specific families of subsets of [0, 90] which can be handled and which satisfy certain properties. Such a family of subsets is known as a σ-field.

1.2 : Axiomatic Definitions

Definition 1.2.1 : Let X be a non-empty set. A non-empty collection F of subsets of X is called a field (or an algebra) on X if:
(i) E1, E2 ∈ F implies E1 ∪ E2 ∈ F,
(ii) E ∈ F implies E^c ∈ F.

Note that a field on X always contains ∅ and X. Moreover, {∅, X} forms a field, known as the smallest possible field. In Examples 1.1.1(A) and (B), note that the collection P(Ω) of all subsets of Ω (what are these?) forms a field.

Definition 1.2.2 : Let X be a non-empty set. A non-empty collection F of subsets of X is called a σ-field (or a σ-algebra) if:
(i) for countably many elements E1, E2, …, En, … from F, the union ∪_{j=1}^{∞} E_j is also an element of F,
(ii) E ∈ F implies E^c ∈ F.

Note that a σ-field is also a field.
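The value 7/9 obtained in the bus-waiting example can be checked numerically. Below is a minimal Monte Carlo sketch (Python, written for these notes; the function names are ours), sampling the arrival time uniformly from [0, 90] with buses at minutes 15 and 75.

```python
import random

def waits_more_than_ten(t):
    """True if arriving at minute t (after 2 P.M.) means waiting more than
    ten minutes. Buses leave at t = 15 and t = 75 (2:15 and 3:15 P.M.)."""
    if t <= 15:
        return 15 - t > 10   # waiting for the 2:15 bus
    if t <= 75:
        return 75 - t > 10   # waiting for the 3:15 bus
    return True              # both buses gone; the wait exceeds ten minutes

def estimate_wait_probability(trials=200_000, seed=1):
    rng = random.Random(seed)
    hits = sum(waits_more_than_ten(rng.uniform(0, 90)) for _ in range(trials))
    return hits / trials

print(estimate_wait_probability())  # close to 7/9 ≈ 0.778
```

The estimate agrees with the exact computation P(A ∪ B ∪ C) = (5 + 50 + 15)/90 = 7/9 to within sampling error.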
As before, the collection {∅, X} forms the smallest σ-field.

Exercise 1.2.3 : Let F be a field of subsets of a non-empty set X.
(i) If E1, E2, …, Ek are finitely many members of F, show that the union ∪_{j=1}^{k} E_j and the intersection ∩_{j=1}^{k} E_j belong to F.
(ii) If E, F ∈ F, show that E \ F, E∆F ∈ F, where E \ F is the difference set and E∆F is the symmetric difference (E \ F) ∪ (F \ E).

Exercise 1.2.4 : Show that a field on a set X is also a σ-field if X is finite. (Hence the notions of a σ-field and a field do not differ on finite sets.)

While engaged in a random experiment, trying to find the probability of a particular event of interest, we often need to model all possible events. We call the resulting structure a sample space.

Definition 1.2.5 : A sample space of a random experiment is a pair (Ω, S), where:
(i) Ω is the set of all possible outcomes of the experiment,
(ii) S is a σ-algebra on Ω (i.e., S ⊆ P(Ω)).

If (Ω, S) is a sample space, the members of S are the events, which are well understood after a suitable assignment of probability to them. We now turn to defining probability on a sample space.

Definition 1.2.6 : Let (Ω, S) be a sample space. A function P : S → [0, ∞) is called a probability measure (or simply a "probability") if:
(i) P(A) ≥ 0 for all A ∈ S,
(ii) P(Ω) = 1,
(iii) for a sequence {A_j}_{j=1}^{∞} of sets from S which are mutually disjoint (i.e., A_j ∩ A_k = ∅ if j ≠ k), we have

P(∪_{j=1}^{∞} A_j) = Σ_{j=1}^{∞} P(A_j).

P(A) is often referred to as the probability of the event A. If there is no confusion, we simply write P A instead of P(A). The triple (Ω, S, P) is called a probability space.

Let us define probability in the examples considered earlier.

Examples 1.2.7 :

A. Car-goat problem : Let (Ω, S) be the sample space where Ω = {(1, 2, 3, L), (1, 3, 2, L), (2, 3, 1, W), (3, 2, 1, W)} and S = P(Ω).
Define the function P on S by P({(2, 3, 1, W)}) = 1/3, P({(3, 2, 1, W)}) = 1/3, P({(1, 2, 3, L)}) = 1/6, P({(1, 3, 2, L)}) = 1/6. Verify that this defines a probability on the sample space.

B. Tossing coins : Let (Ω, S) be the sample space where Ω is as defined earlier and S = P(Ω). Define the function P on S by P({(x, y, z)}) = 1/8 for every (x, y, z) ∈ Ω. Verify that this defines a probability on the sample space.

C. Waiting for the DBCity bus : In this case Ω = [0, 90]. The σ-algebra here is usually taken to be the collection of all Borel subsets of the interval [0, 90]. Giving the most appropriate probability measure would require knowing what these sets look like in general. However, if A is the interval [a, b] where 0 ≤ a < b ≤ 90, we can simply define P A := (b − a)/90, so that P(Ω) = 1.

Let us now prove some properties of probability as defined.

Properties 1.2.8 : Let (Ω, S, P) be a probability space. Then:

(1) P ∅ = 0.
Proof : Let E1 := Ω and En := ∅ for n ≥ 2. Using property (iii), we have P(Ω) = P(Ω) + Σ_{n=2}^{∞} P(E_n). If P(∅) > 0, then the right-hand side is infinite, since the positive number P(∅) is added infinitely many times, making the sum larger than any positive number (Why?). Hence P(∅) = 0.

(2) P(A1 ∪ A2 ∪ … ∪ Ak) = P A1 + P A2 + … + P Ak if Ai ∈ S and Ai ∩ Aj = ∅ for i ≠ j (finite additivity).
Proof : Consider the sequence {E_n}_{n=1}^{∞} of sets defined by E1 := A1, E2 := A2, …, Ek := Ak and En := ∅ for n ≥ k + 1. Applying property (iii) gives P(A1 ∪ A2 ∪ … ∪ Ak) = P(A1) + P(A2) + … + P(Ak) + 0.

(3) P A = 1 − P A^c for A ∈ S.
Proof : Since A and A^c are disjoint and their union is Ω, by (2) we have P A + P A^c = P(Ω) = 1. (Remember: A^c ∈ S since A ∈ S.)

(4) If E ⊆ F for two sets E, F ∈ S, then P E ≤ P F (monotonicity).
Proof : Since E ⊆ F, the two disjoint sets E and F \ E have union F. Hence by (2) we have 0 ≤ P(E) ≤ P(E) + P(F \ E) = P(F).
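On a finite sample space with S = P(Ω), these properties can be verified mechanically. The sketch below (our own illustration, not from the notes) builds the three-coin-toss space of Example 1.1.1(B) with the uniform measure and checks finite additivity, the complement rule, and monotonicity.

```python
from itertools import product

# Finite probability space: three fair coin tosses with the uniform measure
# (the space from Example 1.1.1(B)).
omega = set(product("HT", repeat=3))

def P(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return len(event & omega) / len(omega)

A = {w for w in omega if w.count("H") >= 2}   # at least two heads
B = {w for w in omega if w.count("T") >= 2}   # at least two tails

# Property (2), finite additivity: A and B are disjoint here.
assert A & B == set()
assert P(A | B) == P(A) + P(B)
# Property (3): P(A) = 1 - P(A^c).
assert P(A) == 1 - P(omega - A)
# Property (4), monotonicity: A is a subset of A ∪ B.
assert P(A) <= P(A | B)

print(P(A), P(B))  # 0.5 0.5, as computed in Example 1.1.1(B)
```

Note that A and B here partition Ω, which is why P(A) and P(B) add up to 1, answering the "(Why?)" posed earlier.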
(5) P(A ∪ B) = P A + P B − P(A ∩ B) for A, B ∈ S.
Proof : The set A ∪ B can be broken into three disjoint sets A \ B, A ∩ B and B \ A. Hence P(A ∪ B) = P(A \ B) + P(A ∩ B) + P(B \ A). But P(A) = P(A \ B) + P(A ∩ B) and P(B) = P(B \ A) + P(A ∩ B). Substituting for P(A \ B) and P(B \ A) in the first equation gives the result.

Example 1.2.9 : Two dice are rolled. Find the probability that the total shown is at most 5 or at least 9.

Solution : Take the set of all possible outcomes to be Ω = {(i, j) : 1 ≤ i, j ≤ 6}, which has 36 equally likely elements. (We keep the dice ordered: the 21 unordered pairs are not equally likely, since a pair like {1, 2} arises in two ways while {1, 1} arises in only one.) Let A denote the event that the total is at most 5 and B the event that the total is at least 9; we need P(A ∪ B). Now A consists of the ordered pairs with i + j ≤ 5, of which there are 10, and B of those with i + j ≥ 9, of which there are also 10. Hence P A = 10/36 and P B = 10/36. Since A ∩ B = ∅, we have P(A ∪ B) = P A + P B = 20/36 = 5/9.

Example 1.2.10 : A box contains 1000 light bulbs. The probability that there is at least 1 defective bulb in the box is 0.1, and the probability that there are at least 2 defective bulbs is 0.05. Find the probability of each of the following:
(a) the box contains no defective bulbs;
(b) the box contains exactly 1 defective bulb;
(c) the box contains at most 1 defective bulb.

Solution : Let A denote the event of getting at least one defective bulb and B the event of getting at least two defective bulbs. It is given that P A = 0.1 and P B = 0.05.
(a) Let C denote the event of getting no defective bulb. Then C = A^c and P C = 1 − P A = 0.9.
(c) If D = {at most one defective bulb}, then D = B^c and P D = 1 − P B = 0.95.
(b) Now the event E = {exactly one defective bulb} = A ∩ D, and A ∪ D is the event of getting at least one or at most one defective bulb.
Since these together exhaust all possible outcomes, we have P(A ∪ D) = 1, and therefore P(A ∩ D) = P A + P D − P(A ∪ D) = 0.1 + 0.95 − 1 = 0.05.

Example 1.2.11 : An absent-minded secretary places n letters in n envelopes at random. Determine the probability that he or she misplaces every letter.

Let Ω be the set of all permutations by which the n letters are placed in the n envelopes (the envelopes are marked by the digits 1, 2, …, n). We assume that the i-th letter is supposed to go into the i-th envelope. Let A_i be the event that the i-th letter goes into the i-th envelope. We note that:

P A_i = (n − 1)!/n!,
P(A_i ∩ A_j) = (n − 2)!/n! (i ≠ j),
P(A_i ∩ A_j ∩ A_k) = (n − 3)!/n! (i, j, k distinct), …

Finally, we apply the following formula.

Theorem 1.2.12 : (Principle of Inclusion-Exclusion) Let A1, A2, …, An ∈ S. Then:

P(∪_{k=1}^{n} A_k) = Σ_{k=1}^{n} P(A_k) − Σ_{k1<k2} P(A_{k1} ∩ A_{k2}) + Σ_{k1<k2<k3} P(A_{k1} ∩ A_{k2} ∩ A_{k3}) − … + (−1)^{n+1} P(∩_{k=1}^{n} A_k).

This gives:

P(∪_{k=1}^{n} A_k) = 1 − 1/2! + 1/3! − … + (−1)^{n+1} 1/n!,

so the probability that every letter is misplaced is

1 − P(∪_{k=1}^{n} A_k) = 1/2! − 1/3! + … + (−1)^n 1/n!.

It is interesting to see that as the number of letters increases, this probability converges to e^{−1}.
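The derangement probability from Example 1.2.11 can be verified by brute force for small n. The sketch below (our own check, not part of the notes) compares the inclusion-exclusion formula against direct enumeration of fixed-point-free permutations, and shows the convergence toward e^{−1}.

```python
import math
from itertools import permutations

def derangement_probability_formula(n):
    """P(every letter misplaced) = sum_{k=0}^{n} (-1)^k / k!,
    obtained from the inclusion-exclusion computation above
    (the k = 0 and k = 1 terms cancel)."""
    return sum((-1) ** k / math.factorial(k) for k in range(n + 1))

def derangement_probability_bruteforce(n):
    """Count permutations with no fixed point, divided by n!."""
    count = sum(
        all(p[i] != i for i in range(n))
        for p in permutations(range(n))
    )
    return count / math.factorial(n)

# The formula matches exhaustive enumeration for small n.
for n in range(2, 8):
    assert math.isclose(derangement_probability_formula(n),
                        derangement_probability_bruteforce(n))

print(derangement_probability_formula(10))  # close to 1/e ≈ 0.3679
```

Already at n = 10 the probability agrees with e^{−1} to six decimal places; the partial sums oscillate around e^{−1} rather than approaching it from one side, which is why "converges to" is the right statement.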