Probability on Finite Sets
S. Lall, Stanford, 2011.01.04.01

Outline
• Modeling physical phenomena
• Probability on finite sets
• The sample space and the pmf
• Events
• Unions, intersections and complements
• The axioms of probability
• Partitions
• Conditional probability
• Independence
• Dependent events

Modeling Physical Phenomena

• The deterministic way: write differential equations. Differential equations are constructed from physical principles and from experiments. We use the model to predict the approximate behavior of the system.

• The probabilistic way: specify the probability of events. Probabilities are derived from physical principles (e.g., time symmetry for noise) and from experiments (i.e., the observed frequency of events). Again, we use the model to predict the approximate behavior of the system; e.g., in approximately 95% of trials the hovercraft will deviate less than 3 m from the given trajectory.

Probability on Finite Sets

• The sample space is a finite set Ω; its elements are called outcomes. Exactly one outcome occurs in every experiment.

• A function p : Ω → [0, 1] is called a probability mass function (pmf) if

  p(a) ≥ 0 for all a ∈ Ω   and   Σ_{a∈Ω} p(a) = 1

Then p(a) is the probability that outcome a ∈ Ω occurs.

(Figures: the pmf of the sum of three dice and the pmf of the product of two dice.)

Events

An event is a subset of Ω. For example, if Ω = {1, . . . , 2n}, the following are events:

• A = {2, 4, 6, . . . , 2n}, which we would call the event that the outcome is even
• A = {x ∈ Ω | x ≥ 32}, which we would call the event that the outcome is ≥ 32

The probability of an event is

  Prob(A) = Σ_{b∈A} p(b)

Prob : 2^Ω → [0, 1] is called a probability measure.

Example: Prob(A) = 0.275, Prob(B) = 0.65. (Figure: Venn diagram of events A and B.)

Unions, Intersections and Complements

For any sets A, B ⊂ Ω we have

  Prob(A ∪ B) = Prob(A) + Prob(B) − Prob(A ∩ B)

We interpret:
• A ∪ B is the event that A or B happens
• A ∩ B is the event that A and B both happen
• A^c = {b ∈ Ω | b ∉ A} is the event that A does not happen

Notation

Notice that Prob really depends on
• the sample space Ω
• and the probability mass function p

Sometimes we will write Prob_{Ω,p}(A) to specify which Ω and p are being used.

Axioms of Probability

We have, for all A, B ⊂ Ω:
(i) Prob(A) ≥ 0
(ii) Prob(Ω) = 1
(iii) if A ∩ B = ∅ then Prob(A ∪ B) = Prob(A) + Prob(B)

• The above three conditions are called the axioms of probability for finite sets Ω.
• If Prob : 2^Ω → R satisfies the above, then we can construct a probability mass function via

  p(b) = Prob({b}) for all b ∈ Ω

and p will be nonnegative and sum to one as required.

Partitions

The set of events A1, A2, . . . , An is called a partition of Ω if
• Ai ∩ Aj = ∅ for all i ≠ j (the events are mutually exclusive)
• A1 ∪ A2 ∪ · · · ∪ An = Ω (the events are collectively exhaustive)

Then for any B ⊂ Ω we have

  Prob(B) = Σ_{i=1}^{n} Prob(B ∩ Ai)

called the law of total probability. (Figure: a partition A1, . . . , A5 of Ω and an event B.)
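As a quick numerical check of the law of total probability above, here is a minimal Python sketch (not part of the original slides; the die example and the names omega, pmf, and prob are illustrative assumptions):

```python
from fractions import Fraction

# Illustrative sample space: one fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
pmf = {a: Fraction(1, 6) for a in omega}      # p(a) = 1/6 for every outcome a

def prob(event):
    """Prob(A) = sum of p(b) over b in the event A."""
    return sum(pmf[b] for b in event)

# An event B and a partition {A1, A2} of omega
# (mutually exclusive, collectively exhaustive: odd and even outcomes).
B = {a for a in omega if a >= 3}
A1 = {a for a in omega if a % 2 == 1}
A2 = {a for a in omega if a % 2 == 0}

# Law of total probability: Prob(B) = Prob(B ∩ A1) + Prob(B ∩ A2).
assert prob(B & A1) + prob(B & A2) == prob(B)
print(prob(B))    # 2/3
```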
Conditional Probability

Suppose A and B are events, and Prob(B) ≠ 0. Define the conditional probability of A given B by

  Prob(A | B) = Prob(A ∩ B) / Prob(B)

Example: suppose B = {x ∈ Ω | x ≥ 10}. (Figure: the pmf before conditioning and the conditional pmf given B.) If we perform many repeated experiments and throw away all outcomes x ∉ B, then the observed frequency of the outcomes x ∈ B will increase.

Conditioning defines a new probability mass function on Ω. The conditional pmf is

  p2(a) = p(a) / Prob(B) if a ∈ B, and p2(a) = 0 otherwise

Then we have, for any A ⊂ Ω,

  Prob_{Ω,p}(A | B) = Prob_{Ω,p2}(A)

(Figure: the pmf p on Ω and the conditional pmf p2.)

Independence

Two events A and B are called independent if

  Prob(A ∩ B) = Prob(A) Prob(B)

• If Prob(B) ≠ 0 this is equivalent to Prob(A | B) = Prob(A).
• If A and B are dependent, then knowing whether event A occurs also gives information regarding event B.

Events A and B are independent if and only if rank(M) = 1, where

  M = [ Prob(A ∩ B)     Prob(A ∩ B^c)  ]
      [ Prob(A^c ∩ B)   Prob(A^c ∩ B^c) ]

M is called the joint probability matrix. (Figure: Venn diagram of events A and B.)

• A and B are independent means that the probability of A occurring does not change when we discard the outcomes for which B does not occur.
• The probabilities of A and A^c occurring are the row sums

  [ Prob(A)   ]  =  M 1
  [ Prob(A^c) ]

When rank(M) = 1, each column is some multiple of M1, so

  M = [ Prob(A)   ]  [ Prob(B)   Prob(B^c) ]
      [ Prob(A^c) ]

Example: Two Dice

Two dice. Pick the sample space

  Ω = { (ω1, ω2) | ωi ∈ {1, 2, . . . , 6} }

Two events are
• the sum is greater than 5: A = { ω ∈ Ω | ω1 + ω2 > 5 }
• the first die is greater than 3: B = { ω ∈ Ω | ω1 > 3 }

(Figure: the events A and B shown on the 6 × 6 grid of outcomes.)

By measuring B, we have information about A, because

  Prob(A) = 26/36   and   Prob(A | B) = 17/18

• This is an example of estimation: by measuring one random quantity, we gain information about another.
• More refined questions: what is the conditional distribution of the sum? What should we pick as an estimate?
• Later we will see problems of the form y = Ax + w, where w is random, we measure y, and we would like to know x.
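To make the two-dice example concrete, here is a short Python sketch (not from the original slides; the names omega, pmf, and prob are illustrative) that reproduces Prob(A) = 26/36 and Prob(A | B) = 17/18, and applies the rank-1 test for independence via the 2 × 2 determinant of the joint probability matrix:

```python
from fractions import Fraction
from itertools import product

# Sample space for the two-dice example: all pairs (w1, w2), uniform pmf.
omega = list(product(range(1, 7), repeat=2))
pmf = {w: Fraction(1, 36) for w in omega}

def prob(event):
    """Prob(A) = sum of p(w) over w in the event A."""
    return sum(pmf[w] for w in event)

A = {w for w in omega if w[0] + w[1] > 5}     # the sum is greater than 5
B = {w for w in omega if w[0] > 3}            # the first die is greater than 3

print(prob(A))                  # 13/18  (= 26/36)
print(prob(A & B) / prob(B))    # 17/18  -> Prob(A | B)

# Joint probability matrix M; A and B are independent iff rank(M) = 1,
# i.e. iff the 2x2 determinant of M is zero.
Ac = set(omega) - A
Bc = set(omega) - B
M = [[prob(A & B),  prob(A & Bc)],
     [prob(Ac & B), prob(Ac & Bc)]]
independent = M[0][0] * M[1][1] == M[0][1] * M[1][0]
print(independent)              # False: knowing B changes the probability of A
```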