Tutorial 8, STAT1301 Fall 2010, 16NOV2010, MB103@HKU
By Joseph Dong

Recall: A Partition on a Set
• Any exhaustive and pairwise disjoint collection of subsets of a given set forms a partition of that set.
• E.g. $\{B, B^c\}$ forms a trivial partition of the presumed set $S = B \cup B^c$.
• If $f: S \to T = \{1, 2, 3\}$, then the collection of pre-images of atoms of the range, $\{f^{-1}(\{1\}), f^{-1}(\{2\}), f^{-1}(\{3\})\}$, forms a partition of the domain $S$.

Recall: Conditioning on a Partition
• Shares the same idea with
  - Divide and Conquer
  - Casewise enumeration
  - A tree diagram
• Formal language:
  - Goal = find the probability of event $E$, $\mathbb{P}(E)$.
  - This is equivalent to finding the probability of its intersection with the sure event $\Omega$: $\mathbb{P}(E) \equiv \mathbb{P}(E \cap \Omega)$.

Recall: Conditioning on a Partition (cont'd)
• Formal language (continued)
  - Now break down the sure event into a number of manageable smaller pieces that together form a partition $\{A_k \mid k \in I\}$ of the sure event $\Omega$ (with $I$ countable).
  - If we investigate all such events $E \cap A_k$, then we're done: $\mathbb{P}(E \cap \Omega) = \sum_{k \in I} \mathbb{P}(E \cap A_k)$
  - The core of the problem now becomes finding each $\mathbb{P}(E \cap A_k)$, and this is where the conditioning takes place: $\mathbb{P}(E \cap A_k) = \mathbb{P}(E \mid A_k) \cdot \mathbb{P}(A_k)$
  - This assumes that $\mathbb{P}(E \mid A_k)$ and $\mathbb{P}(A_k)$ are the more straightforward quantities to find.

Recall: What does an R.V. do to its State Space?
• An r.v. cuts the state space into blocks. On each of these blocks, the r.v. sends all points there to a common atom in the sample space.
• An r.v. causes a partition on the state space.
• Conversely, given a partition on the state space, you can also define a random variable on it that "conforms" to the partition by taking one value on each block.
[Figure labels: Random Variable; Partition on $\Omega$]

Conditioning an Event on an R.V.
• Since an r.v. cuts the state space into a partition, conditioning on an r.v. is just conditioning on the partition it causes on the state space.
• The meaning of $\mathbb{P}(E \mid X)$ is now clearly illustrated on the right.
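The decomposition $\mathbb{P}(E) = \sum_k \mathbb{P}(E \mid A_k) \cdot \mathbb{P}(A_k)$ from the slides above can be checked on a small finite example. The following is only a sketch: the two-dice setup, the event `E`, and the helper names `P` and `cond_prob` are illustrative assumptions, not taken from the handout. The partition is the one caused by the r.v. "value of the first die".

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two fair dice, 36 equally likely outcomes in Omega.
omega = list(product(range(1, 7), repeat=2))
prob = {w: Fraction(1, 36) for w in omega}

def P(event):
    """Probability of an event given as a set of outcomes."""
    return sum(prob[w] for w in event)

def cond_prob(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return P(event & given) / P(given)

# Event E: the two dice sum to 8.
E = {w for w in omega if w[0] + w[1] == 8}

# Partition caused by the r.v. X = first die: blocks A_k = {X = k}.
partition = {k: {w for w in omega if w[0] == k} for k in range(1, 7)}

# Divide and conquer: sum of P(E | A_k) * P(A_k) over all blocks.
total = sum(cond_prob(E, A_k) * P(A_k) for A_k in partition.values())
direct = P(E)

print(total, direct)  # both are 5/36
```

Exact fractions are used instead of floats so that the two sides can be compared with `==` rather than up to rounding.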
$\mathbb{P}(E \mid X)$ as a Random Variable
• It contains a random variable $X$ inside, making itself a function of $X$.
• It has a distribution and an expectation.
  - LOTUS (law of the unconscious statistician)
  - Question: What's the meaning of its expected value?
• To fix its value by fixing an $X$ value:
  - $\mathbb{P}(E \mid X = x_1)$, $\mathbb{P}(E \mid X \in \{x_1, x_2\})$
  - Every fixed value is now a conditional probability involving two events.

Exercise: Finding $\mathbb{P}(E)$ from $\mathbb{P}(E \mid X)$
• This is the prototypical problem of finding the probability of an event via the technique of conditioning on a random variable.
• Hint: Ponder the link between the Law of Total Probability and Expectation.
• Ans: $\mathbb{P}(E) = \mathbb{E}[\mathbb{P}(E \mid X)]$

$\mathbb{P}(Y \mid X)$
• It involves two r.v.'s now.
• $\mathbb{P}(Y \mid X)$ is a function of the bivariate random vector $(X, Y)$.
• Fixing $X$ will give you back the conditional density of $Y$ given $X$ at the fixed position.
• Given $\mathbb{P}(Y \mid X)$:
  - Q1: How to find $\mathbb{P}(Y = \ldots$ [truncated]

Conditional, Marginal, and Joint densities
• Difference among 3 types of densities:
  - a conditional density $\mathbb{P}(Y \mid X)$
    - is normalized by the marginal $\mathbb{P}(X)$
    - is a point value divided by a row sum/integral (the row being the slice on which $X$ is fixed)
    - is the density of $Y \mid X$
  - a joint density $\mathbb{P}(X, Y)$
    - is normalized by the entire joint space
    - is a point value divided by the sum/integral over the entire space
    - is the density of $(X, Y)$
  - a marginal density $\mathbb{P}(Y)$
    - is also normalized by the entire space
    - is a sum/integral over the slice on which $Y$ is fixed, divided by the sum/integral over the entire space
    - is the density of $Y$

Handout Problem 1

Recall: What's the Expectation of a random variable
• First of all, the random variable has to be numerically valued. That's why expectation is also known as the "expected value" and is a numerical characteristic of the sample space (a subset of $\mathbb{R}$, or simply $\mathbb{R}$ itself with zero density assigned to the impossible points).
  $\mathbb{E}[X] = \int_{-\infty}^{+\infty} x f_X(x) \, dx$
• The expectation is both conceptually and technically equivalent to the location of the center of probability mass of the sample space.
• Expectation provides only partial information about the random variable because it eliminates randomness by giving you back only one representative point of the sample space.

For example,
• $E \mid X$ is a set-valued random variable.
  - Given $X = x$, it evaluates to the set $E \cap \{X = x\}$.
  - We cannot have an expected value defined for $E \mid X$.
• Clarification: $E \mid X$ is not $\mathbb{P}(E \mid X)$. The latter is numerically valued, as we have previously established for its expected value: $\mathbb{E}[\mathbb{P}(E \mid X)] = \mathbb{P}(E)$.
• More elaboration: On the set-theory layer, $E \mid X$ is not strictly different from the set-r.v. pair $(E, X)$. But on the probability-theory layer, $\mathbb{P}(E \mid X)$ is normalized by a different space than $\mathbb{P}(E, X)$ is.

$\mathbb{E}[X \mid E]$
• $X \mid E$ is a numerically valued random variable. We can compute its expected value.
• $\mathbb{E}[X \mid E]$ vs $\mathbb{E}[X]$: their sample spaces are different.
• Compute $\mathbb{E}[X \mid E]$ using $\mathbb{P}_E = \mathbb{P}(\cdot \mid E) := \mathbb{P}(\cdot, E) / \mathbb{P}(E)$:
  $\mathbb{E}[X \mid E] = \int_\Omega X(\omega) \, \frac{\mathbb{P}(d\omega, E)}{\mathbb{P}(E)} = \int_{-\infty}^{+\infty} x f_{X \mid E}(x) \, dx$
• Compute $\mathbb{E}[X]$ using $\mathbb{P}$:
  $\mathbb{E}[X] = \int_\Omega X(\omega) \, \mathbb{P}(d\omega) = \int_{-\infty}^{+\infty} x f_X(x) \, dx$

Warm-up exercise
• Handout Problem 2

$\mathbb{E}[Y \mid X]$: concepts
• First of all, this is a random variable, namely a function of $X$.
• Its randomness comes from the state space of $X$, but the mapping mechanism is worked out jointly by $X$ and $Y$.
• This expression is known as the conditional expectation of the conditionee $Y$ given the conditioner $X$.
• The expectation is taken with respect to $Y$.
  - To be precise, we should say w.r.t. $Y \mid X$.
  - There are multiple (or even a continuum of) sample spaces of $Y \mid X$, depending on which atom value $X$ takes. After fixing $X$ to an atom, or equivalently, to a block in the state space that has been partitioned by $X$, the expression $\mathbb{E}[Y \mid X = x_1]$ is just a constant.
  - The expectation eliminates the randomness of $Y$ given $X$.

$\mathbb{E}[Y \mid X]$ as an r.v.
• It uses the joint state space of $X$ and $Y$ as its own state space.
• It uses a degenerated version of the sample space of $Y$ as its own sample space.
  - The degeneration preserves the locus of the overall center of mass.
  - Each point in the degenerated space is a block center of mass.

"Degeneration preserves overall center of mass"
• $X$ cuts its own state space as well as the joint state space of $X$ and $Y$.
• This partition of the joint state space is mapped by $Y$ to a partition of its own sample space (a set of numbers).
• Then the expression $\mathbb{E}[Y \mid X = x_1]$ represents the locus of the center of mass of the first block of the partition.
• $\mathbb{E}[Y \mid X]$ represents the totality of loci of these block centers of mass.

Exercise: Finding $\mathbb{E}[Y]$ from $\mathbb{E}[Y \mid X]$
• This is the prototypical problem of finding the expectation of a random variable via the technique of conditioning on another random variable.
• Ans. $\mathbb{E}[Y] = \mathbb{E}[\mathbb{E}[Y \mid X]]$
• In the divide-conquer-merge paradigm:
  - Divide is done by the conditioner $X$.
  - Conquer refers to the inner expectation carried out on each division.
  - Merge refers to the outer expectation that pieces the whole plate back together. This exercise addresses the merge step.
• Compare with conditional probability and ponder the link between them.

Conditional Variance
• Finding variance by conditioning: $\mathbb{V}(Y) = \mathbb{E}[\mathbb{V}(Y \mid X)] + \mathbb{V}(\mathbb{E}[Y \mid X])$
• Pf.
• Unfortunately, the degeneration of the sample space of $Y$ does not preserve second moments: averaging the block variances alone ignores the spread of the block centers of mass.
• That's why there is the addendum $\mathbb{V}(\mathbb{E}[Y \mid X])$ in the formula.

Summary: Conditional Expectation
• The key observations are
  - Obs1: To find the center of mass of a piece of material, you can divide it into a few blocks, find their centers of mass, and then find the center of mass of these block centers of mass (each weighted by its block's mass). The initial division of the piece is quite arbitrary.
    - This fundamental law of physics supports the many nice properties of expectation in the calculus of probability.
  - Obs2: A random variable partitions its state space into a collection of atom-valued blocks.
    - This suggests using a random variable as a general device to divide the piece mentioned in Obs1. Such a random variable is called the conditioner.

Linking $\mathbb{P}(E \mid X)$ to $\mathbb{E}[Y \mid X]$
• Trick: Use the indicator of the set $E$.
  The indicator is a Bernoulli random variable.
• Reason: $\mathbb{P}(E \mid X) \equiv \mathbb{E}[I_E \mid X]$
• Conclusion: The conditional probability of an event conditioned on a random variable (a partition) is, in disguise, a conditional expectation of the indicator of that event conditioned on the same random variable.
• All properties of conditional expectation should apply to conditional probability. For example, the Law of Total Probability is just $\mathbb{E}[Y] \equiv \mathbb{E}[\mathbb{E}[Y \mid X]]$ in disguise (take $Y = I_E$).

Choosing Conditioner
• The art of conditioning lies in the choice of the conditioner.
• Usually, if our unknown target is the r.v. $Y$, and we know that $Y$ is a known function of a known r.v. $X$, then it would be natural to use $X$ as the conditioner for $Y$, that is
  - Divide the state space of $Y$ by $X$
  - Conquer every $\mathbb{E}[Y \mid X]$
  - Merge them into $\mathbb{E}[Y]$

Exercises
• Handout problem 3
• Handout problem 4
• Handout problem 5
• Handout problem 6
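The divide-conquer-merge recipe can be run end to end on a small finite example. The sketch below is illustrative only: the two-dice setup, the choice of $X$ (first die) and $Y$ (sum of both dice), and the helper names `expect` and `var` are assumptions, not part of the handout. It checks $\mathbb{E}[Y] = \mathbb{E}[\mathbb{E}[Y \mid X]]$, the variance decomposition, and the indicator link $\mathbb{P}(E \mid X) = \mathbb{E}[I_E \mid X]$ with exact fractions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two fair dice, uniform measure on 36 outcomes.
omega = list(product(range(1, 7), repeat=2))

def X(w):
    return w[0]          # conditioner: first die

def Y(w):
    return w[0] + w[1]   # target: sum of both dice

def expect(f, block=omega):
    """E[f | block] under the uniform measure restricted to the block."""
    return sum(Fraction(f(w)) for w in block) / len(block)

def var(f, block=omega):
    """V(f | block) = E[(f - E[f | block])^2 | block]."""
    m = expect(f, block)
    return expect(lambda w: (f(w) - m) ** 2, block)

# Divide: X partitions omega into the blocks {X = k}.
blocks = {k: [w for w in omega if X(w) == k] for k in range(1, 7)}
weight = {k: Fraction(len(b), len(omega)) for k, b in blocks.items()}

# Conquer: center of mass and variance inside each block.
cond_mean = {k: expect(Y, b) for k, b in blocks.items()}
cond_var = {k: var(Y, b) for k, b in blocks.items()}

# Merge: outer expectation over the blocks.
E_Y_tower = sum(weight[k] * cond_mean[k] for k in blocks)
mean_of_var = sum(weight[k] * cond_var[k] for k in blocks)
var_of_mean = sum(weight[k] * (cond_mean[k] - E_Y_tower) ** 2 for k in blocks)
total_var = var(Y)

# Indicator trick: P(E | X = k) is the block mean of the indicator of E = {Y = 8}.
def I_E(w):
    return 1 if Y(w) == 8 else 0

P_E_given_X = {k: expect(I_E, b) for k, b in blocks.items()}
P_E_tower = sum(weight[k] * P_E_given_X[k] for k in blocks)

print(E_Y_tower, expect(Y))                   # 7 7
print(mean_of_var + var_of_mean, total_var)   # 35/6 35/6
print(P_E_tower, expect(I_E))                 # 5/36 5/36
```

In this example the two variance terms happen to be equal (35/12 each): the block variances come from the second die alone, and the spread of the block centers comes from the first die alone.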

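The "Pf." on the Conditional Variance slide is left blank. One standard derivation, sketched here under the assumption that $\mathbb{E}[Y^2] < \infty$, applies $\mathbb{E}[\,\cdot\,] = \mathbb{E}[\mathbb{E}[\,\cdot \mid X]]$ to both $Y^2$ and $Y$; it is not necessarily the argument given in the tutorial.

```latex
\begin{align*}
\mathbb{V}(Y)
  &= \mathbb{E}[Y^2] - \big(\mathbb{E}[Y]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{E}[Y^2 \mid X]\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{V}(Y \mid X) + \big(\mathbb{E}[Y \mid X]\big)^2\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{V}(Y \mid X)\big]
     + \Big( \mathbb{E}\big[\big(\mathbb{E}[Y \mid X]\big)^2\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \Big) \\
  &= \mathbb{E}\big[\mathbb{V}(Y \mid X)\big] + \mathbb{V}\big(\mathbb{E}[Y \mid X]\big).
\end{align*}
```

The third line uses $\mathbb{V}(Y \mid X) = \mathbb{E}[Y^2 \mid X] - (\mathbb{E}[Y \mid X])^2$, and the last line recognizes the bracketed difference as the variance of the random variable $\mathbb{E}[Y \mid X]$.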