Tutorial 8, STAT1301 Fall 2010, 16NOV2010,
MB103@HKU
By Joseph Dong
Recall: A Partition on a Set
• Any exhaustive and disjoint collection of subsets of a given set forms a partition of that set.
• E.g.
  ◦ $\{B, B^c\}$ forms a trivial partition of the underlying set $S = B \cup B^c$.
  ◦ If $f: S \to T = \{1, 2, 3\}$, then the collection of pre-images of the atoms of the range, $\{f^{-1}(1), f^{-1}(2), f^{-1}(3)\}$, forms a partition of the domain $S$.

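The pre-image example can be checked on a small finite set. The sketch below is only an illustration: the set S = {0, ..., 8}, the map f(s) = s % 3 + 1, and the helper name preimage_partition are all assumptions made for the example, not part of the tutorial.

```python
from collections import defaultdict

def preimage_partition(domain, f):
    """Group the points of `domain` by their image under `f`."""
    blocks = defaultdict(set)
    for s in domain:
        blocks[f(s)].add(s)
    return dict(blocks)

# Hypothetical example: S = {0, ..., 8} and f(s) = s % 3 + 1, so T = {1, 2, 3}.
S = set(range(9))
blocks = preimage_partition(S, lambda s: s % 3 + 1)

# Exhaustive: the blocks together cover S.
is_exhaustive = set().union(*blocks.values()) == S
# Disjoint: given that the blocks cover S, their sizes add up to |S|
# only if no point sits in two blocks.
is_disjoint = sum(len(b) for b in blocks.values()) == len(S)

print(blocks, is_exhaustive, is_disjoint)
```

Each key of blocks is an atom of the range and each value is its pre-image, i.e. one block of the partition of S.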
Recall: Conditioning on a Partition
• Shares the same idea with
  ◦ Divide and conquer
  ◦ Casewise enumeration
  ◦ A tree diagram
• Formal language:
  ◦ Goal: find the probability of the event $E$, $\mathbb{P}(E)$.
  ◦ This is equivalent to finding the probability of its intersection with the sure event $\Omega$:
$$\mathbb{P}(E) \equiv \mathbb{P}(E \cap \Omega).$$

Recall: Conditioning on a Partition (cont’d)
• Formal language (continued)
  ◦ Now break the sure event down into a number of manageable smaller pieces that together form a partition $\{A_k \mid k \in I\}$ of the sure event $\Omega$.
  ◦ If we investigate all the events $E \cap A_k$, then we’re done:
$$\mathbb{P}(E \cap \Omega) = \sum_{k \in I} \mathbb{P}(E \cap A_k).$$
  ◦ The core of the problem now becomes finding each $\mathbb{P}(E \cap A_k)$, and this is where the conditioning takes place:
$$\mathbb{P}(E \cap A_k) = \mathbb{P}(E \mid A_k) \cdot \mathbb{P}(A_k).$$
• This assumes that it is more straightforward to find $\mathbb{P}(E \mid A_k)$ and $\mathbb{P}(A_k)$.

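As a small numerical check of the decomposition above, here is a sketch under assumed inputs: two fair dice, the event E = "the sum is 8", and the partition of the sure event by the value of the first die. None of these choices come from the tutorial; they are only picked to make the sum over the blocks easy to verify.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two fair dice, all 36 outcomes (first, second) equally likely.
omega = list(product(range(1, 7), repeat=2))
prob = {w: Fraction(1, 36) for w in omega}

def P(event):
    """Probability of an event given as a set of outcomes."""
    return sum(prob[w] for w in event)

def P_given(event, cond):
    """Conditional probability P(event | cond)."""
    return P(event & cond) / P(cond)

# E: the two dice sum to 8.
E = {w for w in omega if w[0] + w[1] == 8}

# Partition A_1, ..., A_6 of the sure event by the value of the first die.
A = {k: {w for w in omega if w[0] == k} for k in range(1, 7)}

# Sum over the blocks of P(E | A_k) * P(A_k).
total = sum(P_given(E, A[k]) * P(A[k]) for k in A)

print(P(E), total)
```

Both numbers agree, because each product P(E | A_k) * P(A_k) is just P(E ∩ A_k) and the blocks A_k are disjoint and exhaustive.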
Recall: What does an R.V. do to its State Space?
• An r.v. cuts the state space into blocks. On each of these blocks, the r.v. sends all points to a common atom in the sample space.
  ◦ An r.v. induces a partition on the state space.
• Conversely, given a partition on the state space, you can define random variables on it that “conform” to the partition by taking one value on each block.
[Diagram labels: Random Variable; Partition on $\Omega$]

Conditioning an Event on an R.V.
• Since an r.v. cuts the state space into a partition, conditioning on an r.v. is just conditioning on the partition it induces on the state space.
• The meaning of $\mathbb{P}(E \mid X)$ is now clearly illustrated on the right of the slide.

$\mathbb{P}(E \mid X)$ as a Random Variable
• It contains the random variable $X$, which makes it a function of $X$.
• It has a distribution and an expectation.
  ◦ LOTUS (the law of the unconscious statistician)
  ◦ Question: what is the meaning of its expected value?
• To fix its value, fix the value of $X$:
  ◦ $\mathbb{P}(E \mid X = x_1)$, $\mathbb{P}(E \mid X \in \{x_1, x_2\})$
  ◦ Every fixed value is now a conditional probability involving two events.

Exercise: Finding $\mathbb{P}(E)$ from $\mathbb{P}(E \mid X)$
• This is the prototypical problem of finding the probability of an event via the technique of conditioning on a random variable.
• Hint: ponder the link between the Law of Total Probability and expectation.
• Ans:
$$\mathbb{P}(E) = \mathbb{E}\big[\mathbb{P}(E \mid X)\big]$$

$\mathbb{P}(Y \mid X)$
• It involves two r.v.’s now.
• $\mathbb{P}(Y \mid X)$ is a function of the bivariate random vector $(X, Y)$.
  ◦ Fixing $X$ gives you back the conditional density of $Y$ given $X$ at the fixed position.
• Given $\mathbb{P}(Y \mid X)$:
  ◦ Q1: How to find $\mathbb{P}(Y)$?

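Both questions above, finding $\mathbb{P}(E)$ from $\mathbb{P}(E \mid X)$ and finding $\mathbb{P}(Y)$ from $\mathbb{P}(Y \mid X)$, amount to averaging the conditional quantity over the distribution of $X$. The sketch below illustrates this for a discrete $X$; every number in the tables is made up for the example.

```python
from fractions import Fraction as F

# Assumed (made-up) distribution of the conditioner X.
p_x = {0: F(1, 2), 1: F(1, 3), 2: F(1, 6)}

# Assumed conditional probabilities P(E | X = x).
p_e_given_x = {0: F(1, 10), 1: F(1, 2), 2: F(9, 10)}

# P(E) = E[P(E | X)]: average the function x -> P(E | X = x) over X.
p_e = sum(p_e_given_x[x] * p_x[x] for x in p_x)

# Assumed conditional pmf of Y given X = x, stored as {x: {y: P(Y = y | X = x)}}.
p_y_given_x = {
    0: {0: F(1, 2), 1: F(1, 2)},
    1: {0: F(1, 4), 1: F(3, 4)},
    2: {0: F(0), 1: F(1)},
}

# Marginal of Y: P(Y = y) = sum over x of P(Y = y | X = x) * P(X = x).
p_y = {y: sum(p_y_given_x[x][y] * p_x[x] for x in p_x) for y in (0, 1)}

print(p_e, p_y)
```

The same weighted sum appears twice: once with the event E held fixed, and once for each value y of Y.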
Conditional, Marginal, and Joint Densities
• Differences among the three types of densities (taking rows to index $X$ and columns to index $Y$):
• A conditional density $\mathbb{P}(Y \mid X)$
  ◦ is normalized by the marginal probability $\mathbb{P}(X)$
  ◦ is a point divided by a row sum/integral (over all values of $Y$ at the fixed $X$)
  ◦ is the density of $Y \mid X$
• A joint density $\mathbb{P}(X, Y)$
  ◦ is normalized over the entire joint space
  ◦ is a point divided by the sum/integral over the entire space
  ◦ is the density of $(X, Y)$
• A marginal density $\mathbb{P}(Y)$
  ◦ is also normalized over the entire space
  ◦ is a column sum (over all values of $X$ at the fixed $Y$) divided by the sum/integral over the entire space
  ◦ is the density of $Y$

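The three normalizations can be seen side by side on a small table. The following is a sketch with a made-up 2 x 3 table of weights, where rows index X and columns index Y; that orientation is an assumption of the example.

```python
from fractions import Fraction

# Made-up table of weights: rows index X in {0, 1}, columns index Y in {0, 1, 2}.
counts = [[1, 2, 1],
          [3, 1, 2]]
total = sum(sum(row) for row in counts)

# Joint density: each point divided by the sum over the entire space.
joint = [[Fraction(c, total) for c in row] for row in counts]

# Marginals: a row sum (for X) or a column sum (for Y) of the joint density.
marginal_x = [sum(row) for row in joint]
marginal_y = [sum(col) for col in zip(*joint)]

# Conditional density of Y given X = x: each point divided by its row sum.
cond_y_given_x = [[p / marginal_x[i] for p in row]
                  for i, row in enumerate(joint)]

print(marginal_y)
print(cond_y_given_x)
```

Every row of cond_y_given_x sums to 1, whereas the rows of joint sum to the marginal of X; this is the "normalized by a different space" distinction.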
Handout Problem 1

Recall: What’s the Expectation of a Random Variable?
• First of all, the random variable has to be numerically valued. That’s why the expectation is also known as the “expected value”. It is a numerical characteristic of the sample space (a subset of $\mathbb{R}$, or simply $\mathbb{R}$ itself with zero density assigned to the impossible points).
$$\mathbb{E}[X] = \int_{-\infty}^{+\infty} x f_X(x)\, dx$$
• The expectation is both conceptually and technically equivalent to the location of the center of probability mass of the sample space.
  ◦ The expectation provides only partial information about the random variable, because it eliminates randomness by giving you back only one representative point of the sample space.

For example,
• $E \mid X$ is a set-valued random variable.
  ◦ Given $X = x$, it evaluates to the set $E \cap \{X = x\}$.
  ◦ We cannot define an expected value for $E \mid X$.
  ◦ Clarification: $E \mid X$ is not $\mathbb{P}(E \mid X)$. The latter is numerically valued, as we established earlier through its expected value: $\mathbb{E}[\mathbb{P}(E \mid X)] = \mathbb{P}(E)$.
• More elaboration: on the set-theory layer, $E \mid X$ is not strictly different from the set-r.v. pair $(E, X)$. But on the probability-theory layer, $\mathbb{P}(E \mid X)$ is normalized by a different space than $\mathbb{P}(E, X)$ is.
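
$\mathbb{E}[X \mid E]$
• $X \mid E$ is a numerically valued random variable. We can compute its expected value.
• $\mathbb{E}[X \mid E]$ vs $\mathbb{E}[X]$: their sample spaces are different.
  ◦ Compute $\mathbb{E}[X \mid E]$ using $\mathbb{P}_E = \mathbb{P}(\,\cdot \mid E) := \dfrac{\mathbb{P}(\,\cdot\,, E)}{\mathbb{P}(E)}$:
$$\mathbb{E}[X \mid E] = \int_\Omega X(\omega)\, \frac{\mathbb{P}(d\omega, E)}{\mathbb{P}(E)} = \int_{-\infty}^{+\infty} x f_{X \mid E}(x)\, dx$$
  ◦ Compute $\mathbb{E}[X]$ using $\mathbb{P}$:
$$\mathbb{E}[X] = \int_\Omega X(\omega)\, \mathbb{P}(d\omega) = \int_{-\infty}^{+\infty} x f_X(x)\, dx$$
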
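In the discrete case the two integrals become sums, and the only difference is which measure weights the outcomes. A minimal sketch, assuming a fair die with X the face value and E the event "the face is even" (both chosen only for illustration):

```python
from fractions import Fraction

# Assumed setup: one fair die; X is the face value itself.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    return w

# Conditioning event E: the face is even.
E = {2, 4, 6}
p_of_E = sum(P[w] for w in E)

# Conditional measure P_E(w) = P({w} and E) / P(E); zero outside E.
P_E = {w: (P[w] / p_of_E if w in E else Fraction(0)) for w in omega}

# Same random variable, two different measures.
EX = sum(X(w) * P[w] for w in omega)
EX_given_E = sum(X(w) * P_E[w] for w in omega)

print(EX, EX_given_E)
```

The mapping X is untouched; only the weights change from P to P_E, which moves the center of mass from 7/2 to 4.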
Warm-up exercise
• Handout Problem 2

$\mathbb{E}[Y \mid X]$: concepts
• First of all, this is a random variable: a function of $X$.
• Its randomness comes from the state space of $X$, but the mapping mechanism is worked out jointly by $X$ and $Y$.
• This expression is known as the conditional expectation of the conditionee $Y$ given the conditioner $X$.
• The expectation is taken with respect to $Y$.
  ◦ To be precise, we should say with respect to $Y \mid X$.
  ◦ There are multiple (or even a continuum of) sample spaces of $Y \mid X$, depending on which atom value $X$ takes. After fixing $X$ to an atom, or equivalently to a block in the state space partitioned by $X$, the expression $\mathbb{E}[Y \mid X = x_1]$ is just a constant.
  ◦ The expectation eliminates the randomness of $Y$ given $X$.

$\mathbb{E}[Y \mid X]$ as an r.v.
• It uses the joint state space of $X$ and $Y$ as its own state space.
• It uses a degenerate version of the sample space of $Y$ as its own sample space.
  ◦ The degeneration preserves the locus of the overall center of mass.
  ◦ Each point in the degenerate space is a block center of mass.

“Degeneration preserves overall center of mass”
• $X$ cuts its own state space as well as the joint state space of $X$ and $Y$.
• This partition of the joint state space is mapped by $Y$ to a partition of its own sample space (a set of numbers).
• Then the expression $\mathbb{E}[Y \mid X = x_1]$ represents the locus of the center of mass of the first block of the partition.
• $\mathbb{E}[Y \mid X]$ represents the totality of the loci of these block centers of mass.

Exercise: Finding $\mathbb{E}[Y]$ from $\mathbb{E}[Y \mid X]$
• This is the prototypical problem of finding the expectation of a random variable via the technique of conditioning on another random variable.
• Ans.
$$\mathbb{E}[Y] = \mathbb{E}\big[\mathbb{E}[Y \mid X]\big]$$
• In the divide-conquer-merge paradigm:
  ◦ Divide is done by the conditioner $X$.
  ◦ Conquer refers to the inner expectation carried out on each division.
  ◦ Merge refers to the outer expectation that pieces the whole back together. This exercise addresses the merge step.
• Compare this with the conditional probability case and ponder the link between them.

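The divide-conquer-merge steps can be traced on a small joint distribution. This is a sketch only: the joint pmf below is made up, and the variable names are chosen for the example.

```python
from fractions import Fraction as F

# Made-up joint pmf of (X, Y), stored as {(x, y): probability}.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10), (0, 2): F(1, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10), (1, 2): F(2, 10),
}
x_values = sorted({x for x, _ in joint})

# Divide: the conditioner X splits the joint space into blocks {X = x}.
p_x = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in x_values}

# Conquer: inner expectation E[Y | X = x] on each block.
e_y_given_x = {
    x: sum(y * p for (xx, y), p in joint.items() if xx == x) / p_x[x]
    for x in x_values
}

# Merge: outer expectation over X of the block centers of mass.
merged = sum(e_y_given_x[x] * p_x[x] for x in x_values)

# Direct computation of E[Y] for comparison.
e_y = sum(y * p for (_, y), p in joint.items())

print(e_y_given_x, merged, e_y)
```

The merged value equals the direct expectation, which is the "degeneration preserves the overall center of mass" statement in numbers.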
Conditional Variance
• Finding variance by conditioning:
$$\mathbb{V}[Y] = \mathbb{E}\big[\mathbb{V}[Y \mid X]\big] + \mathbb{V}\big[\mathbb{E}[Y \mid X]\big]$$
• Pf.
• Unfortunately, the degeneration of the sample space of $Y$ does not preserve second moments.
• That’s why there is the additional term $\mathbb{V}\big[\mathbb{E}[Y \mid X]\big]$ in the formula.

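One standard way to verify the variance formula is sketched below. It is not necessarily the proof given in the tutorial; it assumes $\mathbb{E}[Y^2] < \infty$ and uses only $\mathbb{E}[Y] = \mathbb{E}[\mathbb{E}[Y \mid X]]$ together with the definition $\mathbb{V}[Y \mid X] = \mathbb{E}[Y^2 \mid X] - (\mathbb{E}[Y \mid X])^2$.

```latex
\begin{align*}
\mathbb{V}[Y]
  &= \mathbb{E}[Y^2] - \big(\mathbb{E}[Y]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{E}[Y^2 \mid X]\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{V}[Y \mid X] + (\mathbb{E}[Y \mid X])^2\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \\
  &= \mathbb{E}\big[\mathbb{V}[Y \mid X]\big]
     + \Big\{ \mathbb{E}\big[(\mathbb{E}[Y \mid X])^2\big]
     - \big(\mathbb{E}\big[\mathbb{E}[Y \mid X]\big]\big)^2 \Big\} \\
  &= \mathbb{E}\big[\mathbb{V}[Y \mid X]\big]
     + \mathbb{V}\big[\mathbb{E}[Y \mid X]\big].
\end{align*}
```

The term in braces is the variance of the block centers of mass $\mathbb{E}[Y \mid X]$, the spread between blocks, while $\mathbb{E}[\mathbb{V}[Y \mid X]]$ is the average spread within blocks.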
Summary: Conditional Expectation
The key observations are
• Obs1: To find the center of mass of a piece of material, you can divide it into a few blocks, find their centers of mass, and then find the center of mass of these block centers of mass (each weighted by the mass of its block). The initial division of the piece is quite arbitrary.
  ◦ This fundamental law of physics supports many of the nice properties of expectation in the calculus of probability.
• Obs2: A random variable partitions its state space into a collection of atom-valued blocks.
  ◦ This suggests using a random variable as a general device to divide the piece mentioned in Obs1. Such a random variable is called the conditioner.

Linking $\mathbb{P}(E \mid X)$ to $\mathbb{E}[Y \mid X]$
• Trick: use the indicator of the set $E$. The indicator is a Bernoulli random variable.
• Reason: $\mathbb{P}(E \mid X) \equiv \mathbb{E}[I_E \mid X]$
• Conclusion: the conditional probability of an event conditioned on a random variable (a partition) is, in disguise, the conditional expectation of the indicator of that event conditioned on the same random variable.
• All properties of conditional expectation therefore apply to conditional probability. For example, the Law of Total Probability is just $\mathbb{E}[Y] \equiv \mathbb{E}\big[\mathbb{E}[Y \mid X]\big]$ in disguise.

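The indicator trick can be checked by computing the same quantity two ways. The sketch assumes two fair dice, X = the first die, and E = "the sum is at least 10"; these choices and the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed setup: two fair dice, 36 equally likely outcomes (first, second).
omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(omega))

def in_E(w):
    """Event E: the sum of the two dice is at least 10."""
    return w[0] + w[1] >= 10

def indicator(w):
    """Indicator I_E: a Bernoulli random variable on omega."""
    return 1 if in_E(w) else 0

def cond_expectation(g, x):
    """E[g | X = x], where X is the first die."""
    block = [w for w in omega if w[0] == x]
    return sum(g(w) * p for w in block) / (len(block) * p)

def cond_probability(x):
    """P(E | X = x) by direct counting inside the block {X = x}."""
    block = [w for w in omega if w[0] == x]
    return Fraction(sum(1 for w in block if in_E(w)), len(block))

via_indicator = {x: cond_expectation(indicator, x) for x in range(1, 7)}
via_counting = {x: cond_probability(x) for x in range(1, 7)}

# Law of Total Probability as an outer expectation over X.
p_E = sum(via_indicator[x] * Fraction(1, 6) for x in range(1, 7))

print(via_indicator == via_counting, p_E)
```

The two dictionaries coincide value by value, and averaging them over X recovers P(E) = 1/6.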
Choosing Conditioner
• The art of conditioning lies in the choice of the conditioner.
• Usually, if our unknown target is the r.v. $Y$, and we know that $Y$ is a known function of a known r.v. $X$, then it is natural to use $X$ as the conditioner for $Y$, that is:
  ◦ Divide the state space of $Y$ by $X$
  ◦ Conquer every $\mathbb{E}[Y \mid X = x]$
  ◦ Merge them into $\mathbb{E}[Y]$

Exercises
• Handout Problem 3
• Handout Problem 4
• Handout Problem 5
• Handout Problem 6