Probability & Statistics
Professor Wei Zhu
July 22nd
(1) Let's have some fun: A Fair Game or Not?
A gambler goes to bet. The dealer has 3 dice, which are fair, meaning that the
chance that each face shows up is exactly 1/6.
The dealer says: "You can choose your bet on a number, any number from 1 to
6. Then I'll roll the 3 dice. If none show the number you bet, you'll lose $1. If
one shows the number you bet, you'll win $1. If two or three dice show the
number you bet, you'll win $3 or $5, respectively."
Is this a fair game? (*A fair game is one in which the expected, or equivalently the
long-term average, winning is zero.)
Solution: Let X be the number of dice that show the number you bet on. Then X ~ Binomial(3, 1/6),
because the dealer rolls the 3 dice independently and the probability that each die shows the
number you bet on is 1/6. That is,

P(X = x) = C(3, x) · (1/6)^x · (5/6)^(3-x),   x = 0, 1, 2, 3

Let Y be the amount of money you get. Then

P(Y = -1) = P(X = 0) = C(3, 0) · (1/6)^0 · (5/6)³ = (5/6)³ = 125/216
P(Y = 1) = P(X = 1) = C(3, 1) · (1/6)^1 · (5/6)² = 3 · (1/6) · (5/6)² = 75/216
P(Y = 3) = P(X = 2) = C(3, 2) · (1/6)² · (5/6)^1 = 3 · (1/6)² · (5/6) = 15/216
P(Y = 5) = P(X = 3) = C(3, 3) · (1/6)³ · (5/6)^0 = (1/6)³ = 1/216

Thus, your expected winning is

E(Y) = Σ_y y · P(Y = y) = (-1) · 125/216 + 1 · 75/216 + 3 · 15/216 + 5 · 1/216 = 0
Therefore, it is a fair game.
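As a quick cross-check (not part of the original notes; it assumes numpy and scipy are available), here is a minimal Python sketch that recomputes E(Y) exactly from the Binomial(3, 1/6) pmf and also approximates it by simulating one million games:

import numpy as np
from scipy.stats import binom

payoff = np.array([-1, 1, 3, 5])                 # winnings when 0, 1, 2, or 3 dice match your number
X = binom(n=3, p=1/6)                            # X = number of matching dice

exact = sum(payoff[x] * X.pmf(x) for x in range(4))
print("Exact expected winning:", exact)          # approximately 0

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=(1_000_000, 3))  # simulate one million games, always betting on "6"
matches = (rolls == 6).sum(axis=1)
print("Simulated average winning:", payoff[matches].mean())   # close to 0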
(2) *Review: The derivative and integral of some important functions one
should remember.
๐‘‘ ๐‘˜
(๐‘ฅ ) = ๐‘˜๐‘ฅ ๐‘˜โˆ’1
๐‘‘๐‘ฅ
๐‘‘ ๐‘ฅ
(๐‘’ ) = ๐‘’ ๐‘ฅ
๐‘‘๐‘ฅ
๐‘‘
1
(ln ๐‘ฅ) =
๐‘‘๐‘ฅ
๐‘ฅ
๐‘ฅ=๐‘
๐‘
1
1
โˆซ ๐‘ฅ ๐‘‘๐‘ฅ = (
๐‘ฅ ๐‘˜+1 )|
=
(๐‘ ๐‘˜+1 โˆ’ ๐‘Ž๐‘˜+1 )
๐‘˜
+
1
๐‘˜
+
1
๐‘Ž
๐‘ฅ=๐‘Ž
๐‘˜
๐‘ฅ=๐‘
๐‘
๐‘ฅ
๐‘ฅ
= ๐‘’๐‘ โˆ’ ๐‘’๐‘Ž
โˆซ ๐‘’ ๐‘‘๐‘ฅ = ๐‘’ |
๐‘Ž
๐‘ฅ=๐‘Ž
*The Chain Rule
๐‘‘
๐‘”[๐‘“(๐‘ฅ)] = ๐‘”โ€ฒ[๐‘“(๐‘ฅ)] โˆ™ ๐‘“โ€ฒ(๐‘ฅ)
๐‘‘๐‘ฅ
๐‘‘
2
2
For example: ๐‘‘๐‘ฅ (๐‘’ ๐‘ฅ ) = ๐‘’ ๐‘ฅ โˆ™ 2๐‘ฅ
*The Product Rule
๐‘‘
[๐‘”(๐‘ฅ) โˆ™ ๐‘“(๐‘ฅ)] = ๐‘”โ€ฒ (๐‘ฅ)๐‘“(๐‘ฅ) + ๐‘”(๐‘ฅ)๐‘“โ€ฒ(๐‘ฅ)
๐‘‘๐‘ฅ
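For instance (a quick worked instance, not in the original notes): d/dx (x · e^x) = 1 · e^x + x · e^x = (1 + x)·e^x.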
*To Find the Maximum/Minimum of a (continuous & differentiable) Function
e.g. g(x) = (x - 1)², x ∈ [0, 2]
We first set the first derivative to zero and solve, then compare the value of g at each
solution with its values at the boundary points; this yields the minimum and the maximum.
We see that
d/dx g(x) = 2(x - 1) = 0  →  x = 1.
Since g(1) = 0 while g(0) = g(2) = 1, the minimum is attained at x = 1 and the maximum is
achieved at the two boundary points.
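A minimal Python sketch of this search (not part of the original notes): it simply evaluates g at the interior critical point found above and at the two boundary points, then picks the smallest and largest values.

def g(x):
    return (x - 1) ** 2

candidates = [0.0, 1.0, 2.0]               # the two boundary points and the root of g'(x) = 2(x - 1)
values = {x: g(x) for x in candidates}
x_min = min(values, key=values.get)
x_max = max(values, key=values.get)
print("minimum at x =", x_min, "value", values[x_min])   # x = 1.0, value 0.0
print("maximum at x =", x_max, "value", values[x_max])   # x = 0.0, value 1.0 (x = 2.0 ties)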
(3)
Normal Distribution
Q. Who invented the normal distribution?
* Left: Abraham de Moivre (26 May 1667 in Vitry-le-François, Champagne,
France – 27 November 1754 in London, England) *From Wikipedia
* Right: Johann Carl Friedrich Gauss (30 April 1777 – 23 February 1855)
<i> Probability Density Function (p.d.f.)
X ~ N(μ, σ²): X follows the normal distribution with mean μ and variance σ².

f(x) = (1/(√(2π) σ)) · e^(-(x-μ)²/(2σ²)),   -∞ < x < ∞, x ∈ R
P(a ≤ X ≤ b) = ∫_a^b f(x) dx = the area under the pdf curve between a and b
<ii> Cumulative Distribution Function (c.d.f.)
F(x) = P(X ≤ x) = ∫_-∞^x f(t) dt

f(x) = [F(x)]′ = d/dx F(x)
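A minimal Python sketch (not in the original notes; it assumes scipy is available) illustrating both relations for a normal random variable: P(a ≤ X ≤ b) = F(b) - F(a), and the pdf as the derivative of the cdf.

from scipy.stats import norm

X = norm(loc=0, scale=1)                             # X ~ N(0, 1)
a, b = -1.0, 1.0
print("P(a <= X <= b):", X.cdf(b) - X.cdf(a))        # about 0.6827

# The pdf is the derivative of the cdf: compare f(x) with a numerical derivative of F(x).
x, h = 0.5, 1e-6
print("f(x):        ", X.pdf(x))
print("dF/dx approx:", (X.cdf(x + h) - X.cdf(x - h)) / (2 * h))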
(4). Mathematical Expectation (Review).
Continuous random variable:   E[g(X)] = ∫_-∞^∞ g(x)·f(x) dx
Discrete random variable:   E[g(X)] = Σ_(all x) g(x)·P(X = x)
Properties of Expectations:
(1) E(c) = c, where c is a constant
(2) E[c*g(X)] = c*E[g(X)], where c is a constant
(3) E[g(X)+h(Y)] = E[g(X)]+E[h(Y)], for any X & Y
(4) E[g(X)*h(Y)] = E[g(X)]*E[h(Y)], if X & Y are independent – otherwise it is usually
not true.
Special cases:
1) (population) Mean: μ = E(X) = ∫_-∞^∞ x · f(x) dx
Note: E(aX + b) = aE(X) + b, where a & b are constants
2) (population) Variance:
Var(X) = σ² = E[(X - μ)²] = ∫_-∞^∞ (x - μ)² · f(x) dx = E(X²) - [E(X)]²
Note: Var(aX + b) = a²Var(X), where a & b are constants
3) Moment generating function:
M_X(t) = E(e^(tX)) = ∫_-∞^∞ e^(tx) f(x) dx, when X is continuous.
M_X(t) = E(e^(tX)) = Σ_(all possible values of x) e^(tx) f(x), when X is discrete.
For the normal distribution, X ~ N(μ, σ²) with f(x) = (1/(√(2π) σ)) · e^(-(x-μ)²/(2σ²)), -∞ < x < ∞, we have

M_X(t) = ∫_-∞^∞ e^(tx) f(x) dx = e^(μt + σ²t²/2)
4) Moment:
1st (population) moment: E(X) = ∫ x · f(x) dx
2nd (population) moment: E(X²) = ∫ x² · f(x) dx
…
kth (population) moment: E(X^k) = ∫ x^k · f(x) dx
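As a numerical illustration (not part of the original notes; it assumes numpy and scipy are available, and the choice μ = 2, σ = 3 is arbitrary), the first two moments of a normal distribution can be computed by integrating x·f(x) and x²·f(x), and they recover Var(X) = E(X²) - [E(X)]²:

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu, sigma = 2.0, 3.0
f = norm(loc=mu, scale=sigma).pdf                        # pdf of N(2, 3^2)

m1, _ = quad(lambda x: x * f(x), -np.inf, np.inf)        # 1st moment, E(X)
m2, _ = quad(lambda x: x**2 * f(x), -np.inf, np.inf)     # 2nd moment, E(X^2)

print("E(X)   =", round(m1, 4))                          # 2.0
print("E(X^2) =", round(m2, 4))                          # 13.0 = sigma^2 + mu^2
print("Var(X) =", round(m2 - m1**2, 4))                  # 9.0 = sigma^2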
Theorem. If X1, X2 are independent, then
M_{X1+X2}(t) = M_{X1}(t) · M_{X2}(t)
Theorem. Under regularity conditions, there is 1-1 correspondence between
the pdf and the mgf of a given random variable X. That is,
pdf f(x) ↔ mgf M_X(t)  (1-1).
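A quick worked instance of these two theorems (not in the original notes): if X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²) are independent, then using the normal mgf given above,

M_{X1+X2}(t) = M_{X1}(t) · M_{X2}(t) = e^(μ1·t + σ1²t²/2) · e^(μ2·t + σ2²t²/2) = e^((μ1+μ2)t + (σ1²+σ2²)t²/2),

which is the mgf of N(μ1+μ2, σ1²+σ2²). By the 1-1 correspondence between pdfs and mgfs, X1 + X2 ~ N(μ1+μ2, σ1²+σ2²). (This fact is used again in the second Sgt. Jones example below, where the average of three i.i.d. normals is itself normal.)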
e.g. Sgt. Jones wishes to select one army recruit into his unit. It is known that the IQ
distribution for all recruits is normal with mean 180 and standard deviation 10.
What is the chance that Sgt. Jones would select a recruit with an IQ of at least 200?
Sol) Let X represent the IQ of a randomly selected recruit. Then
X ~ N(μ = 180, σ = 10)
P(X ≥ 200) = P((X - 180)/10 ≥ (200 - 180)/10) = P(Z ≥ 2) ≈ 2.28%
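The same tail probability in a short Python sketch (not in the original notes; norm.sf is scipy's survival function, 1 - F(x)):

from scipy.stats import norm

print(norm.sf(200, loc=180, scale=10))    # P(X >= 200) = P(Z >= 2), about 0.0228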
e.g. Sgt. Jones wishes to select three army recruits into his unit. It is known that the
IQ distribution for all recruits is normal with mean 180 and standard deviation 10.
What is the chance that Sgt. Jones would select three recruits with an average IQ of
at least 200?
Sol) Let X1, X2, X3 represent the IQs of three randomly selected recruits. Then
Xi ~ N(μ = 180, σ² = 10²), i.i.d., i = 1, 2, 3
Since the sample mean of n = 3 i.i.d. normal variables is again normal with mean μ and standard deviation σ/√n,
X̄ ~ N(μ* = 180, σ* = 10/√3)
P(X̄ ≥ 200) = P((X̄ - 180)/(10/√3) ≥ (200 - 180)/(10/√3)) = P(Z ≥ 2√3) ≈ 0.03%
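And the corresponding check for the three-recruit average (not in the original notes); only the standard deviation changes, from σ = 10 to σ/√3:

from math import sqrt
from scipy.stats import norm

print(norm.sf(200, loc=180, scale=10 / sqrt(3)))   # about 0.00027, i.e. roughly 0.03%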
*(5). Joint distribution and independence
Definition. The joint cdf of two random variables X and Y is defined as:
F_{X,Y}(x, y) = F(x, y) = P(X ≤ x, Y ≤ y)
Definition. The joint pdf of two discrete random variables X and Y is defined as:
f_{X,Y}(x, y) = f(x, y) = P(X = x, Y = y)
Definition. The joint pdf of two continuous random variables X and Y is defined as:
f_{X,Y}(x, y) = f(x, y) = ∂²F(x, y)/∂x∂y
Definition. The marginal pdf of the discrete random variable X or Y can be obtained
by summing the joint pdf as follows:
f_X(x) = Σ_y f(x, y);   f_Y(y) = Σ_x f(x, y)
Definition. The marginal pdf of the continuous random variable X or Y can be
obtained by integrating the joint pdf as follows:
f_X(x) = ∫_-∞^∞ f(x, y) dy;   f_Y(y) = ∫_-∞^∞ f(x, y) dx
Definition. The conditional pdf of a random variable X or Y is defined as:
๐‘“(๐‘ฅ|๐‘ฆ) =
๐‘“(๐‘ฅ, ๐‘ฆ)
๐‘“(๐‘ฅ, ๐‘ฆ)
; ๐‘“(๐‘ฆ|๐‘ฅ) =
๐‘“(๐‘ฆ)
๐‘“(๐‘ฅ)
Definition. The joint moment generating function of two random variables X and Y
is defined as
M_{X,Y}(t1, t2) = E(e^(t1·X + t2·Y))
Note that we can obtain the marginal mgf for X or Y as follows:
M_X(t1) = M_{X,Y}(t1, 0) = E(e^(t1·X + 0·Y)) = E(e^(t1·X));   M_Y(t2) = M_{X,Y}(0, t2) = E(e^(0·X + t2·Y)) = E(e^(t2·Y))
Theorem. Two random variables X and Y are independent โ‡” (if and only if)
๐น๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ) = ๐น๐‘‹ (๐‘ฅ)๐น๐‘Œ (๐‘ฆ) โ‡” ๐‘“๐‘‹,๐‘Œ (๐‘ฅ, ๐‘ฆ) = ๐‘“๐‘‹ (๐‘ฅ)๐‘“๐‘Œ (๐‘ฆ) โ‡” ๐‘€๐‘‹,๐‘Œ (๐‘ก1 , ๐‘ก2 ) = ๐‘€๐‘‹ (๐‘ก1 ) ๐‘€๐‘Œ (๐‘ก2 )
Definition. The covariance of two random variables X and Y is defined as
๐ถ๐‘‚๐‘‰(๐‘‹, ๐‘Œ) = ๐ธ[(๐‘‹ โˆ’ ๐œ‡๐‘‹ )(๐‘Œ โˆ’ ๐œ‡๐‘Œ )].
Theorem. If two random variables X and Y are independent, then we have
๐ถ๐‘‚๐‘‰(๐‘‹, ๐‘Œ) = 0. (*Note: However, ๐ถ๐‘‚๐‘‰(๐‘‹, ๐‘Œ) = 0 does not necessarily mean that X
and Y are independent.)
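To make the discrete definitions above concrete, here is a minimal Python sketch (not part of the original notes; the 2x2 joint pmf is a made-up example) that computes the marginals, checks the factorization criterion for independence, and evaluates COV(X, Y):

import numpy as np

# Hypothetical joint pmf f(x, y) for X in {0, 1} (rows) and Y in {0, 1} (columns).
joint = np.array([[0.10, 0.30],
                  [0.20, 0.40]])
x_vals = np.array([0, 1])
y_vals = np.array([0, 1])

f_X = joint.sum(axis=1)                                  # marginal pmf of X (sum over y)
f_Y = joint.sum(axis=0)                                  # marginal pmf of Y (sum over x)

# Independence check: does f(x, y) = f_X(x) * f_Y(y) hold for every cell?
print("independent:", np.allclose(joint, np.outer(f_X, f_Y)))      # False for this table

E_X = (x_vals * f_X).sum()
E_Y = (y_vals * f_Y).sum()
E_XY = sum(x * y * joint[i, j]
           for i, x in enumerate(x_vals)
           for j, y in enumerate(y_vals))
print("COV(X, Y) =", E_XY - E_X * E_Y)   # about -0.02: nonzero covariance, so X and Y cannot be independent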
Homework #2, Due Friday, July 25, before our lecture
๐๐Ÿ. ๐ฟ๐‘’๐‘ก ๐‘‹1 & ๐‘‹2 ๐‘๐‘’ ๐‘–๐‘›๐‘‘๐‘’๐‘๐‘’๐‘›๐‘‘๐‘’๐‘›๐‘ก ๐‘(๐œ‡, ๐œŽ 2 ). Prove that ๐‘‹1 + ๐‘‹2 and ๐‘‹1 โˆ’ ๐‘‹2 are
independent.
Q2. Let Z be N(0, 1). (1) Please derive the covariance between Z and Z²; (2) Are
Z and Z² independent?
Q3. Let X ~ N(3, 4). Please calculate P(1 < X < 3).
Q4. Jack & Jill plan to meet at the gate of the famous Los Angeles Chinese Theater
(LACT) between 12 noon and 1 pm tomorrow. The one who arrives first will wait for
the other for 30 minutes, and then leave. What is the chance that the two friends will
be able to meet at LACT during their appointed time period, assuming that each one
will arrive at LACT independently, each at a random time between 12 noon and 1 pm?