Fundamentals of Probability
Wooldridge, Appendix B
B.1 Random Variables and Their Probability Distributions

A random variable (rv) is one that takes on numerical values and whose outcome is determined by an experiment.

Examples of an rv:
- Flip a coin, and let X be the outcome: head (X = 1) or tail (X = 0). This X is a Bernoulli (binary) rv.
- Flip two coins, and let X be the number of heads.
Discrete Random Variable


Defn. (Discrete rv). A discrete rv is one that takes only a finite or countably infinite number of values, i.e., one with a space that is either finite or countable.

Defn. (Probability mass function). Let X be a discrete rv taking possible values $x_1, x_2, \dots, x_k$. The probability mass function (pmf) of X is given by

$$p_j = P(X = x_j), \quad j = 1, 2, \dots, k.$$
Discrete Random Variable: Properties

Properties of a pmf
- $0 \le p_j \le 1$ for all $j$
- $p_1 + p_2 + \dots + p_k = 1$

Example. Flip 2 coins, and let X be the number of heads. Then the pmf of X is

  x_j :  0    1    2
  p_j : 1/4  1/2  1/4
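
Since the number of heads in two fair flips is Binomial(2, 1/2), the table can be checked with a minimal Python sketch (scipy assumed available):

    # pmf of the number of heads in two fair coin flips, via Binomial(n=2, p=0.5);
    # values should match the table above: 1/4, 1/2, 1/4.
    from scipy.stats import binom

    for x in [0, 1, 2]:
        print(x, binom.pmf(x, n=2, p=0.5))                   # 0.25, 0.5, 0.25

    # Property check: the probabilities sum to 1.
    print(sum(binom.pmf(x, n=2, p=0.5) for x in [0, 1, 2]))  # 1.0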
Continuous Random Variable

Defn. (Continuous Random Variable). We say a random variable is a continuous random variable if its cumulative distribution function $F_X(x)$ is a continuous function for all $x \in \mathbb{R}$. For a continuous rv there are no points of discrete mass; that is, if X is continuous, then $P(X = x) = 0$ for all $x \in \mathbb{R}$.
Continuous Random Variable
For continuous rv's,

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

for some function $f_X(t)$. The function $f_X(x)$ is called a probability density function (pdf) of X. If $f_X(x)$ is also continuous, then the Fundamental Theorem of Calculus implies that

$$\frac{d}{dx} F_X(x) = f_X(x).$$
Continuous Random Variable

If X is a continuous rv, then probabilities can be obtained by integration, i.e.,

$$P(a < X \le b) = F_X(b) - F_X(a) = \int_a^b f_X(t)\,dt.$$
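
As an illustration of this identity, a small Python sketch (my choice of the standard normal as the continuous rv is only for concreteness) comparing the cdf difference with direct numerical integration of the pdf:

    # P(a < X <= b) computed two ways for X ~ N(0, 1):
    # (i) F_X(b) - F_X(a), (ii) integrating the pdf from a to b.
    from scipy.stats import norm
    from scipy.integrate import quad

    a, b = -1.0, 2.0
    via_cdf = norm.cdf(b) - norm.cdf(a)
    via_integral, _ = quad(norm.pdf, a, b)
    print(via_cdf, via_integral)   # both ~ 0.8186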
Expectations of a Random Variable
Defn. (Expectation). If X is a continuous rv with pdf $f(x)$ and

$$\int_{-\infty}^{\infty} |x|\, f(x)\,dx < \infty,$$

then the expectation (expected value) of X is

$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx.$$

If X is a discrete rv with pmf $p(x)$ and

$$\sum_x |x|\, p(x) < \infty,$$

then the expectation (expected value) of X is

$$E(X) = \sum_x x\, p(x).$$
Expectations: Properties
For any constant c,
$$E(c) = c.$$
For any constants a and b,
$$E(aX + b) = aE(X) + b.$$
Given constants $a_1, a_2, \dots, a_n$ and random variables $X_1, X_2, \dots, X_n$,

$$E\!\left(\sum_{i=1}^n a_i X_i\right) = E(a_1 X_1 + a_2 X_2 + \dots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_n E(X_n) = \sum_{i=1}^n a_i E(X_i).$$
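
A Monte Carlo sketch of the linearity property (the distributions and constants are illustrative choices, not from the slides; sample means stand in for expectations):

    # Check E(a1*X1 + a2*X2) = a1*E(X1) + a2*E(X2) by simulation.
    import numpy as np

    rng = np.random.default_rng(0)
    X1 = rng.normal(3.0, 1.0, size=1_000_000)    # E(X1) = 3
    X2 = rng.exponential(2.0, size=1_000_000)    # E(X2) = 2
    a1, a2 = 2.0, -5.0

    print((a1 * X1 + a2 * X2).mean())            # ~ 2*3 + (-5)*2 = -4
    print(a1 * X1.mean() + a2 * X2.mean())       # same value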
Special Expectation: Variance and Standard Deviation (Measures of Variability)

The variance of a random variable (discrete or continuous), denoted $\mathrm{Var}(X)$ or $\sigma^2$, is defined as

$$\mathrm{Var}(X) \equiv \sigma^2 = E[(X - \mu)^2],$$

which is equivalent to

$$\sigma^2 = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + E(\mu^2) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2$$
$$\Leftrightarrow E(X^2) = \sigma^2 + \mu^2$$

(used in the proof of unbiasedness of the sample variance $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$; see Simple_OLS_inference.pdf, Appendix B, Lemma 6).
Variance & Standard Deviation:
Examples
Consider Bernoulli distribution X ~ Bernoulli(), with pmf
P ( x) =  x (1 - )1- x , x = 0,1
Then,
E( X ) =  º
å
x p ( x)= (0) 0 (1 - )1- 0 + (1) 1 (1 - )1- 1 = 
x
E( X 2 ) = E( X ) = 
\
 2 = E ( X 2 ) -  2 =  - 2 = (1 - )
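
These Bernoulli moments can be confirmed with a minimal sketch (θ = 0.3 is an arbitrary illustrative value):

    # For X ~ Bernoulli(theta): E(X) = theta and Var(X) = theta*(1 - theta).
    from scipy.stats import bernoulli

    theta = 0.3
    print(bernoulli.mean(theta))   # 0.3
    print(bernoulli.var(theta))    # 0.3 * 0.7 = 0.21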
Variance: Properties
Variance of a constant c:
$$\mathrm{Var}(c) = 0$$
For any constants a and b,
$$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$$
Variance: Properties
Variance of a linear combination:

$$\mathrm{Var}\!\left(\sum_{i=1}^n a_i X_i\right) = \mathrm{Var}(a_1 X_1 + a_2 X_2 + \dots + a_n X_n)$$
$$= a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) + \dots + a_n^2\,\mathrm{Var}(X_n) + \sum_{i>j} 2 a_i a_j\,\mathrm{Cov}(X_i, X_j)$$
$$= \sum_{i=1}^n a_i^2\,\mathrm{Var}(X_i) \quad \text{if the } X_i\text{'s are uncorrelated, i.e., if } \mathrm{Cov}(X_i, X_j) = 0 \text{ for all } i \ne j.$$
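
A simulation sketch of the two-variable case, with correlated X1 and X2 (the covariance matrix and constants are illustrative assumptions):

    # Check Var(a1*X1 + a2*X2)
    #   = a1^2*Var(X1) + a2^2*Var(X2) + 2*a1*a2*Cov(X1, X2).
    import numpy as np

    rng = np.random.default_rng(1)
    cov = [[2.0, 0.8],
           [0.8, 1.0]]          # Var(X1) = 2, Var(X2) = 1, Cov(X1, X2) = 0.8
    X1, X2 = rng.multivariate_normal([0.0, 0.0], cov, size=1_000_000).T
    a1, a2 = 3.0, -2.0

    lhs = np.var(a1 * X1 + a2 * X2)
    rhs = a1**2 * np.var(X1) + a2**2 * np.var(X2) \
          + 2 * a1 * a2 * np.cov(X1, X2)[0, 1]
    print(lhs, rhs)             # both ~ 9*2 + 4*1 + 2*3*(-2)*0.8 = 12.4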
Standard Deviation

The standard deviation of a random variable is the square root of its variance:

$$\sigma = \sqrt{\sigma^2}.$$
Standardizing a Random Variable
Property. If $X \sim (\mu, \sigma^2)$, then the standardized random variable $Z \equiv (X - \mu)/\sigma \sim (0, 1)$.

Proof.

$$E(Z) = E\!\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma}\,E(X - \mu) = \frac{1}{\sigma}\,[E(X) - \mu] = \frac{1}{\sigma}\,(\mu - \mu) = 0$$

$$\mathrm{Var}(Z) = \mathrm{Var}\!\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma^2}\,\mathrm{Var}(X - \mu) = \frac{1}{\sigma^2}\,\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1$$
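The claim can also be checked by simulation; a sketch (the exponential distribution and its parameter are my illustrative choices, and sample moments stand in for μ and σ):

    # Standardize X via Z = (X - mu)/sigma and check E(Z) ~ 0, Var(Z) ~ 1.
    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.exponential(scale=4.0, size=1_000_000)  # mu = 4, sigma = 4
    Z = (X - X.mean()) / X.std()                    # sample mean/sd as stand-ins

    print(Z.mean(), Z.var())                        # ~ 0.0 and ~ 1.0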
B.4 Joint and Conditional Distributions
Covariance
$$\mathrm{Cov}(X, Y) \equiv E[(X - \mu_X)(Y - \mu_Y)] = E[(X - \mu_X)\,Y] = E[X\,(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$$

Properties
1. If X and Y are independent, then $\mathrm{Cov}(X, Y) = 0$.
2. For any constants $a_1, b_1, a_2$, and $b_2$,
$$\mathrm{Cov}(a_1 X + b_1,\, a_2 Y + b_2) = a_1 a_2\,\mathrm{Cov}(X, Y)$$
B.4 Joint and Conditional Distributions
3. Cauchy-Schwarz inequality. For any rv's X and Y:

3.1 $[E(XY)]^2 \le E(X^2)\,E(Y^2)$

3.2 $\{E[(X - E(X))(Y - E(Y))]\}^2 \le E[(X - E(X))^2]\,E[(Y - E(Y))^2]$

3.3 $|\mathrm{Cov}(X, Y)| \le \mathrm{sd}(X)\,\mathrm{sd}(Y)$

[Note that 3.3 is equivalent to 3.2, which is just 3.1 applied to the demeaned variables; why?]
B.4 Joint and Conditional Distributions
Correlation

$$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$

Properties (correlation)
1. $-1 \le \mathrm{Corr}(X, Y) \le 1$ (follows from the Cauchy-Schwarz inequality; why?)
2. For constants $a_1, b_1, a_2$, and $b_2$,
$$\mathrm{Corr}(a_1 X + b_1,\, a_2 Y + b_2) = \mathrm{Corr}(X, Y) \text{ if } a_1 a_2 > 0$$
$$\mathrm{Corr}(a_1 X + b_1,\, a_2 Y + b_2) = -\mathrm{Corr}(X, Y) \text{ if } a_1 a_2 < 0$$

Properties (variance)
3. For constants a and b,
$$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$$
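
A numerical sketch of correlation property 2 (the data-generating process is an illustrative assumption):

    # Corr(a1*X + b1, a2*Y + b2) = +/- Corr(X, Y) depending on sign(a1*a2).
    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=1_000_000)
    Y = 0.5 * X + rng.normal(size=1_000_000)   # positively correlated with X

    corr = lambda u, v: np.corrcoef(u, v)[0, 1]
    print(corr(X, Y))                # ~ 0.447
    print(corr(2*X + 1, 3*Y - 4))    # unchanged: a1*a2 = 6 > 0
    print(corr(-2*X + 1, 3*Y - 4))   # sign flips: a1*a2 = -6 < 0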
B.5 The Normal and Related Distributions (Chi-square, F, t)

Defn. (Normal (Gaussian) Distribution). The pdf of the normal random variable $X \sim N(\mu, \sigma^2)$ is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty,$$

where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$.

Defn. (Standard Normal Distribution). The standard normal distribution is a special case of the normal distribution with $\mu = 0$ and $\sigma = 1$. The pdf of the standard normal random variable, denoted $Z \sim N(0, 1)$, is

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}, \quad -\infty < z < \infty.$$
B.5 The Normal and Related Distributions (Chi-square, F, t)

Draw the graphs to show:

$$P(Z > z) = 1 - \Phi(z)$$
$$P(Z < -z) = P(Z > z)$$
$$P(a \le Z \le b) = \Phi(b) - \Phi(a)$$
$$P(|Z| > c) = P(Z > c \text{ or } Z < -c) = P(Z > c) + P(Z < -c) = 2P(Z > c) = 2[1 - \Phi(c)]$$

Using Excel's function:
$$P(Z \le 1.96) = \mathrm{NORMSDIST}(1.96) = 0.975$$
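
The same computations in Python: scipy's norm.cdf plays the role of Excel's NORMSDIST (a minimal sketch, scipy assumed available):

    # Standard normal cdf values, mirroring the Excel call above.
    from scipy.stats import norm

    print(norm.cdf(1.96))              # ~ 0.975 = P(Z <= 1.96)
    print(1 - norm.cdf(1.96))          # P(Z > 1.96) ~ 0.025
    print(2 * (1 - norm.cdf(1.96)))    # P(|Z| > 1.96) ~ 0.05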
Normal Distribution: Properties
Property (standardizing the normal random variable).
If $X \sim N(\mu, \sigma^2)$, then
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$
Exercises
Let $X \sim N(18, 4)$. Find
1. $P(X \le 16)$
2. $P(X \le 20)$
3. $P(X \ge 16)$
4. $P(X \ge 20)$
5. $P(16 \le X \le 20)$
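
A sketch for checking the answers numerically, assuming (per the $N(\mu, \sigma^2)$ notation above) that N(18, 4) means μ = 18 and σ² = 4, so sd = 2:

    # Exercise checks for X ~ N(18, 4); scipy's norm takes (x, mu, sd).
    from scipy.stats import norm

    mu, sd = 18, 2
    print(norm.cdf(16, mu, sd))                         # 1. P(X <= 16) ~ 0.1587
    print(norm.cdf(20, mu, sd))                         # 2. P(X <= 20) ~ 0.8413
    print(1 - norm.cdf(16, mu, sd))                     # 3. P(X >= 16) ~ 0.8413
    print(1 - norm.cdf(20, mu, sd))                     # 4. P(X >= 20) ~ 0.1587
    print(norm.cdf(20, mu, sd) - norm.cdf(16, mu, sd))  # 5. ~ 0.6827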
Standard Normal Table [table not reproduced in this transcript]
Normal Distribution: Additional Properties

1. If $X \sim N(\mu, \sigma^2)$, then $aX + b \sim N(a\mu + b,\, a^2\sigma^2)$.
2. If X and Y are jointly normally distributed, then they are independent if, and only if, $\mathrm{Cov}(X, Y) = 0$.
3. If $Y_1, Y_2, \dots, Y_n$ are independent random variables, each distributed as $N(\mu, \sigma^2)$ (i.e., an independent random sample of size n from $N(\mu, \sigma^2)$), then the sample average satisfies
$$\bar{Y} \sim N(\mu, \sigma^2/n).$$
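
A simulation sketch of property 3 (μ, σ, and n are illustrative values):

    # The mean of n iid N(mu, sigma^2) draws should be N(mu, sigma^2/n).
    import numpy as np

    rng = np.random.default_rng(4)
    mu, sigma, n = 5.0, 3.0, 25
    Ybar = rng.normal(mu, sigma, size=(200_000, n)).mean(axis=1)

    print(Ybar.mean())   # ~ mu = 5
    print(Ybar.var())    # ~ sigma^2 / n = 9/25 = 0.36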
The Chi-Square Distribution
Defn. (Chi-square Statistic/Distribution). Let $Z_i,\ i = 1, 2, \dots, n$ be independent random variables, each distributed as the standard normal, that is, $Z_i \sim N(0, 1)$. Then

$$X = \sum_{i=1}^n Z_i^2 \sim \chi_n^2$$

is chi-square distributed with n degrees of freedom.

Moments of the chi-square distribution:
$$E(X) = n, \quad \mathrm{Var}(X) = 2n.$$
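
A quick simulation sketch of the definition and the moments (n = 7 is illustrative):

    # Sum of n squared independent N(0,1) draws is chi-square with n df;
    # check E(X) = n and Var(X) = 2n.
    import numpy as np

    rng = np.random.default_rng(5)
    n = 7
    X = (rng.normal(size=(200_000, n)) ** 2).sum(axis=1)

    print(X.mean(), X.var())   # ~ 7 and ~ 14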
The Student’s t Distribution
Defn. (Student's t Statistic/Distribution). Let Z have a standard normal distribution and X a chi-square distribution with n degrees of freedom; that is,

$$Z \sim N(0, 1), \quad X \sim \chi_n^2.$$

Further, assume Z and X are independent. Then the random variable

$$t = \frac{Z}{\sqrt{X/n}} \sim t_n$$

is distributed as Student's t with n degrees of freedom.

Moments of the t distribution:
$$E(t) = 0 \text{ for } n > 1, \quad \mathrm{Var}(t) = \frac{n}{n - 2} \text{ for } n > 2.$$
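
A sketch building the t statistic from its definition and checking the moments (n = 10 is illustrative):

    # t_n = Z / sqrt(X/n) with Z ~ N(0,1) and X ~ chi-square(n), independent.
    import numpy as np

    rng = np.random.default_rng(6)
    n = 10
    Z = rng.normal(size=500_000)
    X = rng.chisquare(n, size=500_000)
    t = Z / np.sqrt(X / n)

    print(t.mean())   # ~ 0, since n > 1
    print(t.var())    # ~ n/(n - 2) = 1.25, since n > 2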
The F Distribution
Defn. (F Statistic/Distribution). Let

$$X_1 \sim \chi_{k_1}^2, \quad X_2 \sim \chi_{k_2}^2,$$

and assume that $X_1$ and $X_2$ are independent. Then the random variable

$$F = \frac{X_1/k_1}{X_2/k_2} \sim F_{k_1, k_2}$$

is F-distributed with $k_1$ and $k_2$ degrees of freedom.
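
A sketch constructing the F statistic from independent chi-squares and comparing with scipy's F distribution (k1 = 5, k2 = 12 are illustrative):

    # F = (X1/k1)/(X2/k2) with independent chi-square numerator/denominator.
    import numpy as np
    from scipy.stats import f

    rng = np.random.default_rng(7)
    k1, k2 = 5, 12
    X1 = rng.chisquare(k1, size=500_000)
    X2 = rng.chisquare(k2, size=500_000)
    F = (X1 / k1) / (X2 / k2)

    print(F.mean(), f.mean(k1, k2))   # both ~ k2/(k2 - 2) = 1.2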