Chapter 5
Joint Probability Distributions and Random Samples
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.
5.1 Jointly Distributed Random Variables
Joint Probability Mass Function
Let X and Y be two discrete rv’s defined on the sample space of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by

$$p(x, y) = P(X = x \text{ and } Y = y)$$

Let A be any set consisting of pairs of (x, y) values. Then

$$P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y)$$
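As a concrete illustration (not from the slides), the sketch below stores a small hypothetical joint pmf as a Python dict and computes P[(X, Y) ∈ A] by summing p(x, y) over the pairs in A.

```python
# A hypothetical joint pmf p(x, y), stored as {(x, y): probability}.
p = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# Event A: all (x, y) pairs with x + y == 1.
A = {pair for pair in p if sum(pair) == 1}

# P[(X, Y) in A] is the sum of p(x, y) over the pairs in A.
prob_A = sum(p[pair] for pair in A)
print(prob_A)  # 0.20 + 0.30 = 0.50
```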
Marginal Probability Mass Functions
The marginal probability mass functions of X and Y, denoted pX(x) and pY(y), are given by

$$p_X(x) = \sum_y p(x, y) \qquad p_Y(y) = \sum_x p(x, y)$$
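Continuing the illustration above (same made-up table), the marginals fall out by summing over the other variable:

```python
# Marginal pmf's from a hypothetical joint pmf: sum out the other variable.
p = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

p_X, p_Y = {}, {}
for (x, y), prob in p.items():
    p_X[x] = p_X.get(x, 0.0) + prob  # p_X(x) = sum over y of p(x, y)
    p_Y[y] = p_Y.get(y, 0.0) + prob  # p_Y(y) = sum over x of p(x, y)

print(p_X)  # ~{0: 0.3, 1: 0.7}
print(p_Y)  # ~{0: 0.4, 1: 0.6}
```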
Joint Probability Density Function
Let X and Y be continuous rv’s. Then f(x, y) is a joint probability density function for X and Y if for any two-dimensional set A,

$$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$$

In particular, if A is the two-dimensional rectangle $\{(x, y) : a \le x \le b,\ c \le y \le d\}$, then

$$P[(X, Y) \in A] = \int_a^b \int_c^d f(x, y)\,dy\,dx$$
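For a numerical check of the rectangle formula, this sketch integrates a hypothetical joint pdf f(x, y) = x + y on the unit square with scipy.integrate.dblquad (the density is illustrative, not from the slides):

```python
from scipy import integrate

# Hypothetical joint pdf on [0, 1] x [0, 1]: f(x, y) = x + y.
# It integrates to 1 over the unit square, so it is a valid density.
def f(y, x):  # dblquad expects the integrand as f(y, x)
    return x + y

# Total probability over the whole support: should be 1.
total, _ = integrate.dblquad(f, 0, 1, lambda x: 0, lambda x: 1)

# P(0 <= X <= 0.5, 0 <= Y <= 0.5): integral over the rectangle.
prob, _ = integrate.dblquad(f, 0, 0.5, lambda x: 0, lambda x: 0.5)

print(total)  # ~1.0
print(prob)   # ~0.125
```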
[Figure: joint density surface f(x, y); for a shaded rectangle A in the (x, y) plane, P[(X, Y) ∈ A] is the volume under the density surface above A.]
Marginal Probability Density Functions
The marginal probability density functions of X and Y, denoted fX(x) and fY(y), are given by

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty$$

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty$$
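A hedged numerical sketch, reusing the illustrative density f(x, y) = x + y on the unit square: the marginal of X at a point is obtained by integrating y out with scipy.integrate.quad.

```python
from scipy import integrate

def f(x, y):
    return x + y  # illustrative joint pdf on [0, 1] x [0, 1]

def f_X(x):
    # Marginal of X: integrate the joint pdf over all y (here y in [0, 1]).
    val, _ = integrate.quad(lambda y: f(x, y), 0, 1)
    return val

print(f_X(0.25))  # analytically x + 0.5, so 0.75

# Sanity check: the marginal pdf itself integrates to 1 over [0, 1].
total, _ = integrate.quad(f_X, 0, 1)
print(total)      # ~1.0
```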
Independent Random Variables
Two random variables X and Y are said to be independent if for every pair of x and y values,

$$p(x, y) = p_X(x) \cdot p_Y(y) \quad \text{when X and Y are discrete, or}$$

$$f(x, y) = f_X(x) \cdot f_Y(y) \quad \text{when X and Y are continuous.}$$

If these conditions are not satisfied for all (x, y), then X and Y are dependent.
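The definition gives a direct check for a discrete table: compare p(x, y) with pX(x) · pY(y) at every pair. A minimal sketch with the same made-up pmf:

```python
import math

p = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

p_X, p_Y = {}, {}
for (x, y), prob in p.items():
    p_X[x] = p_X.get(x, 0.0) + prob
    p_Y[y] = p_Y.get(y, 0.0) + prob

# Independent iff p(x, y) == p_X(x) * p_Y(y) for EVERY pair (x, y).
independent = all(
    math.isclose(prob, p_X[x] * p_Y[y])
    for (x, y), prob in p.items()
)
print(independent)  # False here: p(0,0) = 0.10 but p_X(0)*p_Y(0) = 0.12
```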
More Than Two Random Variables
If X1, X2,…, Xn are all discrete random variables, the joint pmf of the variables is the function

$$p(x_1, \dots, x_n) = P(X_1 = x_1, \dots, X_n = x_n)$$

If the variables are continuous, the joint pdf is the function f such that for any n intervals [a1, b1],…, [an, bn],

$$P(a_1 \le X_1 \le b_1, \dots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \dots, x_n)\,dx_n \cdots dx_1$$
Independence – More Than Two Random Variables
The random variables X1, X2,…, Xn are independent if for every subset $X_{i_1}, X_{i_2}, \dots, X_{i_k}$ of the variables, the joint pmf or pdf of the subset is equal to the product of the marginal pmf’s or pdf’s.
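For independent variables the joint pdf is therefore a product of marginals. A small sketch (the Exponential(λ = 1) marginals are an arbitrary, hypothetical choice) evaluates an n-variable joint pdf that way:

```python
import math

def marginal_pdf(x, lam=1.0):
    # Exponential(lam) density, used here as an illustrative marginal.
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def joint_pdf(xs, lam=1.0):
    # Independence: the joint pdf factors into the product of marginals.
    prod = 1.0
    for x in xs:
        prod *= marginal_pdf(x, lam)
    return prod

xs = [0.5, 1.0, 2.0]
print(joint_pdf(xs))       # exp(-3.5) ~ 0.0302
print(math.exp(-sum(xs)))  # closed form for lam = 1: same value
```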
Conditional Probability Function
Let X and Y be two continuous rv’s with joint pdf f(x, y) and marginal X pdf fX(x). Then for any x value for which fX(x) > 0, the conditional probability density function of Y given that X = x is

$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} \qquad -\infty < y < \infty$$

If X and Y are discrete, replacing pdf’s by pmf’s gives the conditional probability mass function of Y when X = x.
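Reusing the illustrative density f(x, y) = x + y on the unit square (for which the marginal works out to fX(x) = x + 0.5), a sketch of the conditional pdf, including the check that it integrates to 1 in y:

```python
from scipy import integrate

def f(x, y):
    return x + y    # illustrative joint pdf on [0, 1] x [0, 1]

def f_X(x):
    return x + 0.5  # its marginal of X (y integrated out over [0, 1])

def f_Y_given_X(y, x):
    # Conditional density of Y given X = x; requires f_X(x) > 0.
    return f(x, y) / f_X(x)

# A conditional pdf must integrate to 1 over y for each fixed x.
total, _ = integrate.quad(lambda y: f_Y_given_X(y, 0.25), 0, 1)
print(total)  # ~1.0
```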
5.2 Expected Values, Covariance, and Correlation
Expected Value
Let X and Y be jointly distributed rv’s with pmf p(x, y) or pdf f(x, y) according to whether the variables are discrete or continuous. Then the expected value of a function h(X, Y), denoted E[h(X, Y)] or $\mu_{h(X,Y)}$, is

$$E[h(X, Y)] = \begin{cases} \displaystyle\sum_x \sum_y h(x, y) \cdot p(x, y) & \text{discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\,dx\,dy & \text{continuous} \end{cases}$$
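In the discrete case this is just a weighted sum over the table. A sketch with the same hypothetical pmf, for two choices of h:

```python
p = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def expected_value(h, p):
    # E[h(X, Y)] = sum over all (x, y) of h(x, y) * p(x, y).
    return sum(h(x, y) * prob for (x, y), prob in p.items())

print(expected_value(lambda x, y: x * y, p))  # E[XY]    = 0.40
print(expected_value(lambda x, y: x + y, p))  # E[X + Y] = 1.30
```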
Covariance
The covariance between two rv’s X and Y is

$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

$$= \begin{cases} \displaystyle\sum_x \sum_y (x - \mu_X)(y - \mu_Y)\,p(x, y) & \text{discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,f(x, y)\,dx\,dy & \text{continuous} \end{cases}$$
Short-cut Formula for Covariance
$$\operatorname{Cov}(X, Y) = E[XY] - \mu_X \cdot \mu_Y$$
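A quick numerical check that the short-cut agrees with the defining formula, on the same hypothetical table (μX = 0.7, μY = 0.6); the last lines also compute the correlation coefficient defined on the next slide:

```python
import math

p = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

E = lambda h: sum(h(x, y) * prob for (x, y), prob in p.items())

mu_X = E(lambda x, y: x)  # 0.70
mu_Y = E(lambda x, y: y)  # 0.60

# Defining formula vs. short-cut formula: both give the same number.
cov_def = E(lambda x, y: (x - mu_X) * (y - mu_Y))
cov_short = E(lambda x, y: x * y) - mu_X * mu_Y
print(cov_def, cov_short)  # -0.02, -0.02

# Correlation coefficient (next slide): Cov / (sigma_X * sigma_Y).
sd_X = math.sqrt(E(lambda x, y: x * x) - mu_X**2)
sd_Y = math.sqrt(E(lambda x, y: y * y) - mu_Y**2)
print(cov_short / (sd_X * sd_Y))  # ~ -0.089
```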
Correlation
The correlation coefficient of X and Y, denoted by Corr(X, Y), $\rho_{X,Y}$, or just $\rho$, is defined by

$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$$
Correlation Proposition
1. If a and c are either both positive or both negative, then Corr(aX + b, cY + d) = Corr(X, Y).
2. For any two rv’s X and Y, $-1 \le \operatorname{Corr}(X, Y) \le 1$.
Correlation Proposition
1. If X and Y are independent, then $\rho = 0$, but $\rho = 0$ does not imply independence.
2. $\rho = 1$ or $\rho = -1$ iff $Y = aX + b$ for some numbers a and b with $a \ne 0$.
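The warning in part 1 shows up in a short simulation sketch: with X symmetric about 0 and Y = X², Y is a deterministic function of X, yet the correlation is (up to simulation noise) zero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)  # X symmetric about 0
y = x**2                      # Y completely determined by X

# Sample correlation is ~0 even though X and Y are strongly dependent.
print(np.corrcoef(x, y)[0, 1])
```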
5.3 Statistics and their Distributions
Statistic
A statistic is any quantity whose value can be
calculated from sample data. Prior to obtaining
data, there is uncertainty as to what value of any
particular statistic will result. A statistic is a
random variable denoted by an uppercase letter;
a lowercase letter is used to represent the
calculated or observed value of the statistic.
Random Samples
The rv’s X1,…, Xn are said to form a (simple) random sample of size n if
1. The Xi’s are independent rv’s.
2. Every Xi has the same probability
distribution.
Simulation Experiments
The following characteristics must be specified:
1. The statistic of interest.
2. The population distribution.
3. The sample size n.
4. The number of replications k.
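A minimal simulation experiment along these lines, with all four choices hypothetical: statistic = sample median, population = Exponential(1), n = 10, k = 5000 replications.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10, 5_000  # sample size and number of replications

# Each replication: draw a sample of size n, record the statistic of interest.
medians = np.array([
    np.median(rng.exponential(scale=1.0, size=n)) for _ in range(k)
])

# The k recorded values approximate the sampling distribution of the median.
print(medians.mean(), medians.std())
```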
5.4 The Distribution of the Sample Mean
Using the Sample Mean
Let X1,…, Xn be a random sample from a distribution with mean value $\mu$ and standard deviation $\sigma$. Then

1. $E(\bar{X}) = \mu_{\bar{X}} = \mu$
2. $V(\bar{X}) = \sigma^2_{\bar{X}} = \sigma^2 / n$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

In addition, with $T_o = X_1 + \dots + X_n$ (the sample total),

$$E(T_o) = n\mu, \qquad V(T_o) = n\sigma^2, \qquad \text{and} \qquad \sigma_{T_o} = \sqrt{n}\,\sigma$$
Normal Population Distribution
Let X1,…, Xn be a random sample from a normal distribution with mean value $\mu$ and standard deviation $\sigma$. Then for any n, $\bar{X}$ is normally distributed, as is $T_o$.
The Central Limit Theorem
Let X1,…, Xn be a random sample from a distribution with mean value $\mu$ and variance $\sigma^2$. Then if n is sufficiently large, $\bar{X}$ has approximately a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma^2_{\bar{X}} = \sigma^2/n$, and $T_o$ also has approximately a normal distribution with $\mu_{T_o} = n\mu$ and $\sigma^2_{T_o} = n\sigma^2$. The larger the value of n, the better the approximation.
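A standard way to see the theorem at work is to start from a skewed population. In this sketch (hypothetical choices throughout), means of Exponential(1) samples are standardized and a tail probability is compared with the normal approximation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 40, 20_000

# k sample means from a skewed population: Exponential(1), mu = sigma = 1.
xbars = rng.exponential(scale=1.0, size=(k, n)).mean(axis=1)

# Standardize: (xbar - mu) / (sigma / sqrt(n)) should be ~ N(0, 1).
z = (xbars - 1.0) / (1.0 / np.sqrt(n))

print((z > 1.645).mean())         # empirical upper-tail probability
print(1 - stats.norm.cdf(1.645))  # normal approximation, ~0.05
```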
The Central Limit Theorem
[Figure: density curves of $\bar{X}$ for small-to-moderate n and for large n, drawn over the population distribution; as n grows, the distribution of $\bar{X}$ concentrates around $\mu$ and looks increasingly normal.]
Rule of Thumb
If n > 30, the Central Limit Theorem can
be used.
Approximate Lognormal Distribution
Let X1,…, Xn be a random sample from a distribution for which only positive values are possible [P(Xi > 0) = 1]. Then if n is sufficiently large, the product Y = X1X2⋯Xn has approximately a lognormal distribution. This follows from the CLT, since $\ln(Y) = \sum_{i=1}^{n} \ln(X_i)$ is a sum of iid rv’s and is therefore approximately normal.
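A short simulation sketch of this fact, with a hypothetical Uniform(0.5, 1.5) population: the logs of the products should look approximately normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 50, 20_000

# k products, each of n iid positive rv's (Uniform(0.5, 1.5) is arbitrary).
products = rng.uniform(0.5, 1.5, size=(k, n)).prod(axis=1)

# If Y is approximately lognormal, ln(Y) is approximately normal.
log_y = np.log(products)
z = (log_y - log_y.mean()) / log_y.std()

print(stats.skew(log_y))   # near 0: roughly symmetric
print((z > 1.645).mean())  # ~0.05, matching a normal upper tail
```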
5.5 The Distribution of a Linear Combination
Linear Combination
Given a collection of n random variables X1,…, Xn and n numerical constants a1,…, an, the rv

$$Y = a_1 X_1 + \dots + a_n X_n = \sum_{i=1}^{n} a_i X_i$$

is called a linear combination of the Xi’s.
Expected Value of a Linear Combination
Let X1,…, Xn have mean values $\mu_1, \mu_2, \dots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \dots, \sigma_n^2$, respectively. Whether or not the Xi’s are independent,

$$E(a_1 X_1 + \dots + a_n X_n) = a_1 E(X_1) + \dots + a_n E(X_n) = a_1 \mu_1 + \dots + a_n \mu_n$$
Variance of a Linear Combination
If X1,…, Xn are independent,

$$V(a_1 X_1 + \dots + a_n X_n) = a_1^2 V(X_1) + \dots + a_n^2 V(X_n) = a_1^2 \sigma_1^2 + \dots + a_n^2 \sigma_n^2$$

and

$$\sigma_{a_1 X_1 + \dots + a_n X_n} = \sqrt{a_1^2 \sigma_1^2 + \dots + a_n^2 \sigma_n^2}$$
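The expected-value rule from the previous slide and this variance rule can both be checked by simulation. A sketch with Y = 2X1 - 3X2 and independent, hypothetical normal populations:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 200_000

x1 = rng.normal(loc=1.0, scale=2.0, size=k)  # mu1 = 1, sigma1^2 = 4
x2 = rng.normal(loc=4.0, scale=1.0, size=k)  # mu2 = 4, sigma2^2 = 1
y = 2 * x1 - 3 * x2                          # a1 = 2, a2 = -3

print(y.mean())  # ~ 2*1 - 3*4 = -10  (expected-value rule)
print(y.var())   # ~ 4*4 + 9*1 =  25  (independence variance rule)
```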
Variance of a Linear Combination
For any X1,…, Xn,

$$V(a_1 X_1 + \dots + a_n X_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \operatorname{Cov}(X_i, X_j)$$
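When the Xi’s are correlated, the double sum is needed; in matrix form it equals $a^{\mathsf{T}} C a$ for covariance matrix C. A sketch with two correlated normals (the covariance matrix is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 200_000

# Two correlated variables with covariance matrix C (arbitrary choice).
C = np.array([[1.0, 0.6],
              [0.6, 2.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=C, size=k)

a = np.array([1.0, -1.0])  # the linear combination X1 - X2
y = x @ a

# Double-sum formula: sum_i sum_j a_i a_j Cov(X_i, X_j) = a^T C a.
print(a @ C @ a)  # 1.0 - 2*0.6 + 2.0 = 1.8
print(y.var())    # ~1.8 empirically
```

With Cov(X1, X2) = 0 this reduces to the difference rule on the next slide.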
Difference Between Two Random Variables

$$E(X_1 - X_2) = E(X_1) - E(X_2)$$

and, if X1 and X2 are independent,

$$V(X_1 - X_2) = V(X_1) + V(X_2)$$

(Note that the variances add, since $V(-X_2) = (-1)^2 V(X_2) = V(X_2)$.)
Difference Between Normal Random Variables
If X1, X2,…, Xn are independent, normally distributed rv’s, then any linear combination of the Xi’s also has a normal distribution. In particular, the difference X1 – X2 between two independent, normally distributed variables is itself normally distributed.