BIOINF 2118
N05 - Expectation and Variance
“Expectation” is a “measure of central tendency”, one kind of “average”.
Expectation = “mean”.
Expectation = value of a bet, or a gamble (Rev. Thomas Bayes).
Expectation = balance point.
Other “averages” include:
- the median (the 50% quantile, $F^{-1}(0.5)$; for example, qpois(0.50, 1)).
o Try: plot(seq(1, 10, by = 0.1), qpois(p = 0.50, lambda = seq(1, 10, by = 0.1)),
       xlab = "mean", ylab = "median")
- the mode (the most common value).
Definition of expectation:
The expected value of a random variable X
= E[X] or E(X) or EX
= the mean of the distribution
= the mean of X.
“Units” of the mean = x-thingies (like years, kilometers, people, …)
For a discrete random variable (RV):
$$E[X] = \sum_x x\, f(x),$$
where f is the p.m.f. of X. The expected value is only defined if $\sum_x |x|\, f(x) < \infty$. (“Absolute convergence”)
For a continuous RV:
$$E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx,$$
where f is the density (p.d.f.) of X. The expected value is only defined if $\int_{-\infty}^{\infty} |x|\, f(x)\, dx < \infty$. (“Absolute convergence”)
(How can the mean NOT exist?
The Cauchy distribution …)
The expectation of a function r( ) is
$$E[r(X)] = \sum_x r(x)\, f(x) \quad\text{or}\quad E[r(X)] = \int_{-\infty}^{\infty} r(x)\, f(x)\, dx,$$
so the mean is the expectation of the identity function r(x) = x.
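Try checking the discrete formula numerically in R (the choice X ~ Poisson(3) and the truncation at x = 100 are just for illustration):
   lambda <- 3
   x <- 0:100                # the Poisson mass beyond 100 is negligible here
   f <- dpois(x, lambda)
   sum(x * f)                # E[X] = 3, the expectation of r(x) = x
   sum(x^2 * f)              # E[X^2] = lambda^2 + lambda = 12, for r(x) = x^2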
Let X and Y have a joint distribution with pmf or pdf equal to f(x, y). If r is a function of X and Y, then the expected value of r(X, Y) is defined by
$$E[r(X,Y)] = \sum_x \sum_y r(x,y)\, f(x,y) \quad\text{or}\quad E[r(X,Y)] = \iint r(x,y)\, f(x,y)\, dx\, dy,$$
depending on whether X and Y are discrete or continuous. Absolute convergence is still required for the expected value to be defined.
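A minimal R sketch of the joint version, using a made-up 2-by-2 pmf f(x, y) and r(x, y) = x*y:
   f <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2,
               dimnames = list(x = 0:1, y = 0:1))   # made-up joint pmf; sums to 1
   r <- outer(0:1, 0:1)                             # r(x, y) = x * y
   sum(r * f)                                       # E[XY] = 0.4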
The Laws of Large Numbers
What’s so special about expectation?
The distribution of the sample mean
$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
converges to the point distribution
with a p.m.f. equal to 1 on E(X) and zero everywhere else.
[Figure: i.i.d. draws X1, X2, X3, … fed into a running average, which settles at E(X).]
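Try simulating this convergence in R (Exponential(1) draws, so E(X) = 1; the sample size 10000 is arbitrary):
   set.seed(1)
   x <- rexp(10000, rate = 1)
   running.mean <- cumsum(x) / seq_along(x)
   plot(running.mean, type = "l", xlab = "n", ylab = "sample mean")
   abline(h = 1, lty = 2)    # the running mean settles at E(X) = 1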
The variance, or "2nd central moment", of a RV X is defined as
$$\mathrm{var}(X) = E\big[(X - E[X])^2\big].$$
The variance measures the spread of a distribution. “Units” = square-x-thingies.
Also,
$$\mathrm{var}(X) = E(X^2) - E(X)^2$$
and
$$\mathrm{var}(aX + b) = a^2\, \mathrm{var}(X).$$
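The identity var(X) = E(X²) − E(X)² can be checked exactly in R; a sketch, with the illustrative choice X ~ Binomial(10, 0.3):
   n <- 10; p <- 0.3
   x <- 0:n
   f <- dbinom(x, n, p)
   EX <- sum(x * f)          # n * p = 3
   sum(x^2 * f) - EX^2       # n * p * (1 - p) = 2.1
   n * p * (1 - p)           # agrees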
Example: If X ~ …, then the mean of X is … and the variance is … .
Exercises: If $X_1, \ldots, X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, then what are the mean and variance of $\bar{X}$?
What are the mean and variance of binomial? Bernoulli? uniform? Poisson?
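One way to check your answers is by simulation; a sketch in R (sample estimates only approximate the true values):
   set.seed(2118)
   x <- rpois(100000, lambda = 4)
   c(mean(x), var(x))        # both near lambda = 4
   u <- runif(100000)        # uniform on (0, 1)
   c(mean(u), var(u))        # near 1/2 and 1/12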
The Standard Deviation is the square root of the variance. “Units” = x-thingies.
The Coefficient of Variation is Standard Deviation / Mean. Scale-free!! No x-thingies.
(The CV only makes sense if X is non-negative.)
Higher moments
The kth central moment of a RV X is defined as
$$\mu_k = E\big[(X - E[X])^k\big].$$
Any distribution that is symmetric around the mean has all odd central moments = 0.
The skewness measures the lop-sidedness of a distribution:
$$\text{skewness} = \frac{E\big[(X - E[X])^3\big]}{\mathrm{var}(X)^{3/2}}.$$
Notice that it is scale-free.
The kurtosis measures the lumpiness of a distribution:
$$\text{kurtosis} = \frac{E\big[(X - E[X])^4\big]}{\mathrm{var}(X)^{2}} - 3.$$
Notice that it is also scale-free. If X is normal, kurtosis = 0.
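A simulation sketch of both measures in R, using Exponential(1) draws (whose skewness is 2 and kurtosis, in the excess form above, is 6):
   set.seed(1)
   x <- rexp(100000)
   z <- (x - mean(x)) / sd(x)    # standardize first: scale-free
   mean(z^3)                     # sample skewness, near 2
   mean(z^4) - 3                 # sample kurtosis, near 6
   # the same values come back if x is multiplied by any positive constant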
Covariance
For two RV’s X and Y, the covariance between them is
$$\mathrm{cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big].$$
Correlation
To get a scale-free measure of association, we define the correlation
$$\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}}.$$
Scale-free! (“Dimensionless”)
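R computes both directly; rescaling X changes the covariance but not the correlation, illustrating the scale-free claim (the simulated X and Y here are arbitrary):
   set.seed(1)
   x <- rnorm(10000)
   y <- x + rnorm(10000)
   cov(x, y)
   cor(x, y)
   cov(1000 * x, y)    # covariance picks up the factor of 1000
   cor(1000 * x, y)    # correlation is unchanged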
Example of moment calculations
Mean of a Poisson RV:
$$
\begin{aligned}
E(X \mid \lambda) &= \sum_{x=0}^{\infty} x\, e^{-\lambda} \lambda^x / x! \\
&= \sum_{x=1}^{\infty} x\, e^{-\lambda} \lambda^x / x! \\
&= \sum_{x=1}^{\infty} e^{-\lambda} \lambda^x / (x-1)! \\
&= \sum_{y=0}^{\infty} e^{-\lambda} \lambda^{y+1} / y! \qquad \{\text{letting } y = x - 1\} \\
&= \lambda \sum_{y=0}^{\infty} e^{-\lambda} \lambda^y / y! \\
&= \lambda.
\end{aligned}
$$
Likewise, $E(X(X-1) \mid \lambda) = \lambda^2$, so $\mathrm{var}(X) = E(X^2) - E(X)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$.
The mean equals the variance.
So the Coefficient of Variation, $\sqrt{\mathrm{var}(X)}/E(X)$, equals $1/\sqrt{\lambda}$.
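These moments can be verified by exact summation in R (λ = 2.5 is illustrative; the sum is truncated where the Poisson mass is negligible):
   lambda <- 2.5
   x <- 0:200
   f <- dpois(x, lambda)
   EX <- sum(x * f)            # E(X) = 2.5
   sum(x^2 * f) - EX^2         # var(X) = 2.5 as well
   sqrt(lambda) / lambda       # CV = 1 / sqrt(lambda)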
Amazingly… and usefully… if $\lambda$ is not a fixed number, but instead drawn from a gamma distribution $\Gamma(\alpha, \beta)$, then the marginal distribution of X (averaging over $\lambda$) is negative binomial,
$$X \sim \mathrm{NB}\big(\alpha,\ 1/(\beta + 1)\big)$$
Pr(X = x) = dnbinom(x, size = alpha, prob = beta / (beta + 1)) in R
(note that R’s prob argument is $\beta/(\beta+1)$, the complement of the $1/(\beta+1)$ above).
This is over-dispersed compared to the Poisson, so useful for modeling count
data where the variance is bigger than the mean.
If you'd like to see the details, look at the file
'The negative binomial distribution.docx'
on the web site.
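A simulation sketch of the gamma-Poisson mixture in R (α = 3, β = 2 are illustrative): draw λ from the gamma, then X given λ from the Poisson, and compare with dnbinom:
   set.seed(1)
   alpha <- 3; beta <- 2
   lambda <- rgamma(100000, shape = alpha, rate = beta)
   x <- rpois(100000, lambda)
   sapply(0:4, function(k) mean(x == k))                  # simulated Pr(X = 0), ..., Pr(X = 4)
   dnbinom(0:4, size = alpha, prob = beta / (beta + 1))   # matches closely
   c(mean(x), var(x))                                     # variance exceeds the mean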
Extra topic: Convolution
The sum of two (or more) independent random variables is called a convolution.
We've seen some already.
The binomial variate is a sum of i.i.d. Bernoulli variates.
The negative binomial variate is a sum of i.i.d. geometric variates.
The sum of independent Poisson variates is Poisson.
To get convolution distributions, you have to sum or integrate.
For example:
The Binomial is related to the Poisson distribution.
If X ~ Poisson(a) and Y ~ Poisson(b), independent, then
[Figure: the (x, y) lattice, with the diagonal line x + y = z (drawn for z = 4).]
1) Z = X + Y is Poisson(a + b). The distribution of Z is a convolution:
$$
\Pr(Z = z) = \sum_{x=0}^{z} \Pr(X = x)\Pr(Y = z - x)
= \sum_{x=0}^{z} \Pr(X = x \text{ and } Y = z - x)
= \sum_{x=0}^{z} [X, Y](x, z - x)
$$
2) X | X + Y is Binomial(X + Y, a/(a + b)).
To get these results, let c = a + b, Z = X + Y, and p = a/(a + b). Rewrite the joint distribution:
$$
[X, Y] = \frac{a^X e^{-a}}{X!} \cdot \frac{b^Y e^{-b}}{Y!}
= \left( \frac{(X+Y)!}{X!\,Y!} \cdot \frac{a^X b^Y}{(a+b)^{X+Y}} \right)
  \left( \frac{(a+b)^{X+Y} e^{-(a+b)}}{(X+Y)!} \right)
= \binom{Z}{X} p^X (1-p)^Y \cdot \frac{c^Z e^{-c}}{Z!}
= [X \mid Z]\,[Z].
$$
Now we see that the three factorials in the binomial coefficient
are the factorials in three different Poisson formulas.
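Both results can be checked numerically in R (the rates a = 2, b = 3 and the cut z = 4 are illustrative):
   a <- 2; b <- 3; z <- 4
   x <- 0:z
   # 1) the convolution sum reproduces the Poisson(a + b) mass at z:
   sum(dpois(x, a) * dpois(z - x, b))
   dpois(z, a + b)
   # 2) given Z = z, X is Binomial(z, a / (a + b)):
   dpois(x, a) * dpois(z - x, b) / dpois(z, a + b)
   dbinom(x, size = z, prob = a / (a + b))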