BIOINF 2118 N 05 - Expectation and Variance

"Expectation" is a "measure of central tendency", one kind of "average".
Expectation = "mean".
Expectation = the value of a bet, or a gamble (Rev. Thomas Bayes).
Expectation = balance point.

Other "averages" include:
- the median (the 50% quantile, F⁻¹(0.50); for example qpois(0.50, 1)).
  o Try: plot(seq(1, 10, by = 0.1), qpois(p = 0.50, lambda = seq(1, 10, by = 0.1)), xlab = "mean", ylab = "median")
- the mode (the most common value).

Definition of expectation:
The expected value of a random variable X = E[X] or E(X) or EX = the mean of the distribution = the mean of X.
"Units" of the mean = x-thingies (like years, kilometers, people, ...).

For a discrete random variable (RV):
  E(X) = Σ_x x f(x), where f is the p.m.f. of X.
The expected value is only defined if Σ_x |x| f(x) < ∞. ("Absolute convergence")

For a continuous RV:
  E(X) = ∫ x f(x) dx, where f is the density (p.d.f.) of X.
The expected value is only defined if ∫ |x| f(x) dx < ∞. ("Absolute convergence")
(How can the mean NOT exist? The Cauchy distribution ...)

The expectation of a function r( ) is
  E[r(X)] = Σ_x r(x) f(x)   or   E[r(X)] = ∫ r(x) f(x) dx,
so the mean is the expectation of the identity function r(x) = x.

Let X and Y have a joint distribution with pmf or pdf equal to f(x, y). If r is a function of X and Y, then the expected value of r(X, Y) is defined by
  E[r(X, Y)] = Σ_x Σ_y r(x, y) f(x, y)   or   E[r(X, Y)] = ∫∫ r(x, y) f(x, y) dx dy,
depending on whether X and Y are discrete or continuous. Absolute convergence is still required for the expected value to be defined.

The Laws of Large Numbers

What's so special about expectation? The distribution of the sample mean X̄_n = (X_1 + ... + X_n)/n converges to the point distribution with a p.m.f. equal to 1 on E(X) and zero everywhere else:
  X_1, X_2, X_3, ...  ---- Average ---->  E(X)

The variance, or "2nd central moment", of a RV X is defined as
  var(X) = E[(X − E(X))²].
The variance measures the spread of a distribution. "Units" = square-x-thingies.
Also, var(X) = E(X²) − [E(X)]², and var(a + bX) = b² var(X).

Example: If X ~ Bernoulli(p), then the mean of X is E(X) = p and the variance is var(X) = p(1 − p).

Exercises: If X_1, ..., X_n are i.i.d.
with mean μ and variance σ², then what are the mean and variance of X̄? What are the mean and variance of the binomial? Bernoulli? uniform? Poisson?

The Standard Deviation is the square root of the variance: sd(X) = √var(X). "Units" = x-thingies.

The Coefficient of Variation is Standard Deviation / Mean. Scale-free!! No x-thingies. (The CV only makes sense if X is non-negative.)

Higher moments

The kth central moment of a RV X is defined as
  μ_k = E[(X − E(X))^k].
Any distribution that is symmetric around the mean has all odd central moments = 0.

The skewness measures the lop-sidedness of a distribution:
  skewness = μ_3 / μ_2^{3/2}.
Notice that it is scale-free.

The kurtosis measures the lumpiness of a distribution:
  kurtosis = μ_4 / μ_2² − 3.
Notice that it is also scale-free. If X is normal, kurtosis = 0.

Covariance

For two RV's X and Y, the covariance between them is
  cov(X, Y) = E[(X − E(X))(Y − E(Y))].

Correlation

To get a scale-free measure of association, we define the correlation
  ρ(X, Y) = cov(X, Y) / (sd(X) sd(Y)).
Scale-free! ("Dimensionless")

Example of moment calculations

Mean of a Poisson RV:
  E(X | λ) = Σ_{x=0}^∞ x e^{−λ} λ^x / x!
           = Σ_{x=1}^∞ x e^{−λ} λ^x / x!         (the x = 0 term is zero)
           = Σ_{x=1}^∞ e^{−λ} λ^x / (x − 1)!
           = Σ_{y=0}^∞ e^{−λ} λ^{y+1} / y!       (letting y = x − 1)
           = λ Σ_{y=0}^∞ e^{−λ} λ^y / y!
           = λ.                                   (the p.m.f. sums to 1)

Likewise, E(X(X − 1) | λ) = λ², so
  var(X) = E(X²) − E(X)² = λ² + λ − λ² = λ.
The mean equals the variance. So the Coefficient of Variation, √var(X) / E(X), equals 1/√λ.

Amazingly... and usefully... if λ is not a fixed number, but instead drawn from a gamma distribution Γ(α, β), then the marginal distribution of X (averaging over λ) is negative binomial, X ~ NB(α, 1/(β + 1)):
  Pr(X = x) = dnbinom(x, size = alpha, prob = 1/(beta + 1))     (in R)
This is over-dispersed compared to the Poisson, so it is useful for modeling count data where the variance is bigger than the mean. If you'd like to see the details, look at the file 'The negative binomial distribution.docx' on the web site.
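The Poisson moment calculation above can be checked numerically by summing the p.m.f. directly. A minimal sketch (the notes use R; this illustration is in Python, with the sum truncated where the tail is negligible):

```python
import math

def poisson_pmf(x, lam):
    """Poisson p.m.f.: e^(-lambda) * lambda^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 3.0
xs = range(100)  # the tail beyond x = 100 is negligible for lambda = 3

mean = sum(x * poisson_pmf(x, lam) for x in xs)
second_moment = sum(x * x * poisson_pmf(x, lam) for x in xs)
var = second_moment - mean**2

print(mean)                   # ~ 3.0: E(X) = lambda
print(var)                    # ~ 3.0: var(X) = lambda, mean equals variance
print(math.sqrt(var) / mean)  # CV = 1/sqrt(lambda)
```

The same truncated-sum trick works for any discrete distribution whose tail decays quickly.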
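The over-dispersion of the gamma-Poisson (negative binomial) mixture can also be seen from the p.m.f. itself. A sketch in Python using the same parametrization as the notes' R call (α an integer here so that math.comb applies; the values of alpha and beta are illustrative):

```python
import math

def nbinom_pmf(x, size, prob):
    """Negative binomial p.m.f., R's dnbinom parametrization:
    x failures before the size-th success, success probability prob."""
    return math.comb(x + size - 1, x) * prob**size * (1 - prob)**x

alpha, beta = 2, 3        # gamma shape and scale for lambda
prob = 1 / (beta + 1)     # as in the notes: X ~ NB(alpha, 1/(beta+1))

xs = range(500)           # truncate where the tail is negligible
mean = sum(x * nbinom_pmf(x, alpha, prob) for x in xs)
var = sum(x * x * nbinom_pmf(x, alpha, prob) for x in xs) - mean**2

print(mean)   # alpha*beta = 6: the mean of the mixed Poisson
print(var)    # alpha*beta*(beta+1) = 24 > mean: over-dispersed
```

A Poisson with the same mean would have variance 6; the mixture inflates it by the factor (β + 1).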
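The claim that skewness and kurtosis are scale-free can be checked by computing central moments from a p.m.f. and rescaling the variable. A Python sketch, using a Bernoulli(p) distribution purely for illustration:

```python
def central_moment(pmf, k):
    """k-th central moment of a discrete distribution given as {value: prob}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean)**k * p for x, p in pmf.items())

def skewness(pmf):
    return central_moment(pmf, 3) / central_moment(pmf, 2)**1.5

def kurtosis(pmf):
    """Excess kurtosis, so that the normal distribution gives 0."""
    return central_moment(pmf, 4) / central_moment(pmf, 2)**2 - 3

p = 0.2
bern = {0: 1 - p, 1: p}
scaled = {10 * x: q for x, q in bern.items()}  # measure X in different units

print(skewness(bern), skewness(scaled))   # identical: skewness is scale-free
print(kurtosis(bern), kurtosis(scaled))   # identical: kurtosis is scale-free
```

Multiplying X by 10 multiplies μ_3 by 10³ and μ_2^{3/2} by the same factor, so the ratio is unchanged; likewise for μ_4 / μ_2².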
Extra topic: Convolution

The sum of two (or more) independent random variables is called a convolution. We've seen some already:
- The binomial variate is a sum of i.i.d. Bernoulli variates.
- The negative binomial variate is a sum of i.i.d. geometric variates.
- The sum of independent Poisson variates is Poisson.

To get convolution distributions, you have to sum or integrate. For example, the Binomial is related to the Poisson distribution. If X ~ Poisson(a) and Y ~ Poisson(b), independent, then:

(Figure: the lattice of (X, Y) points, with the line X + Y = 4 collecting the points whose sum is 4.)

1) Z = X + Y is Poisson(a + b). The distribution of Z is a convolution:
  Pr(Z = z) = Σ_{x=0}^z Pr(X = x) Pr(Y = z − x)
            = Σ_{x=0}^z Pr(X = x and Y = z − x)
            = Σ_{x=0}^z [X, Y](x, z − x).

2) X | X + Y is Binomial(X + Y, a/(a + b)).

To get these results, let c = a + b, Z = X + Y, and p = a/(a + b). Rewrite the joint distribution:
  [X, Y] = (e^{−a} a^X / X!) (e^{−b} b^Y / Y!)
         = ( (X + Y)! / (X! Y!) · a^X b^Y / (a + b)^{X+Y} ) · ( (a + b)^{X+Y} e^{−(a+b)} / (X + Y)! )
         = ( C(Z, X) p^X (1 − p)^Y ) · ( c^Z e^{−c} / Z! )
         = [X | Z] [Z].
Now we see that the three factorials in the binomial coefficient are the factorials in three different Poisson formulas.
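Both identities can be verified numerically from the p.m.f.s alone. A Python sketch (a, b, and z are illustrative values) that computes the convolution sum and the conditional distribution exactly:

```python
import math

def pois(x, mu):
    """Poisson p.m.f.: e^(-mu) * mu^x / x!"""
    return math.exp(-mu) * mu**x / math.factorial(x)

a, b, z = 2.0, 3.0, 4

# 1) Convolution: Pr(Z = z) = sum over x of Pr(X = x) Pr(Y = z - x)
conv = sum(pois(x, a) * pois(z - x, b) for x in range(z + 1))
print(conv, pois(z, a + b))       # equal: Z = X + Y is Poisson(a + b)

# 2) Conditional: Pr(X = x | Z = z) matches Binomial(z, a / (a + b))
p = a / (a + b)
for x in range(z + 1):
    cond = pois(x, a) * pois(z - x, b) / pois(z, a + b)
    binom = math.comb(z, x) * p**x * (1 - p)**(z - x)
    print(x, cond, binom)         # the two columns agree term by term
```

The convolution needs only a finite sum here because X cannot exceed z when X + Y = z.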