STATISTICAL LABORATORY, May 14th, 2010

EXPECTATIONS AND WEAK LAW OF LARGE NUMBERS

Mario Romanazzi

1 EXPECTATION

Ex1 Find $E(1/(X+1))$, where $X$ is a Poisson random variable (Rice, 4.20).

Solution. Note that $E(1/(X+1)) \neq 1/E(X+1) = 1/(E(X)+1)$ (why?). The general theorem on the expectation of a transformation must be used:

$$E\left(\frac{1}{X+1}\right) = \sum_{x=0}^{\infty} \frac{1}{x+1}\,\frac{\lambda^{x}}{x!}\exp(-\lambda) = \frac{\exp(-\lambda)}{\lambda}\sum_{x=0}^{\infty}\frac{\lambda^{x+1}}{(x+1)!} = \frac{1-\exp(-\lambda)}{\lambda},$$

because the last sum equals $\exp(\lambda) - 1$.

Ex2 A random square has a side length that is a uniform $R(0, 1)$ random variable. Find the expected area of the square (Rice, 4.21).

Solution.

1. Write $L$ for the length of the side and $A$ for the area of the square. Then $A = L^2$ and $E(A) = E(L^2)$. From the variance formula $Var(L) = E(L^2) - (E(L))^2$ it follows that

$$E(L^2) = Var(L) + (E(L))^2 = \frac{1}{12} + \left(\frac{1}{2}\right)^2 = \frac{1}{3}.$$

Here $E(A) = E(L^2) > (E(L))^2$, so equivariance does not hold. Why?

2. Lengthy. First derive the probability distribution of $A = L^2$ and then use the definition of expectation. Note that, for $0 \le a \le 1$, the cdf of $A$ is

$$F_A(a) = P(L^2 \le a) = P(L \le \sqrt{a}) = \sqrt{a}.$$

3. What is the median of $A$?

Ex3 A random rectangle has sides whose lengths are independent uniform random variables. Find the expected area of the rectangle (Rice, 4.22).

Solution.

1. Write $L_1$ and $L_2$ for the lengths of the sides and assume $L_1 \sim R(0, a_1)$, $L_2 \sim R(0, a_2)$. The expected area of the rectangle is $E(L_1 L_2)$. From the covariance formula $Cov(L_1, L_2) = E(L_1 L_2) - E(L_1)E(L_2)$ it follows that

$$E(L_1 L_2) = Cov(L_1, L_2) + E(L_1)E(L_2) = E(L_1)E(L_2) = \frac{1}{4} a_1 a_2,$$

because stochastic independence implies linear independence, that is, $Cov(L_1, L_2) = 0$. If $a_1 = a_2 = 1$, then $E(L_1 L_2) = 1/4 < E(L_i^2) = 1/3$.

2. As previously, an alternative solution is to derive the probability distribution of $A = L_1 L_2$ and then use the definition of expectation. Some care is needed with the integration limits.

Ex4 Let $X$, $Y$ and $Z$ be uncorrelated random variables with variances $\sigma_X^2$, $\sigma_Y^2$ and $\sigma_Z^2$. Let $U = Z + X$, $V = Z + Y$. Find $Cov(U, V)$, $Corr(U, V)$ (Rice, 4.54).

Solution. A little algebra.
$$Cov(U, V) = E(UV) - E(U)E(V) = E((Z + X)(Z + Y)) - E(Z + X)E(Z + Y)$$
$$= E(Z^2 + ZY + XZ + XY) - (E(Z) + E(X))(E(Z) + E(Y)).$$

Because the variables are pairwise uncorrelated, $E(ZY) = E(Z)E(Y)$, $E(XZ) = E(X)E(Z)$ and $E(XY) = E(X)E(Y)$, so the cross terms cancel and

$$Cov(U, V) = E(Z^2) - (E(Z))^2 = \sigma_Z^2, \qquad Corr(U, V) = \frac{\sigma_Z^2}{\sqrt{(\sigma_Z^2 + \sigma_X^2)(\sigma_Z^2 + \sigma_Y^2)}}.$$

Ex5 Suppose there are $n$ securities, each with the same expected return, that all the returns have the same standard deviation and that the returns are uncorrelated. 1) What is the optimal portfolio vector? 2) Plot the risk of the optimal portfolio versus $n$. How does this risk compare to that incurred by putting all your money in one security? (Rice, 4.51).

Solution.

1. We write $X_i$, $i = 1, ..., n$, for the returns of the $n$ securities, and $\mu$ and $\sigma$ for the common expectation and standard deviation. We also write $\alpha_i$, $i = 1, ..., n$, for the portfolio composition ($\alpha_i \ge 0$, $\sum_{i=1}^{n} \alpha_i = 1$) and $X(\alpha) = \sum_{i=1}^{n} \alpha_i X_i$ for the corresponding return. Hence $X(\alpha)$ is a linear combination (more precisely, a weighted mean) of uncorrelated random variables with the same expectation and the same standard deviation. Using basic results on linear combinations,

$$E(X(\alpha)) = \sum_{i=1}^{n} \alpha_i \mu = \mu \sum_{i=1}^{n} \alpha_i = \mu,$$
$$Var(X(\alpha)) = \sum_{i=1}^{n} \alpha_i^2 \sigma^2 = \sigma^2 \sum_{i=1}^{n} \alpha_i^2,$$

hence the expectation of the portfolio does not depend on the weights $\alpha_i$, whereas the risk $SD(X(\alpha))$ is a function of the weights:

$$SD(X(\alpha)) = \sigma \left(\sum_{i=1}^{n} \alpha_i^2\right)^{1/2}.$$

The optimal portfolio has minimum risk, or variance. It can be shown that the minimum is attained only by the equal-weight portfolio $\alpha_i^* = 1/n$, $i = 1, ..., n$. A proof is obtained by looking for the minimum of the function $g(\alpha_1, ..., \alpha_n) = \sum_{i=1}^{n} \alpha_i^2$, subject to the constraint $\sum_{i=1}^{n} \alpha_i = 1$. The corresponding risk is

$$SD(X(\alpha^*)) = \sigma / \sqrt{n}.$$

Putting all the money on just one security (e.g., the $i$-th one) corresponds to choosing $\alpha_i = 1$ and $\alpha_j = 0$, $j \neq i$. The corresponding risk is $\sigma$, that is, $\sqrt{n}$ times the minimum risk.

2. The plot is obtained through the following R commands.
    > sigma <- 0.1
    > plot(1:20, sigma/sqrt(1:20), type = "b", pch = 20, xlab = "Number of securities",
    +     ylab = "Minimum Risk")

[Figure: minimum risk $\sigma/\sqrt{n}$ (axis label "Minimum Risk") against the number of securities, $n = 1, ..., 20$.]

Ex6 Consider two securities, the first having $\mu_1 = 1$ and $\sigma_1 = 0.1$ and the second having $\mu_2 = 0.8$ and $\sigma_2 = 0.12$. Suppose that they are negatively correlated, with $\rho = -0.8$. 1) If you could only invest in one security, which one would you choose, and why? 2) Suppose you invest 50% of your money in each of the two. What is your expected return and what is your risk? If you invest 80% of your money in security 1 and 20% in security 2, what is your expected return and your risk? 3) Denote the expected return and its standard deviation as functions of $\alpha$ by $\mu(\alpha)$ and $\sigma(\alpha)$. The pair $(\mu(\alpha), \sigma(\alpha))$ traces out a curve in the plane as $\alpha$ varies from 0 to 1. Plot this curve. (Rice, 4.52)

Solution.

1. According to the usual criterion (maximize expected return and minimize risk), the first security should be chosen: it has both the higher expected return ($1 > 0.8$) and the lower risk ($0.1 < 0.12$).

2. We write $X_1$ and $X_2$ for the random variables describing the returns from the two securities, so that $Cov(X_1, X_2) = \rho \sigma_1 \sigma_2 = -0.0096$. The total return is the linear combination $X = 0.5 X_1 + 0.5 X_2$. Therefore

$$E(X) = 0.5 E(X_1) + 0.5 E(X_2) = 0.9,$$
$$Var(X) = 0.25\, Var(X_1) + 0.25\, Var(X_2) + 2 \cdot 0.5 \cdot 0.5\, Cov(X_1, X_2) = 0.0013.$$

The risk is $SD(X) = (0.0013)^{1/2} \simeq 0.0361$. With portfolio proportions $\alpha_1 = 0.8$, $\alpha_2 = 1 - 0.8 = 0.2$, the results are

$$E(X) = 0.8 E(X_1) + 0.2 E(X_2) = 0.96,$$
$$Var(X) = 0.64\, Var(X_1) + 0.04\, Var(X_2) + 2 \cdot 0.8 \cdot 0.2\, Cov(X_1, X_2) = 0.003904,$$

therefore this choice produces a higher expected return as well as a higher risk, $SD(X) \simeq 0.0625$.

3. The R script below defines the functions $\mu(\alpha)$ and $\sigma(\alpha)$ and plots them.
    > mu1 <- 1
    > sigma1 <- 0.1
    > mu2 <- 0.8
    > sigma2 <- 0.12
    > rho <- -0.8
    > mu <- function(alpha) alpha * mu1 + (1 - alpha) * mu2
    > var <- function(alpha) alpha^2 * sigma1^2 + (1 - alpha)^2 * sigma2^2 +
    +     2 * alpha * (1 - alpha) * rho * sigma1 * sigma2
    > risk <- function(alpha) sqrt(var(alpha))
    > # First plot: expected return and risk as functions of alpha
    > plot(risk, 0, 1, xlim = c(0, 1), ylim = c(0, 1.3), xlab = "Proportion of 1st Security, alpha",
    +     ylab = "Expected Return, Risk", lty = "dashed", lwd = 2)
    > plot(mu, 0, 1, add = TRUE, lwd = 2)
    > legend(c(0, 0.35), c(1, 1.2), lty = c("solid", "dashed"), legend = c("Expected Return",
    +     "Risk"), cex = 0.8, lwd = 2)
    > # Second plot: the curve (mu(alpha), sigma(alpha)) for rho = -0.8
    > mu_v <- sapply(seq(0, 1, 0.001), mu)
    > risk_v <- sapply(seq(0, 1, 0.001), risk)
    > plot(mu_v, risk_v, pch = 20, xlab = "Expected Return", ylab = "Risk")
    > points(c(mu(0.5), mu(0.8)), c(risk(0.5), risk(0.8)), col = c("red", "blue"),
    +     pch = c("*", "+"), cex = 2)
    > # Same curve for rho = 0 (risk() reads the global rho)
    > rho <- 0
    > mu_v <- sapply(seq(0, 1, 0.001), mu)
    > risk_v <- sapply(seq(0, 1, 0.001), risk)
    > points(mu_v, risk_v, type = "l", pch = 20, lty = "dashed", lwd = 2)
    > points(c(mu(0.5), mu(0.8)), c(risk(0.5), risk(0.8)), col = c("red", "blue"),
    +     pch = c("*", "+"), lty = "dashed", cex = 2)
    > legend(c(0.95, 1), c(0.11, 0.12), lty = c("solid", "dashed"), legend = c("rho = -0.8",
    +     "rho = 0"), cex = 0.8, lwd = 2)
    > legend(c(0.85, 0.9), c(0.11, 0.12), pch = c("*", "+"), legend = c("alpha = 0.5",
    +     "alpha = 0.8"), cex = 0.8)

[Figure, left panel: expected return (solid) and risk (dashed) against the proportion of the 1st security, alpha. Right panel: risk against expected return for rho = -0.8 (solid) and rho = 0 (dashed), with the portfolios alpha = 0.5 (*) and alpha = 0.8 (+) marked.]
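The figures quoted in part 2 of Ex6 can be checked numerically. The sketch below is only an illustration: the helper names `port_mean` and `port_risk` are not part of the original script, and passing `rho` as an argument (instead of reading a global variable) is a choice made here so that the two correlation values can be compared side by side.

```r
# Parameters of the two securities, as given in Ex6
mu1 <- 1;   sigma1 <- 0.1
mu2 <- 0.8; sigma2 <- 0.12

# Expected return and risk of the portfolio (alpha, 1 - alpha).
# port_mean and port_risk are illustrative names, not taken from the handout.
port_mean <- function(alpha) alpha * mu1 + (1 - alpha) * mu2
port_risk <- function(alpha, rho) {
  sqrt(alpha^2 * sigma1^2 + (1 - alpha)^2 * sigma2^2 +
         2 * alpha * (1 - alpha) * rho * sigma1 * sigma2)
}

# The two portfolios considered in part 2, under rho = -0.8 and rho = 0
alpha <- c(0.5, 0.8)
check <- data.frame(
  alpha     = alpha,
  mean      = port_mean(alpha),
  risk_neg  = port_risk(alpha, rho = -0.8),
  risk_zero = port_risk(alpha, rho = 0)
)
print(round(check, 4))
```

Each row can be compared with the expected return and risk reported in the solution; the last column shows the risk the same portfolio would carry if the two returns were uncorrelated.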
The plot on the left shows the behaviour of $\mu(\alpha)$ and $\sigma(\alpha)$ as functions of $\alpha$: $\mu(\alpha)$ is increasing (why?) whereas $\sigma(\alpha)$ has a unique minimizer (where?). The plot on the right adds information about the role of the linear correlation: for example, $\rho = 0$ (corresponding to the dashed line) produces a higher risk than $\rho = -0.8$ for every $\alpha$ strictly between 0 and 1.

Ex7 A fair coin is tossed three times and the following random variables are observed: $X$: number of heads in the first two tosses, $Y$: total number of heads. The joint probability function is given below. 1) Derive the marginal distributions and their expectations and standard deviations. 2) Are $X$ and $Y$ stochastically independent? Linearly independent? 3) What is the joint cdf value at the point $x = 1$, $y = 2$? What is the probability of the events $X = Y$, $X > Y$? 4) Compute the conditional distributions $Y|X = 0$, $X|Y = 3$.

    Number of heads,           Total number of heads, Y
    first 2 tosses, X       0      1      2      3      Marginal, X
    0                      1/8    1/8     0      0
    1                       0     2/8    2/8     0
    2                       0      0     1/8    1/8
    Marginal, Y

Solution.

1. Marginal distributions are derived by summing the joint probabilities along the rows (for $X$) and along the columns (for $Y$).
    Number of heads,           Total number of heads, Y
    first 2 tosses, X       0      1      2      3      Marginal, X
    0                      1/8    1/8     0      0      2/8
    1                       0     2/8    2/8     0      4/8
    2                       0      0     1/8    1/8     2/8
    Marginal, Y            1/8    3/8    3/8    1/8      1

Expectations are $\mu_X = 1$, $\mu_Y = 3/2$, by symmetry (the results can also be derived from the general formula $np$, because $X$ and $Y$ have a binomial distribution). Variances are

$$Var(X) = E((X - \mu_X)^2) = E(X^2) - \mu_X^2 = 1/2 + 1 - 1 = 1/2,$$
$$Var(Y) = E((Y - \mu_Y)^2) = E(Y^2) - \mu_Y^2 = 3/8 + 12/8 + 9/8 - 9/4 = 3/4.$$

Again, the results agree with the binomial formula $np(1 - p)$.

2. $X$ and $Y$ are stochastically dependent. The result does not need any computation: six cells of the bivariate distribution have zero probability although the corresponding marginal probabilities are all positive, so the factorization criterion fails. To check linear independence the covariance is required (note that stochastic dependence does not imply linear dependence).

$$Cov(X, Y) \equiv \sigma_{X,Y} = E((X - \mu_X)(Y - \mu_Y)) = E(XY) - \mu_X \mu_Y = 2/8 + 4/8 + 4/8 + 6/8 - 3/2 = 1/2 > 0,$$

hence the variables are positively correlated. To better evaluate the strength of the linear relationship we compute the correlation coefficient.

$$Corr(X, Y) \equiv \rho_{X,Y} = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y} = \frac{1/2}{(\sqrt{2}/2)(\sqrt{3}/2)} = \sqrt{6}/3 \simeq 0.816.$$

3. The joint cdf value is

$$F_{X,Y}(1, 2) = P(X \le 1 \cap Y \le 2) = P\{(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)\} = 1/8 + 1/8 + 0 + 0 + 2/8 + 2/8 = 3/4.$$

Moreover,

$$P(X = Y) = P\{(0, 0), (1, 1), (2, 2)\} = 1/8 + 2/8 + 1/8 = 1/2.$$

The event $X > Y$ is impossible (check the definitions of $X$ and $Y$) and its probability is zero.

4. The conditional distributions are obtained from the definition.

$$f_{Y|X=0}(y) = \frac{f_{X,Y}(0, y)}{f_X(0)}, \qquad f_{X|Y=3}(x) = \frac{f_{X,Y}(x, 3)}{f_Y(3)}.$$

It is clear that $X|Y = 3$ is a degenerate distribution, with the probability concentrated at $x = 2$. The conditional distributions $Y|X = x$ are given below.
    Values of Y       0      1      2      3
    f_{Y|X=0}        1/2    1/2     0      0
    f_{Y|X=1}         0     1/2    1/2     0
    f_{Y|X=2}         0      0     1/2    1/2

2 WEAK LAW OF LARGE NUMBERS

Ex1 Consider the sequence of random variables $\{X_n, n = 1, 2, ...\}$, where for each $n$, $X_n$ has an exponential distribution with parameter $\lambda = n$. 1) Plot the density functions of $X_n$ for $n = 1, 2, 3$. 2) Prove that the sequence converges to zero, in probability. 3) Find $n$ such that $P(|X_n| \ge 0.1) \simeq 0.01$.

Solution.

1. The exponential density is monotone decreasing and becomes steeper as the parameter $\lambda = n$ increases. Hence, as $n$ grows, the distribution becomes more and more concentrated in a (right) neighbourhood of zero. The density plot is obtained easily through R.

    > f1 <- function(x) dexp(x, rate = 1)
    > f2 <- function(x) dexp(x, rate = 2)
    > f3 <- function(x) dexp(x, rate = 3)
    > plot(f3, 0, 4, lwd = 2, xlab = "x", ylab = "PDF", main = "Exponential Densities")
    > plot(f2, 0, 4, add = TRUE, lty = "dashed", col = "red", lwd = 2)
    > plot(f1, 0, 4, add = TRUE, lty = "dotted", col = "green", lwd = 2)
    > legend(c(3, 4), c(2.4, 2.9), lty = c("solid", "dashed", "dotted"),
    +     legend = c("lambda = 3", "lambda = 2", "lambda = 1"), cex = 0.8,
    +     lwd = 2)

[Figure: "Exponential Densities"; PDF against x on (0, 4) for lambda = 3 (solid), lambda = 2 (dashed) and lambda = 1 (dotted).]

2. To prove the stochastic convergence of the sequence to zero, we have to show that, for any positive number $\epsilon$,

$$\lim_{n \to \infty} P(|X_n| \ge \epsilon) = 0.$$

Knowledge of the probability distribution of $X_n$ allows us to compute an explicit expression for the probability of the basic event, $P(|X_n| \ge \epsilon)$. Since $X_n$ is non-negative,

$$P(|X_n| \ge \epsilon) = P(X_n \ge \epsilon) = 1 - F_{X_n}(\epsilon) = \exp(-n\epsilon),$$

and the limit of the last expression when $n \to \infty$ is zero, because of the properties of the exponential function.

3. $P(|X_n| \ge 0.1) = P(X_n \ge 0.1) = 1 - F_{X_n}(0.1) = \exp(-0.1 n) = 0.01 \Leftrightarrow n = -10 \ln 0.01 \simeq 46.$
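The closed-form tail probability used in parts 2 and 3 can be cross-checked against R's exponential cdf. The sketch below is a minimal illustration, not part of the original handout: `tail_prob`, `eps`, `n_exact` and `n_grid` are names introduced here, and the grid of values of n is an arbitrary choice.

```r
# Tail probability P(X_n >= eps) for X_n ~ Exp(rate = n).
# tail_prob is an illustrative helper; analytically it equals exp(-n * eps).
tail_prob <- function(n, eps) pexp(eps, rate = n, lower.tail = FALSE)

eps <- 0.1

# Real-valued solution of exp(-n * eps) = 0.01, then the nearest integer
n_exact <- -log(0.01) / eps
print(n_exact)
print(round(n_exact))

# Tail probabilities at the integers on either side of the exact solution
print(tail_prob(c(46, 47), eps))

# The tail probability decreases towards zero as n grows (arbitrary grid)
n_grid <- c(1, 10, 50, 100, 500)
print(data.frame(n = n_grid, tail = tail_prob(n_grid, eps)))
```

Because the equation has no integer solution, the value at the rounded n is only approximately 0.01, which is consistent with the "approximately equal" sign in the statement of part 3.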