STATISTICAL LABORATORY, May 14th, 2010
EXPECTATIONS AND WEAK LAW OF LARGE NUMBERS
Mario Romanazzi
1 EXPECTATION
Ex1 Find E(1/(X + 1)), where X is a Poisson random variable (Rice, 4.20).
Solution.
Note that E(1/(X + 1)) ≠ 1/E(X + 1) = 1/(E(X) + 1) (why?). The general theorem
on the expectation of a transformation must be used:
E(1/(X + 1)) = Σ_{x=0}^∞ [1/(x + 1)] · [λ^x/x!] · exp(−λ)
             = (1/λ) Σ_{x=0}^∞ [λ^(x+1)/(x + 1)!] · exp(−λ) = (1 − exp(−λ))/λ.
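As a quick check, the closed form can be compared with a direct evaluation of the series in R, truncated at a large x; the value λ = 2 and the truncation point 100 are arbitrary choices.
> # Check of E(1/(X + 1)) for an arbitrary lambda; the sum is truncated at x = 100.
> lambda <- 2
> sum(dpois(0:100, lambda) / (0:100 + 1))   # direct evaluation of the series
> (1 - exp(-lambda)) / lambda               # closed-form result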
Ex2 A random square has a side length that is a uniform R(0, 1) random variable. Find
the expected area of the square (Rice, 4.21).
Solution.
1. Write L for the length of the side and A for the area of the square. Then
A = L^2 and E(A) = E(L^2). From the variance formula
Var(L) = E(L^2) − (E(L))^2
it follows
E(L^2) = Var(L) + (E(L))^2 = 1/12 + (1/2)^2 = 1/3.
Here E(A) = E(L^2) > (E(L))^2, equivariance does not hold. Why?
2. Lengthy. First derive the probability distribution of A = L^2 and then use the
definition of expectation. Note that for 0 ≤ a ≤ 1, the cdf of A is
FA(a) = P(L^2 ≤ a) = P(L ≤ √a) = √a.
3. What is the median of A?
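A small Monte Carlo sketch in R can be used to check the expectation (the sample size and seed are arbitrary); the empirical median also hints at the answer to item 3.
> # Simulate side lengths L ~ R(0, 1) and look at the areas A = L^2.
> set.seed(1)
> a <- runif(1e5)^2
> mean(a)      # should be close to E(A) = 1/3
> median(a)    # compare with the answer to item 3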
Ex3 A random rectangle has sides whose lengths are independent uniform random variables. Find the expected area of the rectangle (Rice, 4.22).
Solution.
1. Write L1 and L2 for the lengths of the sides and assume L1 ∼ R(0, a1), L2 ∼
R(0, a2). The expected area of the rectangle is E(L1 L2). From the covariance
formula
Cov(L1, L2) = E(L1 L2) − E(L1)E(L2)
it follows
E(L1 L2) = Cov(L1, L2) + E(L1)E(L2) = E(L1)E(L2) = (1/4) a1 a2,
because stochastic independence implies linear independence, that is,
Cov(L1, L2) = 0. If a1 = a2 = 1, E(L1 L2) = 1/4 < E(Li^2).
2. As previously, an alternative solution is to derive the probability distribution
of A = L1 L2 and then use the definition of expectation. Some care is needed
with integration limits.
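A similar simulation sketch checks the result; the side ranges a1 = 2 and a2 = 3, the sample size and the seed are arbitrary choices.
> # Independent sides L1 ~ R(0, a1), L2 ~ R(0, a2); the mean area should be a1*a2/4.
> set.seed(1)
> a1 <- 2; a2 <- 3
> mean(runif(1e5, 0, a1) * runif(1e5, 0, a2))   # compare with a1 * a2 / 4 = 1.5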
Ex4 Let X, Y and Z be uncorrelated random variables with variances σX^2, σY^2 and σZ^2.
Let U = Z + X, V = Z + Y. Find Cov(U, V), Corr(U, V) (Rice, 4.54).
Solution. A little algebra.
Cov(U, V) = E(UV) − E(U)E(V)
= E((Z + X)(Z + Y)) − E(Z + X)E(Z + Y)
= E(Z^2 + ZY + XZ + XY) − (E(Z) + E(X))(E(Z) + E(Y))
= Var(Z) + Cov(Z, Y) + Cov(X, Z) + Cov(X, Y) = Var(Z) = σZ^2,
because the variables are uncorrelated. Since Var(U) = σZ^2 + σX^2 and Var(V) = σZ^2 + σY^2,
Corr(U, V) = σZ^2 / ((σZ^2 + σX^2)(σZ^2 + σY^2))^(1/2).
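The result Cov(U, V) = σZ^2 can also be checked by simulation. The sketch below uses normal variables, which is only one convenient choice; the standard deviations, sample size and seed are arbitrary.
> # Uncorrelated (here independent) X, Y, Z; compare the sample covariance and
> # correlation of U = Z + X, V = Z + Y with the theoretical values.
> set.seed(1)
> sx <- 1; sy <- 2; sz <- 3
> x <- rnorm(1e5, sd = sx); y <- rnorm(1e5, sd = sy); z <- rnorm(1e5, sd = sz)
> u <- z + x; v <- z + y
> cov(u, v)   # should be close to sz^2 = 9
> cor(u, v)   # close to sz^2 / sqrt((sz^2 + sx^2) * (sz^2 + sy^2))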
Ex5 Suppose there are n securities, each with the same expected return, that all the
returns have the same standard deviation and that the returns are uncorrelated.
1. What is the optimal portfolio vector? 2. Plot the risk of the optimal portfolio
versus n. How does this risk compare to that incurred by putting all your money in
one security? (Rice, 4.51).
Solution.
1. We write Xi, i = 1, ..., n, for the returns of the n securities, and µ and σ for
the common expectation and standard deviation. We also write αi, i = 1, ..., n,
for the portfolio composition (αi ≥ 0, Σ_{i=1}^n αi = 1) and X(α) = Σ_{i=1}^n αi Xi
for the corresponding return. Hence X(α) is a linear combination (more precisely,
a weighted mean) of uncorrelated random variables with the same expectation
and the same standard deviation. Using basic results on linear combinations,
E(X(α)) = Σ_{i=1}^n αi µ = µ Σ_{i=1}^n αi = µ,
Var(X(α)) = Σ_{i=1}^n αi^2 σ^2 = σ^2 Σ_{i=1}^n αi^2,
hence the expectation of the portfolio does not depend on the weights αi,
whereas the risk SD(X(α)) is a function of the weights:
SD(X(α)) = σ (Σ_{i=1}^n αi^2)^(1/2).
The optimal portfolio has minimum risk, or variance. It can be shown that
this minimum is attained only by the equal-weight portfolio αi* = 1/n, i = 1, ..., n.
A proof is obtained by looking for the minimum of the function
g(α1, ..., αn) = Σ_{i=1}^n αi^2, subject to the constraint Σ_{i=1}^n αi = 1.
The corresponding risk is
SD(X(α*)) = σ/√n
(a numerical check follows the R code below).
Putting all the money on just one security (e.g., the i-th one) corresponds to
choosing αi = 1 and αj = 0, j ≠ i. The corresponding risk is σ.
2. The plot is obtained through the following R functions.
> sigma <- 0.1
> plot(1:20, sigma/sqrt(1:20), type = "b", pch = 20, xlab = "Number of securities",
+     ylab = "Minimum Risk")
Ex6 Consider two securities, the first having µ1 = 1 and σ1 = 0.1 and the second having
µ2 = 0.8 and σ2 = 0.12. Suppose that they are negatively correlated, with ρ = −0.8.
1) If you could only invest in one security, which one would you choose, and why?
2) Suppose you invest 50% of your money in each of the two. What is your expected
return and what is your risk? If you invest 80% of your money in security 1 and 20%
in security 2, what is your expected return and your risk? 3) Denote the expected
return and its standard deviation as functions of α by µ(α) and σ(α). The pair
(µ(α), σ(α)) traces out a curve in the plane as α varies from 0 to 1. Plot this curve.
(Rice, 4.52)
Solution.
1. According to the usual criterion (maximize expected return, minimize risk),
the first security should be chosen: it has both the higher expected return and
the lower standard deviation.
[Figure (Ex5): minimum portfolio risk σ/√n plotted against the number of securities, n = 1, ..., 20, with σ = 0.1.]
2. We write X1 and X2 for the random variables describing the returns from the
two securities. The total return is the linear combination X = 0.5X1 + 0.5X2 .
Therefore
E(X) = 0.5E(X1) + 0.5E(X2) = 0.9,
Var(X) = 0.25Var(X1) + 0.25Var(X2) + 2 · 0.5 · 0.5 · Cov(X1, X2) = 0.0013,
where Cov(X1, X2) = ρσ1σ2 = −0.0096. The risk is SD(X) = (0.0013)^(1/2) ≈ 0.0361.
With portfolio proportions α1 = 0.8, α2 = 1 − 0.8 = 0.2, the results are
E(X) = 0.8E(X1) + 0.2E(X2) = 0.96,
Var(X) = 0.64Var(X1) + 0.04Var(X2) + 2 · 0.8 · 0.2 · Cov(X1, X2) = 0.003904,
therefore this choice produces a higher expected return as well as a higher
risk, SD(X) ≈ 0.0625 (these values are checked numerically after item 3).
3. The R script below defines the functions µ(α) and σ(α) and plots them.
> mu1 <- 1
> sigma1 <- 0.1
> mu2 <- 0.8
> sigma2 <- 0.12
> rho <- -0.8
> mu <- function(alpha) alpha * mu1 + (1 - alpha) * mu2
> var <- function(alpha) alpha^2 * sigma1^2 + (1 - alpha)^2 * sigma2^2 +
+     2 * alpha * (1 - alpha) * rho * sigma1 * sigma2
> risk <- function(alpha) sqrt(var(alpha))
> plot(risk, 0, 1, xlim = c(0, 1), ylim = c(0, 1.3), xlab = "Proportion of 1st Security, alpha",
+     ylab = "Expected Return, Risk", lty = "dashed", lwd = 2)
> plot(mu, 0, 1, add = TRUE, lwd = 2)
> legend(c(0, 0.35), c(1, 1.2), lty = c("solid", "dashed"), legend = c("Expected Return",
+     "Risk"), cex = 0.8, lwd = 2)
> mu_v <- sapply(seq(0, 1, 0.001), mu)
> risk_v <- sapply(seq(0, 1, 0.001), risk)
> plot(mu_v, risk_v, pch = 20, xlab = "Expected Return", ylab = "Risk")
> points(c(mu(0.5), mu(0.8)), c(risk(0.5), risk(0.8)), col = c("red", "blue"),
+     pch = c("*", "+"), cex = 2)
> rho <- 0
> mu_v <- sapply(seq(0, 1, 0.001), mu)
> risk_v <- sapply(seq(0, 1, 0.001), risk)
> points(mu_v, risk_v, type = "l", pch = 20, lty = "dashed", lwd = 2)
> points(c(mu(0.5), mu(0.8)), c(risk(0.5), risk(0.8)), col = c("red", "blue"),
+     pch = c("*", "+"), lty = "dashed", cex = 2)
> legend(c(0.95, 1), c(0.11, 0.12), lty = c("solid", "dashed"),
+     legend = c("rho = -0.8", "rho = 0"), cex = 0.8, lwd = 2)
> legend(c(0.85, 0.9), c(0.11, 0.12), pch = c("*", "+"),
+     legend = c("alpha = 0.5", "alpha = 0.8"), cex = 0.8)
[Figure: two plots. Left: µ(α) (solid) and σ(α) (dashed) against the proportion α of the first security. Right: risk against expected return for ρ = −0.8 (solid) and ρ = 0 (dashed), with the portfolios α = 0.5 (*) and α = 0.8 (+) marked.]
The plot on the left shows the behaviour of µ(α) and σ(α) as functions of
α: µ(α) is increasing (why?) whereas σ(α) has a unique minimizer (where?).
The second plot adds information about the role of the linear correlation: for
example, ρ = 0 (corresponding to the dashed line) produces a higher risk than
ρ = −0.8 for every 0 < α < 1.
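The hand computations in item 2 can be checked with the functions mu() and risk() defined in the script above; note that the script ends with rho set to 0, so rho must be reset to −0.8 first.
> # Check of item 2, assuming the script above has been run.
> rho <- -0.8
> c(mu(0.5), risk(0.5))   # expected return 0.9, risk about 0.036
> c(mu(0.8), risk(0.8))   # expected return 0.96, risk about 0.062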
Ex7 A fair coin is tossed three times and the following random variables are observed:
X : number of heads in the first two tosses, Y : total number of heads. The joint
probability function is given below. 1) Derive the marginal distributions and their
expectations and standard deviations. 2) Are X and Y stochastically independent?
Linearly independent? 3) What is the joint cdf value at the point x = 1, y = 2?
What is the probability of the events X = Y , X > Y ? 4) Compute the conditional
distributions Y |X = 0, X|Y = 3.
                    Total number of heads, Y
Number of heads        0     1     2     3
first 2 tosses, X
        0             1/8   1/8    0     0
        1              0    2/8   2/8    0
        2              0     0    1/8   1/8
Solution.
1. Marginal distributions are derived by summing up the joint probabilities.
                    Total number of heads, Y
Number of heads        0     1     2     3   Marginal, X
first 2 tosses, X
        0             1/8   1/8    0     0      2/8
        1              0    2/8   2/8    0      4/8
        2              0     0    1/8   1/8     2/8
Marginal, Y           1/8   3/8   3/8   1/8      1
Expectations are µX = 1, µY = 3/2, by symmetry (the results can also be
derived from the general formula np, because X and Y have a binomial
distribution). Variances are
Var(X) = E((X − µX)^2) = E(X^2) − µX^2 = 1/2 + 1 − 1 = 1/2,
Var(Y) = E((Y − µY)^2) = E(Y^2) − µY^2 = 3/8 + 12/8 + 9/8 − 9/4 = 3/4.
Again, the results agree with the binomial formula np(1 − p).
2. X and Y are stochastically dependent. No computation is needed: six cells of
the bivariate distribution have probability 0 while the corresponding marginal
probabilities are positive, so the factorization criterion fails. To check linear
independence the covariance is required (note that stochastic dependence does
not imply linear dependence):
Cov(X, Y) ≡ σX,Y = E((X − µX)(Y − µY)) = E(XY) − µXµY
= 2/8 + 4/8 + 4/8 + 6/8 − 3/2 = 1/2 > 0,
hence the variables are positively correlated. To better evaluate the strength
of the linear relationship we compute the correlation coefficient.
Corr(X, Y) ≡ ρX,Y = σX,Y / (σXσY) = (1/2) / ((√2/2)(√3/2)) = √6/3 ≈ 0.816.
3.
FX,Y(1, 2) = P(X ≤ 1 ∩ Y ≤ 2) = P({(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)})
= 1/8 + 1/8 + 0 + 0 + 2/8 + 2/8 = 3/4,
P(X = Y) = P({(0, 0), (1, 1), (2, 2)}) = 1/8 + 2/8 + 1/8 = 1/2.
The event X > Y is impossible (check definition of X and Y ) and its probability
is zero.
4. The conditional distributions are obtained from the definition.
fY|X=0(y) = fX,Y(0, y) / fX(0),
fX|Y=3(x) = fX,Y(x, 3) / fY(3).
It is clear that X|Y = 3 is a degenerate distribution, with the probability
concentrated at x = 2. The conditional distributions Y |X = x are given
below.
              Values of Y
             0     1     2     3
fY|X=0      1/2   1/2    0     0
fY|X=1       0    1/2   1/2    0
fY|X=2       0     0    1/2   1/2
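The joint table and the correlation can be checked by simulating the three tosses in R; the number of replications and the seed are arbitrary.
> # X = heads in the first two tosses, Y = total number of heads.
> set.seed(1)
> tosses <- matrix(rbinom(3 * 1e5, 1, 0.5), ncol = 3)
> x <- rowSums(tosses[, 1:2]); y <- rowSums(tosses)
> table(x, y) / 1e5   # compare with the joint probability table
> cor(x, y)           # compare with sqrt(6)/3 = 0.816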
2 WEAK LAW OF LARGE NUMBERS
Ex1 Consider the sequence of random variables {Xn, n = 1, 2, ...}, where for each n, Xn
has an exponential distribution with parameter λ = n. 1) Plot the density functions
of Xn for n = 1, 2, 3. 2) Prove that the sequence converges to zero in probability.
3) Find n such that P(|Xn| ≥ 0.1) ≈ 0.01.
Solution.
1. The exponential density is monotone decreasing and becomes steeper as the
parameter λ = n increases. Hence, as n grows, the distribution becomes more
and more concentrated in a (right) neighbourhood of zero. The density plot is
easily obtained with R.
> f1 <- function(x) dexp(x, rate = 1)
> f2 <- function(x) dexp(x, rate = 2)
> f3 <- function(x) dexp(x, rate = 3)
> plot(f3, 0, 4, lwd = 2, xlab = "x", ylab = "PDF", main = "Exponential Densities")
> plot(f2, 0, 4, add = TRUE, lty = "dashed", col = "red", lwd = 2)
> plot(f1, 0, 4, add = TRUE, lty = "dotted", col = "green", lwd = 2)
> legend(c(3, 4), c(2.4, 2.9), lty = c("solid", "dashed", "dotted"),
+     legend = c("lambda = 3", "lambda = 2", "lambda = 1"), cex = 0.8,
+     lwd = 2)
[Figure: exponential densities for λ = 1 (dotted), λ = 2 (dashed) and λ = 3 (solid) on 0 ≤ x ≤ 4.]
2. To prove the stochastic convergence of the sequence to zero, we have to show
that, for any positive number ε,
lim_{n→∞} P(|Xn| ≥ ε) = 0.
Knowledge of the probability distribution of Xn gives an explicit expression for
the probability of the basic event:
P(|Xn| ≥ ε) = P(Xn ≥ ε) = 1 − FXn(ε) = exp(−nε),
and the limit of the last expression when n → ∞ is zero, because of the
properties of the exponential function.
3.
P(|Xn| ≥ 0.1) = P(Xn ≥ 0.1) = 1 − FXn(0.1) = exp(−0.1n) = 0.01
⇔ n = −10 ln 0.01 ≈ 46.
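A quick numerical illustration of items 2 and 3; the values of n below are arbitrary.
> # P(|X_n| >= 0.1) = exp(-0.1 * n) decreases to zero, and n = 46 gives about 0.01.
> n <- c(1, 10, 46, 100)
> 1 - pexp(0.1, rate = n)   # equals exp(-0.1 * n)
> exp(-0.1 * 46)            # about 0.01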