LUNDS TEKNISKA HÖGSKOLA
MATEMATIKCENTRUM
MATEMATISK STATISTIK

DATORLABORATION 4
MATEMATISK STATISTIK, AK FÖR I, FMS 120, HT-00
Laboration 4:
The Law of Large Numbers, The Central Limit Theorem, and simple point
estimates
The goal of this laboration is to make you more familiar with the following important areas in mathematical statistics:
• The Law of Large Numbers
• The Central Limit Theorem
• Point estimates
1 Preparatory exercises
As preparation for the laboration you should read through Chapters 7 and 8 (Blom: Bok A) and
Chapter 20 (Blom: Bok B) or, if you use a different book, the corresponding chapters in that book. These
should cover Expectation (Ch. 2.4–5 in Ross), Covariance and Variance of Sums of Random Variables
(Ch. 2.7 in Ross), Chebyshev's Inequality and the Weak Law of Large Numbers (Ch. 2.9 in Ross), Normal
Random Variables and the Central Limit Theorem (Ch. 3.6 in Ross), Maximum Likelihood Estimators
(Ch. 5.3 in Ross), and Least Squares Estimators (Ch. 5.1? in Ross). You should also read the entire
laboration assignment and review the appendix of assignment 3 about Wiener processes and Geometrical
Brownian Motion.
Bring your solutions to the following exercises (a)–(e) to the start of the laboration:
(a) Give a short explanation of “The Law of Large Numbers”.
(b) Give a short explanation of “The Central Limit Theorem”.
(c) Let X be the number of pips on a thrown die, with pX(k) = 1/6 for k = 1, 2, 3, 4, 5, 6.
Which distribution does the sum of n independent throws approximately have when n is large?
(d) We have observations x1, x2, . . . , xn which are independent and Exp(a)-distributed. Derive the Maximum Likelihood and Least Squares (minsta kvadrat) estimates of a.
(e) How does one estimate the expectation and standard deviation of a normal distribution with the
help of a random sample x1, . . . , xn?
2 The Law of Large Numbers
The Law of Large Numbers says that if X̄n is the mean of n independent, identically distributed random
variables X1, . . . , Xn with finite variance, then

P(|X̄n − mX| > ε) → 0 as n → ∞

for all ε > 0, which can also be expressed as X̄n → mX in probability. In other words, the mean of n
variables will differ less and less from the expectation as n grows. One way to illustrate this is to throw
a die many times and see that the successive means converge to the expectation.
First we simulate 100 throws of a die:
>> help unidrnd
>> X=unidrnd(6,100,1)
One way to calculate the successive means is the following:
>> Xbar=cumsum(X)./(1:100)’
The function cumsum gives a vector where element i is the sum of the first i elements in X. The notation ./
means elementwise division and (1:100)’ is a column vector with the numbers 1 through 100. Convince
yourself that Xbar contains the successive means. Plot them:
>> plot(1:100,Xbar)
Repeat the whole thing using more dice throws, e.g. 1000. Does everything look like you would expect?
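One way to do this (a sketch; the reference line at the expectation 3.5 is an addition for comparison, not part of the original instructions):
>> X=unidrnd(6,1000,1);
>> Xbar=cumsum(X)./(1:1000)';
>> plot(1:1000,Xbar)
>> hold on
>> plot([1 1000],[3.5 3.5])   % the expectation E(X)=3.5 of one throw
>> hold off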
Answer: . . .
3 The Central Limit Theorem
Start by inventing a discrete probability function with a few possible outcomes, e.g. the uniform distribution over 1 through 6, i.e. a die throw. Then put this probability function into a vector.
>> p=[0 1 1 1 1 1 1]/6
The zero is there because things get easier if the first element in the vector is the probability that the
outcome is 0. Please feel free to choose some other probability function if you like. Plot the probability
function with the command bar.
>> bar(0:length(p)-1,p)
The function length gives the length of a vector.
As you know, the probability function of a sum of two independent variables can be calculated using
discrete convolution (faltning). In MATLAB there is a function, conv, that performs such a convolution.
>> p2=conv(p,p)
>> p4=conv(p2,p2)
>> p8=conv(p4,p4)
Here p8 is the probability function for the sum of 8 independent random variables with probability function p. Plot these new probability functions. When does it start to look like a normal distribution?
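One way to view them together (a sketch using subplot, which splits the figure into panels):
>> subplot(3,1,1), bar(0:length(p2)-1,p2)
>> subplot(3,1,2), bar(0:length(p4)-1,p4)
>> subplot(3,1,3), bar(0:length(p8)-1,p8)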
Answer: . . .
Now, calculate the expectation and standard deviation for a random variable with probability function p:
>> m=sum((0:6).*p)
>> sigma=sqrt(sum(((0:6)-m).^2.*p))
The function sum gives the sum of the elements in a vector, the notation .^2 means elementwise squares
of a vector, and sqrt is the square root. We can now compare the probability function p4 with the
approximate normal distribution N(nm, σ√n) (where n = 4) that we derive using the Central Limit
Theorem.
>> bar(0:length(p4)-1,p4)
>> hold on
>> xx=0:0.5:30;
>> plot(xx,normpdf(xx,4*m,sqrt(4)*sigma))
>> hold off
The command hold on causes further plots to be drawn in the same picture as the old ones. Is p4 well
approximated by the normal distribution?
Answer: . . .
Also examine what happens if p is a very skewed distribution, e.g.
>> p=[0 1 0 0 0 0 5]/6
How many components are now needed in the sum in order for the distribution to be well approximated
by a normal distribution?
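After redefining p the convolutions have to be recomputed; a sketch that continues doubling the number of terms:
>> p2=conv(p,p); p4=conv(p2,p2); p8=conv(p4,p4);
>> p16=conv(p8,p8); p32=conv(p16,p16);
>> bar(0:length(p32)-1,p32)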
Answer: . . .
4 Point estimates
4.1 ML- and LS-estimates
We will now take a closer look at two of the most common estimation methods in statistics, namely
Maximum Likelihood (ML) and Least Squares (LS or MK, minsta kvadrat). We will, among other things, see that ML estimation is a maximization problem while LS estimation can be seen as a minimization problem.
In the file matdata.dat we have 150 measurements of the lifetime (unit: hours) of a certain component
in a car. The lifetime of each component is assumed to be independent of the lifetimes of all the others. Load the
data and do a first examination of the lifetimes.
>> load matdata.dat
>> plot(matdata,’*’)
>> hist(matdata)
We are interested in estimating the mean lifetime of the components. One way to do this is to
calculate the ML-estimate of a. To do this we need some idea of what distribution
the data follow. Earlier experiments have shown that the lifetime of this type of component is
approximately exponentially distributed. Hence, we assume that the lifetime is exponentially distributed
and write down the log-likelihood function. What does it look like?
Answer: l(a) = ln L(a) = . . .
There is a specially written m-file, ML_exp, that calculates l(a). Study its MATLAB-commands and make
sure that it really gives the right function! (type ML_exp).
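For comparison while you read the file, here is a plausible sketch of what such a function might compute, assuming Exp(a) is parametrized by its mean a, i.e. density f(x) = (1/a)e^(−x/a) (check this against the actual ML_exp):

function l = ML_exp(a,x)
% Log-likelihood l(a) = -n*ln(a) - sum(x)/a for an Exp(a) sample x
% (parametrized by the mean a), evaluated elementwise for a vector a.
n=length(x);
l=-n*log(a)-sum(x)./a;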
Plot l(a) for 30 ≤ a ≤ 150.
>> a=[30:.5:150];
>> l=ML_exp(a,matdata);
>> plot(a,l)
>> grid
What does the function look like and which value of a corresponds to the ML-estimate? (You can use the
command zoom to enlarge parts of the plot.)
Answer: . . .
We will now look at an LS-estimate of the mean lifetime. The advantage of the LS method compared to
the ML method is that with LS the distribution of the data does not have to be known. Start by writing
down the loss-function, Q(a).
Answer: Q(a) = . . .
The program MK_exp is specially written to calculate Q(a). Look at the MATLAB-commands to check that
it is correct! Plot Q(a).
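For reference, a plausible sketch of what MK_exp might compute, assuming the loss Q(a) = Σ(xi − a)² that follows from E(Xi) = a (again, compare with the actual file):

function Q = MK_exp(a,x)
% Least-squares loss Q(a) = sum((x_i - a)^2), evaluated for each
% element of the vector a.
Q=zeros(size(a));
for k=1:length(a)
  Q(k)=sum((x-a(k)).^2);
end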
>> Q=MK_exp(a,matdata);
>> plot(a,Q)
>> grid
Which value of a corresponds to the LS-estimate?
Answer: . . .
Both the ML- and the LS-estimates of a are simple to calculate; see preparatory exercise (d). Calculate a∗ML
and a∗LS and compare with your plots.
Answer: . . .
Here the ML- and the LS-estimates are equal; this is not always the case.
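If your derivation in exercise (d) gives the sample mean for both estimates (as it does when Exp(a) is parametrized by its mean a, as assumed above), the value can be checked directly:
>> a_est=mean(matdata)   % both a*ML and a*LS under that parametrization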
4.2 The estimate a∗ is a random variable!
If we were to get 150 new measurements of the lifetime of the above type of component (i.e. a new
sample), the estimate of the expectation would almost certainly be different; i.e., the estimate can be seen as a
random variable. To illustrate this we imagine that we take 1000 samples with 150 measurements each.
Since we don’t have 1000 real samples we have to settle for simulated data. By using the function exprnd
we can easily generate exponentially distributed random numbers. We suppose that the true expected value
(the value that in practice is known only by God) is 100, i.e. a = 100:
>> help exprnd
>> a=100;
>> x=exprnd(a,150,1000);
Column i in the matrix x corresponds to sample i. We will now estimate a for each sample. This can be
done by
>> a_est=mean(x);
Element i in the vector a_est contains the estimate of the expected value for sample i. Plot a_est! What
does it look like?
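For example, in the same style as before:
>> plot(a_est,'*')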
Answer: . . .
Which approximate distribution will the estimate of the expectation follow? Use the commands hist
and normplot and your newly acquired knowledge of the Law of Large Numbers and the Central Limit
Theorem to answer this.
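A minimal sketch:
>> hist(a_est)       % histogram of the 1000 estimates
>> normplot(a_est)   % close to a straight line if approximately normal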
Answer: . . .
5 Estimation of the volatility
The file fondkurs.mat contains the stock-exchange rates of five mutual funds noted once a week starting
in December 1997. Data lies in X, the names of the funds in namn. We will estimate the parameters in the
Geometrical Brownian Motions X(t) = x0·e^((μ − σ²/2)t + σW(t)), where W(t) ∈ N(0, √t) (see the appendix of
lab 3). We will, in particular, be interested in the volatility, i.e. σ. Start by looking at the material:
>> load fondkurs
>> plot(X)
We will concentrate on the logarithm of the relative exchange rate, Y (t) = ln(X (t)/X (0)), which we can
calculate by dividing every element in X by the corresponding element in a matrix where each row is a
repetition of the first one:
>> Y=log(X./(ones(length(X),1)*X(1,:)))
>> plot(Y)
According to the model, Y(t) = (μ − σ²/2)t + σW(t), i.e., a linear trend plus normally distributed noise.
By calculating successive differences Zt = Y(t) − Y(t − 1) we find that Zt = μ − σ²/2 + σ(W(t) − W(t − 1)),
which should be independent and N(μ − σ²/2, σ)-distributed.
>> Z=diff(Y)
>> plot(Z)
>> hist(Z)
>> normplot(Z)
Do the values of Zt look like they are normally distributed?
Answer: . . .
If the stock-exchange had been stable, i.e. had varied by the same amount the whole time, without, e.g., the
stock-exchange crisis in the autumn of 1998, we could have estimated the trend μ − σ²/2 using the mean,
and the volatility σ using the standard deviation, of the Zt values. Do this using mean and std:
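A minimal sketch (Z has one column per fund, so mean and std work columnwise; units are per week):
>> mean(Z)   % estimates the trend mu - sigma^2/2 for each fund
>> std(Z)    % estimates the volatility sigma for each fund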
Answer: . . .
In the plots it looks like the volatility isn’t constant over time. In order to see how it varies we can subtract
the mean from Zt and square it, i.e., calculate the contribution every point in time gives to the variance
(the square of the distance from the mean), and plot the successive sums:
>> Z2=(Z-ones(length(Z),1)*mean(Z)).^2
>> plot(cumsum(Z2))
The slope of these curves gives the square of the volatility, i.e. the variation in exchange rate at each point
in time. In the plot the stock-exchange crisis is very visible.
What does it look like after the crisis, i.e. after 40 weeks? Is the volatility constant then?
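One way to read off the post-crisis level (a sketch; reuses Z2 from above with the 40-week cut-off):
>> C=cumsum(Z2);
>> n=size(C,1);
>> sqrt((C(n,:)-C(40,:))/(n-40))   % average volatility per week after week 40, per fund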
Answer: . . .
Which fund varied most during the last six months?
Answer: . . .
How can we, from the plot, get the standard deviation of Zt that we calculated earlier?
Answer: . . .