2. The Sample Mean and the Law of Large Numbers
The Sample Mean
As usual, we start with a random experiment that has a sample space and a probability measure P.
Suppose that X is a real-valued random variable. We will denote the mean and standard deviation of
X by µ and σ respectively.
Now suppose we perform independent replications of the basic experiment. This defines a new,
compound experiment with a sequence of independent random variables, each with the same
distribution as X:
X = (X1, X2, ...)
Recall that in statistical terms, (X1, X2, ..., Xn) is a random sample of size n from the distribution of X
for each n. The sample mean is simply the average of the variables in the sample:
Mn ≡ Mn(X) = (X1 + X2 + ··· + Xn) / n.
The sample mean is a real-valued function of the random sample and thus is a statistic. Like any
statistic, the sample mean is itself a random variable with a distribution, mean, and variance of its
own. Many times, the distribution mean is unknown and the sample mean is used as an estimator of
the distribution mean.
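To make the definition concrete, here is a minimal Python sketch (not part of the original applets) that draws random samples and computes Mn as an estimate of µ; the exponential distribution and the sample sizes are illustrative assumptions.

import random

def sample_mean(sample):
    # Mn = (X1 + X2 + ... + Xn) / n
    return sum(sample) / len(sample)

# Illustrative assumption: X is exponential with rate 1, so µ = 1.
random.seed(1)
for n in (10, 100, 1000, 10000):
    sample = [random.expovariate(1.0) for _ in range(n)]
    print(n, sample_mean(sample))

As n grows, the printed estimates settle near the true mean µ = 1, previewing the laws of large numbers discussed below.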
1. In the dice experiment, select the average random variable. For each die distribution, start with
n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note the shape and
location of the density function at each stage. With 20 dice, run the simulation 1000 times with an
update frequency of 10. Note the apparent convergence of the empirical density function to the true
density function.
Properties of the Sample Mean
2. Show that E(Mn) = µ.
Exercise 2 shows that Mn is an unbiased estimator of µ. Therefore, the variance of the sample mean is the mean square error when the sample mean is used as an estimator of the distribution mean.
3. Show that var(Mn) = σ² / n.
From Exercise 3, the variance of the sample mean is an increasing function of the distribution
variance and a decreasing function of the sample size. Both of these make intuitive sense if we think
of the sample mean as an estimator of the distribution mean.
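For reference, a sketch of the derivations behind Exercises 2 and 3, using only linearity of expectation and, for the variance, the independence of the Xi:

E(M_n) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{n\mu}{n} = \mu,
\qquad
\mathrm{var}(M_n) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.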
4. In the dice experiment, select the average random variable. For each die distribution, start with
n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note that the mean of
the sample mean stays the same, but the standard deviation of the sample mean decreases (as we now
know, in inverse proportion to the square root of the sample size). Run the simulation 1000 times,
updating every 10 runs. Note the apparent convergence of the empirical moments of the sample mean to the true moments.
5. Compute the sample mean of the petal width variable for the following cases in Fisher's iris
data. Compare the results.
a. All cases
b. Setosa only
c. Versicolor only
d. Virginica only
The Weak Law of Large Numbers
By Exercise 3, note that var(Mn) → 0 as n → ∞. This means that Mn → µ as n → ∞ in mean
square.
6. Use Chebyshev's inequality to show that
P[|Mn − µ| > r] → 0 as n → ∞ for any r > 0.
This result is known as the weak law of large numbers, and states that the sample mean converges
to the mean of the distribution in probability. Recall that in general, convergence in mean square
implies convergence in probability.
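A sketch of the Chebyshev step in Exercise 6: since Mn has mean µ and variance σ²/n by Exercises 2 and 3,

P(|M_n - \mu| > r) \le \frac{\mathrm{var}(M_n)}{r^2} = \frac{\sigma^2}{n r^2} \to 0 \text{ as } n \to \infty.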
The Strong Law of Large Numbers
The strong law of large numbers states that the sample mean Mn converges to the distribution
mean µ with probability 1:
P(Mn → µ as n → ∞) = 1.
As the name suggests, this is a much stronger result than the weak law. We will construct a fairly
simple proof under the assumption that the fourth central moment is finite:
b4 = E[(X − µ)⁴] < ∞.
However, there are better proofs that do not need this assumption; see, for example, the book
Probability and Measure by Patrick Billingsley.
7. Let Yi = Xi − µ and let Wn = Y1 + Y2 + ··· + Yn. Show that
a. Y1, Y2, ..., Yn are independent and identically distributed.
b. E(Yi) = 0.
c. E(Yi²) = σ².
d. E(Yi⁴) = b4.
e. Mn → µ as n → ∞ if and only if Wn / n → 0 as n → ∞.
By part (e) of Exercise 7, it suffices to show that, with probability 1, Wn / n → 0 as n → ∞.
8. Show that Wn / n does not converge to 0 if and only if there exists a rational number r > 0 such that |Wn / n| > r for infinitely many n.
Thus, we need to show that the event described in Exercise 8 has probability 0.
9. Show that Wn⁴ is the sum of YiYjYkYl over all i, j, k, l in {1, 2, ..., n}.
10. Show that
a. E(YiYjYkYl) = 0 if one index differs from the other three.
b. E(Yi²Yj²) = σ⁴ if i and j are distinct, and there are 3n(n − 1) such terms in E(Wn⁴).
c. E(Yi⁴) = b4, and there are n such terms in E(Wn⁴).
11. Use the results in Exercise 10 to show that E(Wn⁴) ≤ Cn² for some constant C (independent of n).
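A sketch of the computation behind Exercises 9, 10, and 11: expanding Wn⁴ and keeping only the terms with nonzero expectation,

E(W_n^4) = \sum_{i=1}^{n} E(Y_i^4) + 3 \sum_{i \ne j} E(Y_i^2 Y_j^2) = n b_4 + 3n(n-1)\sigma^4 \le (b_4 + 3\sigma^4) n^2,

so the constant C = b4 + 3σ⁴ works.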
12. Use Markov's inequality and the result of Exercise 11 to show that for r > 0,
P(|Wn / n| > r) = P(Wn⁴ > r⁴n⁴) ≤ C / (r⁴n²).
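The first equality holds because |Wn / n| > r if and only if Wn⁴ > r⁴n⁴; the inequality is Markov's inequality applied to the nonnegative variable Wn⁴, together with the bound from Exercise 11:

P(W_n^4 > r^4 n^4) \le \frac{E(W_n^4)}{r^4 n^4} \le \frac{C n^2}{r^4 n^4} = \frac{C}{r^4 n^2}.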
13. Use the first Borel-Cantelli lemma to show that
P(|Wn / n| > r for infinitely many n) = 0.
14. Finally, show that
P(there exists rational r > 0 such that |Wn / n| > r for infinitely many n) = 0.
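A sketch of the final steps: for fixed rational r > 0, the bound in Exercise 12 gives a convergent series, so the first Borel-Cantelli lemma applies; the event in Exercise 14 is then a countable union of null events:

\sum_{n=1}^{\infty} P(|W_n/n| > r) \le \sum_{n=1}^{\infty} \frac{C}{r^4 n^2} < \infty,
\qquad
P\Big(\bigcup_{r \in \mathbb{Q},\, r > 0} \{|W_n/n| > r \text{ for infinitely many } n\}\Big) = 0.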
Simulation Exercises
15. In the dice experiment, select the average random variable. For each die distribution, start with
n = 1 die and increase the number of dice by one until you get to n = 20 dice. Note how the
distribution of the sample mean begins to resemble a point mass distribution. Run the simulation
1000 times, updating every 10 runs. Note the apparent convergence of the empirical density of the
sample mean to the true density.
Many of the applets in this project are simulations of experiments with a basic random variable of
interest. When you run the simulation, you are performing independent replications of the
experiment. In most cases, the applet displays the mean of the distribution numerically in a table and
graphically as the center of the blue horizontal bar in the graph box. When you run the simulation,
the sample mean is also displayed numerically in the table and graphically as the center of the red
horizontal bar in the graph box.
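For readers without the applets, a minimal Python sketch of the same comparison: simulate independent replications and track the running sample mean against the known distribution mean. The coin parameters (number of heads in 10 tosses of a fair coin, so the distribution mean is 5) are illustrative assumptions echoing Exercise 16.

import random

random.seed(1)
n_tosses, p = 10, 0.5  # illustrative assumptions: 10 fair tosses, so µ = 5
total = 0.0
for run in range(1, 1001):
    heads = sum(1 for _ in range(n_tosses) if random.random() < p)
    total += heads
    if run % 100 == 0:
        # running sample mean after `run` replications of the experiment
        print(run, total / run)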
16. In the simulation of the binomial coin experiment, the random variable is the number of heads.
Run the simulation 1000 times, updating every 10 runs, and note the apparent convergence of the
sample mean to the distribution mean.
17. In the simulation of the matching experiment, the random variable is the number of matches.
Run the simulation 1000 times, updating every 10 runs, and note the apparent convergence of the
sample mean to the distribution mean.
18. Run the simulation of the exponential experiment 1000 times with an update frequency of 10.
Note the apparent convergence of the sample mean to the distribution mean.