Section 10.1
Sample Mean: Expected Value and Variance
Definition 10.1 Sample Mean
For iid random variables X1, . . . , Xn with PDF fX(x), the sample mean of X is the random variable

Mn(X) = (X1 + · · · + Xn)/n.
Sample Mean ≠ Expected Value
• The first thing to notice is that Mn (X) is a function of the random variables
X1 , . . . , Xn and is therefore a random variable itself.
• It is important to distinguish the sample mean, Mn (X), from E[X], which we sometimes refer to as the mean value of random variable X.
• While Mn (X) is a random variable, E[X] is a number.
• To avoid confusion when studying the sample mean, it is advisable to refer to E[X]
as the expected value of X, rather than the mean of X.
• The sample mean of X and the expected value of X are closely related.
• A major purpose of this chapter is to explore the fact that as n increases without
bound, Mn (X) predictably approaches E[X].
• In everyday conversation, this phenomenon is often called the law of averages.
Theorem 10.1
The sample mean Mn(X) has expected value and variance

E[Mn(X)] = E[X],    Var[Mn(X)] = Var[X]/n.
Proof: Theorem 10.1
From Definition 10.1, Theorem 9.1, and the fact that E[Xi] = E[X] for all i,

E[Mn(X)] = (1/n)(E[X1] + · · · + E[Xn]) = (1/n)(E[X] + · · · + E[X]) = E[X].  (10.1)

Because Var[aY] = a² Var[Y] for any random variable Y (Theorem 3.15), Var[Mn(X)] = Var[X1 + · · · + Xn]/n². Since the Xi are iid, we can use Theorem 9.3 to show

Var[X1 + · · · + Xn] = Var[X1] + · · · + Var[Xn] = n Var[X].  (10.2)

Thus Var[Mn(X)] = n Var[X]/n² = Var[X]/n.
Comment: Theorem 10.1
• Theorem 10.1 demonstrates that as n increases without bound, the variance of
Mn (X) goes to zero.
• When we first met the variance, and its square root the standard deviation, we said
that they indicate how far a random variable is likely to be from its expected value.
• Theorem 10.1 suggests that as n approaches infinity, it becomes highly likely that
Mn (X) is arbitrarily close to its expected value, E[X].
• In other words, the sample mean Mn (X) converges to the expected value E[X] as
the number of samples n goes to infinity.
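As an informal check of Theorem 10.1 (a sketch, not part of the text), the following Matlab lines simulate many independent sample means and compare their empirical mean and variance with E[X] and Var[X]/n. The uniform (0, 1) model for X and the values n = 25 and 10,000 trials are illustrative assumptions only.

% Sketch (not from the text): empirical check of Theorem 10.1 for a
% uniform (0,1) X, so E[X] = 1/2 and Var[X] = 1/12.
n = 25; trials = 10000;
X = rand(n, trials);              % each column holds n iid samples of X
Mn = mean(X, 1);                  % one sample mean Mn(X) per column
fprintf('E[Mn(X)]   = %.4f (theory %.4f)\n', mean(Mn), 0.5);
fprintf('Var[Mn(X)] = %.5f (theory %.5f)\n', var(Mn), 1/(12*n));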
Quiz 10.1
X is the exponential (1) random variable; Mn(X) is the sample mean of n
independent samples of X. How many samples n are needed to guarantee
that the variance of the sample mean Mn(X) is no more than 0.01?
Quiz 10.1 Solution
An exponential random variable with expected value 1 also has variance
1. By Theorem 10.1, Mn(X) has variance Var[Mn(X)] = 1/n. Hence, we
need n = 100 samples.
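A quick Matlab check of this answer (an illustrative sketch, not from the text): with n = 100 exponential (1) samples per trial, the empirical variance of M100(X) should be close to 0.01. The choice of 20,000 trials is arbitrary.

% Sketch (not from the text): exponential (1) samples via the inverse CDF.
n = 100; trials = 20000;
X = -log(rand(n, trials));        % exponential (1) samples
Mn = mean(X, 1);                  % sample means M100(X), one per trial
fprintf('Var[M100(X)] = %.4f (requirement: at most 0.01)\n', var(Mn));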
Section 10.2
Deviation of a Random Variable from the Expected Value
Theorem 10.2
Markov Inequality
For a random variable X such that P[X < 0] = 0 and a constant c,

P[X ≥ c²] ≤ E[X]/c².
Proof: Theorem 10.2
Since X is nonnegative, fX(x) = 0 for x < 0 and

E[X] = ∫_0^(c²) x fX(x) dx + ∫_(c²)^∞ x fX(x) dx ≥ ∫_(c²)^∞ x fX(x) dx.  (10.3)

Since x ≥ c² in the remaining integral,

E[X] ≥ c² ∫_(c²)^∞ fX(x) dx = c² P[X ≥ c²].  (10.4)
Example 10.1
Let X represent the height (in feet) of a storm surge following a hurricane.
If the expected height is E[X] = 5.5, then the Markov inequality states
that an upper bound on the probability of a storm surge at least 11 feet
high is
P [X ≥ 11] ≤ 5.5/11 = 1/2.
(10.5)
Example 10.2
Suppose random variable Y takes on the value c² with probability p and the value 0 otherwise. In this case, E[Y] = pc², and the Markov inequality states

P[Y ≥ c²] ≤ E[Y]/c² = p.  (10.6)

Since P[Y ≥ c²] = p, we observe that the Markov inequality is in fact an equality in this instance.
Theorem 10.3
Chebyshev Inequality
For an arbitrary random variable Y and constant c > 0,

P[|Y − µY| ≥ c] ≤ Var[Y]/c².
Proof: Theorem 10.3
In the Markov inequality, Theorem 10.2, let X = (Y − µY)². The inequality states

P[X ≥ c²] = P[(Y − µY)² ≥ c²] ≤ E[(Y − µY)²]/c² = Var[Y]/c².  (10.7)

The theorem follows from the fact that {(Y − µY)² ≥ c²} = {|Y − µY| ≥ c}.
Example 10.3 Problem
If the height X of a storm surge following a hurricane has expected value
E[X] = 5.5 feet and standard deviation σX = 1 foot, use the Chebyshev
inequality to find an upper bound on P[X ≥ 11].
Example 10.3 Solution
Since a height X is nonnegative, the probability that X ≥ 11 can be
written as
P [X ≥ 11] = P [X − µX ≥ 11 − µX ] = P [|X − µX | ≥ 5.5] .
(10.8)
Now we use the Chebyshev inequality to obtain
P [X ≥ 11] = P [|X − µX | ≥ 5.5] ≤ Var[X]/(5.5)² = 0.033 ≈ 1/30.  (10.9)
Although this bound is better than the Markov bound, it is also loose.
P[X ≥ 11] is seven orders of magnitude lower than 1/30.
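For comparison (an illustrative sketch, not from the text), the following Matlab lines evaluate the Markov bound, the Chebyshev bound, and the exact tail probability under the Gaussian (5.5, 1) model that Example 10.4 assumes for X.

% Sketch (not from the text): Markov and Chebyshev bounds versus the
% exact Gaussian (5.5, 1) tail probability for P[X >= 11].
mu = 5.5; sigma = 1; c = 11;
markov = mu/c;                               % Theorem 10.2 with c^2 = 11
chebyshev = sigma^2/(c - mu)^2;              % Theorem 10.3 with |X - mu| >= 5.5
exact = 0.5*erfc((c - mu)/(sigma*sqrt(2)));  % Q(5.5) under the Gaussian model
fprintf('Markov %.3f, Chebyshev %.4f, exact %.2e\n', markov, chebyshev, exact);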
Theorem 10.4
Chernoff Bound
For an arbitrary random variable X and a constant c,

P[X ≥ c] ≤ min_{s≥0} e^(−sc) φX(s).
Proof: Theorem 10.4
In terms of the unit step function, u(x), we observe that

P[X ≥ c] = ∫_c^∞ fX(x) dx = ∫_(−∞)^∞ u(x − c) fX(x) dx.  (10.10)

For all s ≥ 0, u(x − c) ≤ e^(s(x−c)). This implies

P[X ≥ c] ≤ ∫_(−∞)^∞ e^(s(x−c)) fX(x) dx = e^(−sc) ∫_(−∞)^∞ e^(sx) fX(x) dx = e^(−sc) φX(s).  (10.11)

This inequality is true for any s ≥ 0. Hence the upper bound must hold when we choose s to minimize e^(−sc) φX(s).
Example 10.4 Problem
If the probability model of the height X, measured in feet, of a storm
surge following a hurricane at a certain location is the Gaussian (5.5,
1) random variable, use the Chernoff bound to find an upper bound on
P[X ≥ 11].
Example 10.4 Solution
In Table 9.1 the MGF of X is

φX(s) = e^((11s+s²)/2).  (10.12)

Thus the Chernoff bound is

P[X ≥ 11] ≤ min_{s≥0} e^(−11s) e^((11s+s²)/2) = min_{s≥0} e^((s²−11s)/2).  (10.13)

To find the minimizing s, it is sufficient to choose s to minimize h(s) = s² − 11s. Setting the derivative dh(s)/ds = 2s − 11 = 0 yields s = 5.5. Applying s = 5.5 to the bound yields

P[X ≥ 11] ≤ e^((s²−11s)/2) |_(s=5.5) = e^(−(5.5)²/2) = 2.7 × 10⁻⁷.  (10.14)
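The minimization can also be done numerically. The following Matlab sketch (not from the text) minimizes e^(−sc) φX(s) over s with fminbnd and compares the resulting Chernoff bound with the exact value Q(5.5); the search interval (0, 20) is an arbitrary choice.

% Sketch (not from the text): numerical Chernoff bound for the Gaussian (5.5, 1) X.
c = 11;
phiX = @(s) exp((11*s + s.^2)/2);            % MGF of X from Table 9.1
bound = @(s) exp(-s*c).*phiX(s);
[sStar, bStar] = fminbnd(bound, 0, 20);      % minimizing s and bound value
fprintf('s* = %.2f, Chernoff bound = %.2e\n', sStar, bStar);   % 5.50, 2.7e-7
fprintf('exact Q(5.5) = %.2e\n', 0.5*erfc(5.5/sqrt(2)));       % about 1.9e-8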
Quiz 10.2
In a subway station, there are exactly enough customers on the platform
to fill three trains. The arrival time of the nth train is X1 + · · · + Xn where
X1, X2, . . . are iid exponential random variables with E[Xi] = 2 minutes.
Let W equal the time required to serve the waiting customers. For
P[W > 20], the probability that W is over twenty minutes,
(a) Use the central limit theorem to find an estimate.
(b) Use the Markov inequality to find an upper bound.
(c) Use the Chebyshev inequality to find an upper bound.
(d) Use the Chernoff bound to find an upper bound.
(e) Use Theorem 4.11 for an exact calculation.
Quiz 10.2 Solution
The train interarrival times X1, X2, X3 are iid exponential (λ) random variables with E[Xi] = 1/λ = 2 minutes, so λ = 1/2. The arrival time of the third train is

W = X1 + X2 + X3.  (1)

In Theorem 9.9, we found that the sum of three iid exponential (λ) random variables is an Erlang (n = 3, λ) random variable. From Appendix A, we find that W has expected value and variance

E[W] = 3/λ = 6,  (2)
Var[W] = 3/λ² = 12.  (3)
(a) By the Central Limit Theorem,

P[W > 20] = P[(W − 6)/√12 > (20 − 6)/√12] ≈ Q(7/√3) = 2.66 × 10⁻⁵.
(b) From the Markov inequality, we know that

P[W > 20] ≤ E[W]/20 = 6/20 = 0.3.  (4)
(c) To use the Chebyshev inequality, we observe that E[W] = 6 and W nonnegative imply

P[|W − E[W]| ≥ 14] = P[W − 6 ≥ 14] + P[W − 6 ≤ −14] = P[W ≥ 20],  (5)

since P[W − 6 ≤ −14] = 0. Thus

P[W ≥ 20] = P[|W − E[W]| ≥ 14] ≤ Var[W]/14² = 3/49 = 0.061.  (6)
(d) For the Chernoff bound, we note that the MGF of W is

φW(s) = (λ/(λ − s))³  (7)
      = 1/(1 − 2s)³.  (8)
The Chernoff bound states that

P[W > 20] ≤ min_{s≥0} e^(−20s) φW(s) = min_{s≥0} e^(−20s)/(1 − 2s)³.  (9)
To minimize h(s) = e^(−20s)/(1 − 2s)³, we set the derivative of h(s) to zero:

dh(s)/ds = e^(−20s)(40s − 14)/(1 − 2s)⁴ = 0.  (10)

This implies s = 7/20. Applying s = 7/20 to the Chernoff bound yields

P[W > 20] ≤ e^(−20s)/(1 − 2s)³ |_(s=7/20) = 0.0338.
(e) Theorem 4.11 says that for any w > 0, the CDF of the Erlang (3, λ) random variable W satisfies

FW(w) = 1 − Σ_(k=0)^(2) (λw)^k e^(−λw)/k!.  (11)

Equivalently, for λ = 1/2 and w = 20,

P[W > 20] = 1 − FW(20) = e^(−10)(1 + 10/1! + 10²/2!) = 61e^(−10) = 0.0028.  (12)
Although the Chernoff bound is weak in that it overestimates the probability by a
factor of 12, it is a valid bound. By contrast, the Central Limit Theorem approximation grossly underestimates the true probability.
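The following Matlab sketch (not from the text) reproduces the five numbers in this solution: the central limit theorem estimate, the Markov and Chebyshev bounds, the numerically minimized Chernoff bound, and the exact Erlang tail probability.

% Sketch (not from the text): the five answers of Quiz 10.2 for W = X1+X2+X3,
% Xi iid exponential with E[Xi] = 2, so lambda = 1/2.
lam = 1/2; EW = 3/lam; VW = 3/lam^2;           % E[W] = 6, Var[W] = 12
clt = 0.5*erfc(((20 - EW)/sqrt(VW))/sqrt(2));  % Q((20-6)/sqrt(12))
markov = EW/20;
chebyshev = VW/14^2;
h = @(s) exp(-20*s)./(1 - 2*s).^3;             % e^(-20s)*phiW(s), valid for s < 1/2
[sStar, chernoff] = fminbnd(h, 0, 0.499);
exact = exp(-10)*(1 + 10 + 10^2/2);            % 1 - FW(20) for the Erlang (3, 1/2) W
fprintf('CLT %.2e  Markov %.3f  Chebyshev %.4f  Chernoff %.4f  exact %.4f\n', ...
        clt, markov, chebyshev, chernoff, exact);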
Section 10.3
Laws of Large Numbers
Theorem 10.5
Weak Law of Large Numbers (Finite Samples)
For any constant c > 0,

(a) P[|Mn(X) − µX| ≥ c] ≤ Var[X]/(nc²),

(b) P[|Mn(X) − µX| < c] ≥ 1 − Var[X]/(nc²).
Proof: Theorem 10.5
Let Y = Mn(X). Theorem 10.1 states that

E[Y] = E[Mn(X)] = µX,    Var[Y] = Var[Mn(X)] = Var[X]/n.  (10.15)
Theorem 10.5(a) follows by applying the Chebyshev inequality (Theorem 10.3) to Y = Mn(X). Theorem 10.5(b) is just a restatement of
Theorem 10.5(a), since
P [|Mn(X) − µX | ≥ c] = 1 − P [|Mn(X) − µX | < c] .
(10.16)
Theorem 10.6
Weak Law of Large Numbers (Infinite Samples)
If X has finite variance, then for any constant c > 0,

(a) lim_(n→∞) P[|Mn(X) − µX| ≥ c] = 0,

(b) lim_(n→∞) P[|Mn(X) − µX| < c] = 1.
The Weak Law
• In words, Theorem 10.6(b) says that the probability that the sample mean is within
±c units of E[X] goes to one as the number of samples approaches infinity.
• Since c can be arbitrarily small (e.g., 10⁻²⁰⁰⁰), both Theorem 10.5(a) and Theorem 10.6(b) can be interpreted as saying that the sample mean converges to E[X] as the number of samples increases without bound.
• The weak law of large numbers is a very general result because it holds for all
random variables X with finite variance.
• Moreover, we do not need to know any of the parameters, such as the expected
value or variance, of random variable X.
Definition 10.2 Convergence in Probability
The random sequence Yn converges in probability to a constant y if for any ε > 0,

lim_(n→∞) P[|Yn − y| ≥ ε] = 0.
Theorem 10.7
As n → ∞, the relative frequency P̂n(A) converges to P[A]; for any constant c > 0,

lim_(n→∞) P[|P̂n(A) − P[A]| ≥ c] = 0.
Proof: Theorem 10.7
The proof follows from Theorem 10.6 since P̂n(A) = Mn(XA) is the
sample mean of the indicator XA, which has expected value E[XA] = P[A]
and variance Var[XA] = P[A](1 − P[A]).
Quiz 10.3
X1, . . . , Xn are n iid samples of the Bernoulli (p = 0.8) random variable
X.
(a) Find E[X] and Var[X].
(b) What is Var[M100(X)]?
(c) Use Theorem 10.5 to find α such that
P [|M100(X) − p| ≥ 0.05] ≤ α.
(d) How many samples n are needed to guarantee
P [|Mn(X) − p| ≥ 0.1] ≤ 0.05.
Quiz 10.3 Solution
(a) Since X is a Bernoulli random variable with parameter p = 0.8, we can look up in
Appendix A to find that E[X] = p = 0.8 and variance
Var[X] = p(1 − p) = (0.8)(0.2) = 0.16.
(1)
(b) By Theorem 10.1,

Var[M100(X)] = Var[X]/100 = 0.0016.  (2)

(c) Theorem 10.5 uses the Chebyshev inequality to show that the sample mean satisfies

P[|Mn(X) − E[X]| ≥ c] ≤ Var[X]/(nc²).  (3)

Note that E[X] = PX(1) = p. To meet the specified requirement, we choose c = 0.05 and n = 100. Since Var[X] = 0.16, we must have

0.16/(100(0.05)²) = α.  (4)

This reduces to α = 16/25 = 0.64.

(d) Again we use Equation (3). To meet the specified requirement, we choose c = 0.1. Since Var[X] = 0.16, we must have

0.16/(n(0.1)²) ≤ 0.05.  (5)

The smallest value of n that meets the requirement is n = 320.
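As an empirical check of part (d) (an illustrative sketch, not from the text), the following Matlab lines estimate P[|Mn(X) − p| ≥ 0.1] for n = 320 Bernoulli (0.8) samples. The Chebyshev-based guarantee of 0.05 is conservative, so the simulated probability should be far smaller.

% Sketch (not from the text): empirical check of Quiz 10.3(d).
p = 0.8; n = 320; trials = 20000;
X = (rand(n, trials) < p);                   % Bernoulli (p) samples, one column per trial
Mn = mean(X, 1);
probBad = mean(abs(Mn - p) >= 0.1);
fprintf('Estimated P[|Mn(X) - p| >= 0.1] = %.4f (bound 0.05)\n', probBad);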
Section 10.4
Point Estimates of Model Parameters
Model Parameters
• The general problem is estimation of a parameter of a probability model.
• A parameter is any number that can be calculated from the probability model.
• For example, for an arbitrary event A, P[A] is a model parameter.
Estimates of Model Parameters
• Consider an experiment that produces observations of sample values of the random
variable X.
• The observations are sample values of the random variables X1 , X2 , . . ., all with the
same probability model as X.
• Assume that r is a parameter of the probability model.
• We use the observations X1 , X2 , . . . to produce a sequence of estimates of r.
• The estimates R̂1 , R̂2 , . . . are all random variables.
• R̂1 is a function of X1 .
• R̂2 is a function of X1 and X2 , and in general R̂n is a function of X1 , X2 , . . . , Xn .
Definition 10.3 Consistent Estimator
The sequence of estimates R̂1, R̂2, . . . of parameter r is consistent if for any ε > 0,

lim_(n→∞) P[|R̂n − r| ≥ ε] = 0.
Definition 10.4 Unbiased Estimator
An estimate, R̂, of parameter r is unbiased if E[R̂] = r; otherwise, R̂ is
biased.
Definition 10.5 Asymptotically Unbiased Estimator
The sequence of estimators R̂n of parameter r is asymptotically unbiased if

lim_(n→∞) E[R̂n] = r.
Definition 10.6 Mean Square Error
The mean square error of estimator R̂ of parameter r is

e = E[(R̂ − r)²].
Theorem 10.8
If a sequence of unbiased estimates R̂1, R̂2, . . . of parameter r has mean
square error en = Var[R̂n] satisfying limn→∞ en = 0, then the sequence
R̂n is consistent.
Proof: Theorem 10.8
Since E[R̂n] = r, we apply the Chebyshev inequality to R̂n. For any constant ε > 0,

P[|R̂n − r| ≥ ε] ≤ Var[R̂n]/ε².  (10.20)

In the limit of large n, we have

lim_(n→∞) P[|R̂n − r| ≥ ε] ≤ lim_(n→∞) Var[R̂n]/ε² = 0.  (10.21)
Example 10.5 Problem
In any interval of k seconds, the number Nk of packets passing through
an Internet router is a Poisson random variable with expected value
E[Nk ] = kr packets. Let R̂k = Nk /k denote an estimate of the parameter r packets/second. Is each estimate R̂k an unbiased estimate of
r? What is the mean square error ek of the estimate R̂k ? Is the sequence
of estimates R̂1, R̂2, . . . consistent?
Example 10.5 Solution
First, we observe that R̂k is an unbiased estimator since
E[R̂k ] = E [Nk /k] = E [Nk ] /k = r.
(10.22)
Next, we recall that since Nk is Poisson, Var[Nk] = kr. This implies

Var[R̂k] = Var[Nk/k] = Var[Nk]/k² = r/k.  (10.23)
Because R̂k is unbiased, the mean square error of the estimate is the
same as its variance: ek = r/k. In addition, since limk→∞ Var[R̂k ] = 0,
the sequence of estimators R̂k is consistent by Theorem 10.8.
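A simulation of this estimator (an illustrative sketch, not from the text) is shown below; it assumes the Statistics Toolbox function poissrnd is available and uses an arbitrary rate r = 10 packets/second.

% Sketch (not from the text): the estimator Rk = Nk/k of Example 10.5.
r = 10; trials = 10000;
for k = [1 10 100]
    Nk = poissrnd(k*r, 1, trials);           % Poisson with expected value k*r
    Rk = Nk/k;                               % one estimate of r per trial
    fprintf('k = %3d: mean of Rk = %.2f, MSE = %.4f (theory r/k = %.4f)\n', ...
            k, mean(Rk), mean((Rk - r).^2), r/k);
end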
Theorem 10.9
The sample mean Mn(X) is an unbiased estimate of E[X].
Theorem 10.10
The sample mean estimator Mn(X) has mean square error

en = E[(Mn(X) − E[X])²] = Var[Mn(X)] = Var[X]/n.
Standard Error
• In the terminology of statistical inference, √en, the standard deviation of the sample mean, is referred to as the standard error of the estimate.
• The standard error gives an indication of how far we should expect the sample mean to deviate from the expected value.
• In particular, when X is a Gaussian random variable (and Mn(X) is also Gaussian), Problem 10.4.1 asks you to show that

P[E[X] − √en ≤ Mn(X) ≤ E[X] + √en] = 2Φ(1) − 1 ≈ 0.68.  (10.24)

In words, Equation (10.24) says there is roughly a two-thirds probability that the sample mean is within one standard error of the expected value.
• This same conclusion is approximately true when n is large and the central limit
theorem says that Mn (X) is approximately Gaussian.
Example 10.6 Problem
How many independent trials n are needed to guarantee that P̂n(A), the
relative frequency estimate of P[A], has standard error ≤ 0.1?
Example 10.6 Solution
Since the indicator XA has variance Var[XA] = P[A](1 − P[A]), Theorem 10.10 implies that the mean square error of Mn(XA) is

en = Var[XA]/n = P[A](1 − P[A])/n.  (10.25)

We need to choose n large enough to guarantee √en ≤ 0.1 (i.e., en ≤ 0.01) even though we don't know P[A]. We use the fact that p(1 − p) ≤ 0.25 for all 0 ≤ p ≤ 1. Thus, en ≤ 0.25/n. To guarantee en ≤ 0.01, we choose n = 0.25/0.01 = 25 trials.
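The following Matlab lines (an illustrative sketch, not from the text) confirm the worst-case reasoning: over all p in [0, 1], the standard error sqrt(p(1 − p)/25) never exceeds 0.1.

% Sketch (not from the text): worst-case standard error with n = 25 trials.
n = 25;
p = 0:0.01:1;
se = sqrt(p.*(1 - p)/n);
fprintf('largest standard error over all p: %.3f (attained at p = 0.5)\n', max(se));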
Theorem 10.11
If X has finite variance, then the sample mean Mn(X) is a sequence of
consistent estimates of E[X].
Proof: Theorem 10.11
By Theorem 10.10, the mean square error of Mn(X) satisfies

lim_(n→∞) Var[Mn(X)] = lim_(n→∞) Var[X]/n = 0.  (10.26)

By Theorem 10.8, the sequence Mn(X) is consistent.
Estimating the Variance
• When E[X] is a known quantity µX, we know Var[X] = E[(X − µX)²].
• In this case, we can use the sample mean of W = (X − µX)² to estimate Var[X]:

Mn(W) = (1/n) Σ_(i=1)^(n) (Xi − µX)².  (10.28)

If Var[W] exists, Mn(W) is a consistent, unbiased estimate of Var[X].
• When the expected value µX is unknown, the situation is more complicated because
the variance of X depends on µX .
• We cannot use Equation (10.28) if µX is unknown.
• In this case, we replace the expected value µX by the sample mean Mn (X).
Definition 10.7 Sample Variance
The sample variance of n independent observations of random variable X is

Vn(X) = (1/n) Σ_(i=1)^(n) (Xi − Mn(X))².
Theorem 10.12
E[Vn(X)] = ((n − 1)/n) Var[X].
Proof: Theorem 10.12
Substituting Definition 10.1 of the sample mean Mn(X) into Definition 10.7 of the sample variance and expanding the sums, we derive

Vn = (1/n) Σ_(i=1)^(n) Xi² − (1/n²) Σ_(i=1)^(n) Σ_(j=1)^(n) Xi Xj.  (10.29)

Because the Xi are iid, E[Xi²] = E[X²] for all i, and E[Xi] E[Xj] = µX². By Theorem 5.16(a),

E[Xi Xj] = Cov[Xi, Xj] + E[Xi] E[Xj] = Cov[Xi, Xj] + µX².  (10.30)

Combining these facts, the expected value of Vn in Equation (10.29) is

E[Vn] = E[X²] − (1/n²) Σ_(i=1)^(n) Σ_(j=1)^(n) (Cov[Xi, Xj] + µX²)
      = Var[X] − (1/n²) Σ_(i=1)^(n) Σ_(j=1)^(n) Cov[Xi, Xj].  (10.31)

Since the double sum has n² terms, Σ_(i=1)^(n) Σ_(j=1)^(n) µX² = n² µX². Of the n² covariance terms, there are n terms of the form Cov[Xi, Xi] = Var[X], while the remaining covariance terms are all 0 because Xi and Xj are independent for i ≠ j. This implies

E[Vn] = Var[X] − (1/n²)(n Var[X]) = ((n − 1)/n) Var[X].  (10.32)
Theorem 10.13
The estimate

Vn′(X) = (1/(n − 1)) Σ_(i=1)^(n) (Xi − Mn(X))²

is an unbiased estimate of Var[X].
Proof: Theorem 10.13
Using Definition 10.7, we have

Vn′(X) = (n/(n − 1)) Vn(X),  (10.34)

and

E[Vn′(X)] = (n/(n − 1)) E[Vn(X)] = Var[X].  (10.35)
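The difference between Vn(X) and Vn′(X) is easy to see by simulation. The following Matlab sketch (not from the text) compares the two estimators for a uniform (0, 1) X with Var[X] = 1/12, using Matlab's var(x,1) (divide by n) and var(x) (divide by n − 1); the choices n = 10 and 50,000 trials are illustrative.

% Sketch (not from the text): biased versus unbiased variance estimates.
n = 10; trials = 50000;
X = rand(n, trials);
Vn  = var(X, 1, 1);      % normalize by n:     E[Vn(X)]  = (n-1)/n * Var[X]
Vnp = var(X, 0, 1);      % normalize by n-1:   E[Vn'(X)] = Var[X]
fprintf('mean of Vn(X)  = %.4f (theory %.4f)\n', mean(Vn), (n-1)/n/12);
fprintf('mean of Vn''(X) = %.4f (theory %.4f)\n', mean(Vnp), 1/12);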
Quiz 10.4
X is the continuous uniform (−1, 1) random variable. Find the mean
square error, E[(Var[X] − V100(X))2], of the sample variance estimate of
Var[X], based on 100 independent observations of X.
Quiz 10.4 Solution
Define the random variable W = (X − µX)². Observe that V100(X) = M100(W). By Theorem 10.10, the mean square error is

E[(M100(W) − µW)²] = Var[W]/100.  (1)

Observe that µX = 0 so that W = X². Thus,

µW = E[X²] = ∫_(−1)^(1) x² fX(x) dx = 1/3,  (2)

E[W²] = E[X⁴] = ∫_(−1)^(1) x⁴ fX(x) dx = 1/5.  (3)

Therefore Var[W] = E[W²] − µW² = 1/5 − (1/3)² = 4/45 and the mean square error is 4/4500 = 0.0009.
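A Monte Carlo check of this answer (an illustrative sketch, not from the text) follows; like the solution above, it treats µX = 0 as known and computes V100(X) as M100(W) with W = X².

% Sketch (not from the text): mean square error of V100(X) for the uniform (-1,1) X.
n = 100; trials = 20000;
X = 2*rand(n, trials) - 1;                   % uniform (-1, 1) samples
V100 = mean(X.^2, 1);                        % M100(W) with W = X^2
mse = mean((V100 - 1/3).^2);                 % Var[X] = 1/3
fprintf('MSE of V100(X) = %.5f (theory 4/4500 = %.5f)\n', mse, 4/4500);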
Section 10.5
Matlab
Sample Mean Traces
• In particular, for a random variable X, we can view a set of iid samples X1, . . . , Xn as a random vector X = [X1 · · · Xn]′.
• This vector of iid samples yields a vector of sample mean values M(X) = [M1(X) M2(X) · · · Mn(X)]′, where

Mk(X) = (X1 + · · · + Xk)/k.  (10.36)

• We call a graph of the sequence Mk(X) versus k a sample mean trace.
• By graphing the sample mean trace as a function of n we can observe the convergence of the point estimate Mk(X) to E[X].
Example 10.7 Problem
Write a Matlab function bernoullitraces(n,m,p) to generate m sample
mean traces, each of length n, for the sample mean of a Bernoulli (p)
random variable.
Example 10.7 Solution
function MN=bernoullitraces(n,m,p);
x=reshape(bernoullirv(p,m*n),n,m);    % each column: n iid Bernoulli (p) samples
nn=(1:n)'*ones(1,m);
MN=cumsum(x)./nn;                     % MN(k,j) is the sample mean Mk(X) of trace j
stderr=sqrt(p*(1-p))./sqrt((1:n)');   % standard error sqrt(e_k) at each step k
plot(1:n,p+stderr,1:n,p-stderr,1:n,MN);   % overlay p +/- one standard error
In bernoullitraces, each column of x is an
instance of a random vector X with iid Bernoulli
(p) components. Similarly, each column of MN
is an instance of the vector M(X).
The output graphs each column of MN as a function of the number of trials n. In addition, we calculate the standard error √ek and overlay graphs of p − √ek and p + √ek. Equation (10.24) says that at each step k, we should expect to see roughly two-thirds of the sample mean traces in the range

p − √ek ≤ Mk(X) ≤ p + √ek.  (10.37)

A sample graph of bernoullitraces(50,40,0.5) is shown in Figure 10.1. The figure shows how at any given step, approximately two thirds of the sample mean traces are within one standard error of the expected value.
Figure 10.1
[Plot of the sample mean traces Mn(X) versus n for 1 ≤ n ≤ 50, with values between 0 and 1.]
Sample output of bernoullitraces.m, including the deterministic standard error graphs. The graph shows how at any given step, about two thirds of the sample means are within one standard error of the true mean.
Quiz 10.5
Generate m = 1000 traces (each of length n = 100) of the sample mean
of a Bernoulli (p) random variable. At each step k, calculate Mk, the number of traces for which the sample mean is within one standard error of the expected value p. Graph Tk = Mk/m as a function of k. Explain your results.
Quiz 10.5 Solution
Following the bernoullitraces.m approach, we generate m = 1000 sample paths, each consisting of n = 100 Bernoulli trials. At step k, OK(k) is the fraction of sample paths whose sample mean is within one standard error of p. The program bernoullisample.m generates the traces and graphs this fraction as a function of the number of trials in each trace.
function OK=bernoullisample(n,m,p);
x=reshape(bernoullirv(p,m*n),n,m);
nn=(1:n)'*ones(1,m);
MN=cumsum(x)./nn;                     % sample mean traces, one per column
stderr=sqrt(p*(1-p))./sqrt((1:n)');
stderrmat=stderr*ones(1,m);
OK=sum(abs(MN-p)<stderrmat,2)/m;      % fraction of traces within one standard error
plot(1:n,OK);
The following graph was generated by bernoullisample(50,5000,0.5):
[Plot of the fraction of traces within one standard error (vertical axis, between 0 and 1) versus the number of trials, 1 ≤ n ≤ 50.]
As we would expect, as m gets large, the fraction of traces within one standard error approaches 2Φ(1) − 1 ≈ 0.68. The unusual sawtooth pattern, though perhaps unexpected,
is examined in Problem 10.5.1.