ANALYSIS of SIMULATED DATA
Sample Mean and Sample Variance
• Assume iid RVs $X_1, X_2, \ldots, X_n$, produced by $n$ runs of some simulation, are estimates of some quantity of interest.
• The sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is used to estimate the population mean $\theta$ because $E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\, n\theta = \theta$.
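• Question: how good is $\bar{X}$? Answer: try to use $\mathrm{Var}(\bar{X})$:
$$\mathrm{Var}(\bar{X}) = E[(\theta - \bar{X})^2] = \mathrm{Var}\Big(\frac{1}{n}\sum_{i=1}^{n} X_i\Big) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n},$$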
and use the Central Limit Theorem:
$$P\{|\bar{X} - \theta| > c\,\sigma/\sqrt{n}\} \approx P\{|Z| > c\} = 2(1 - \Phi(c)).$$
E.g. if $c = 1.96$, $2(1 - \Phi(c)) \approx .05$;
if $c = 2.58$, $2(1 - \Phi(c)) \approx .01$.
Problem: $\sigma^2$ is not known.
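The two tail probabilities quoted above can be checked numerically. The following is a minimal Python sketch (the function name `two_sided_tail` is an illustrative choice, not from the slides) that evaluates $2(1 - \Phi(c))$ with the standard library's normal distribution.

```python
from statistics import NormalDist

def two_sided_tail(c):
    """Return 2 * (1 - Phi(c)), the normal approximation to
    P{|X-bar - theta| > c * sigma / sqrt(n)}."""
    return 2.0 * (1.0 - NormalDist().cdf(c))

for c in (1.96, 2.58):
    print(f"c = {c}: 2(1 - Phi(c)) = {two_sided_tail(c):.4f}")
```

Both printed values should land close to the .05 and .01 figures used in the slides.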
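• Solution: estimate $\sigma^2$ using the sample variance.
Start with $\sigma^2 = E[(X - \theta)^2] \approx \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$?
For statistical validity, check
$$\begin{aligned}
E\Big[\sum_{i=1}^{n}(X_i - \bar{X})^2\Big] &= E\Big[\sum_{i=1}^{n} X_i^2\Big] - nE[\bar{X}^2] \\
&= nE[X^2] - nE[\bar{X}^2] \\
&= n(\mathrm{Var}(X) + E[X]^2) - n(\mathrm{Var}(\bar{X}) + E[\bar{X}]^2) \\
&= n(\sigma^2 + \theta^2) - n(\sigma^2/n + \theta^2) \\
&= (n - 1)\sigma^2.
\end{aligned}$$
The sum has expectation $(n - 1)\sigma^2$ rather than $n\sigma^2$, so dividing by $n$ would underestimate $\sigma^2$.
So define the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$,
the sample standard deviation $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2} \approx \sigma$,
and the sample standard error $S/\sqrt{n}$.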
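The expectation $(n-1)\sigma^2$ can be verified exactly on a toy case. The sketch below is in Python and assumes a hypothetical population, $X$ uniform on $\{0, 1, 2\}$, chosen only for illustration. It enumerates every equally likely sample of size $n = 3$ and averages the two candidate estimators with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def expected_variance_estimates(values, n):
    """Average sum((x - xbar)^2)/(n - 1) and sum((x - xbar)^2)/n over all
    equally likely iid samples of size n drawn from `values`."""
    total_unbiased = Fraction(0)
    total_biased = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        ss = sum((Fraction(x) - xbar) ** 2 for x in sample)
        total_unbiased += ss / (n - 1)
        total_biased += ss / n
        count += 1
    return total_unbiased / count, total_biased / count

values = [0, 1, 2]   # hypothetical population: X uniform on {0, 1, 2}
n = 3
theta = Fraction(sum(values), len(values))
sigma2 = sum((Fraction(v) - theta) ** 2 for v in values) / len(values)

unbiased, biased = expected_variance_estimates(values, n)
print("sigma^2                 =", sigma2)
print("E[S^2], divisor n - 1   =", unbiased)
print("E[estimate], divisor n  =", biased)
```

For this population the divisor $n - 1$ reproduces $\sigma^2 = 2/3$ exactly, while the divisor $n$ gives $(n-1)\sigma^2/n = 4/9$.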
• Simulation Stopping
Method: given $\alpha$ and error tolerance $\delta$, choose $c$ with $2(1 - \Phi(c)) = \alpha$ and
let $S_j^2 = \frac{1}{j-1}\sum_{i=1}^{j}(X_i - \bar{X}_j)^2$.
Algorithm: sample $X_i$ for $i = 1, \ldots, k$ until $cS_k/\sqrt{k} < \delta$.
If $cS_k > \delta\sqrt{k}$, when should stop occur?
Stop should occur at $k^* \approx c^2 S_k^2/\delta^2$.
Example a) if $c = 2$, $\delta = .01$, $S_k = .1$, $k^* = ?$
Example b)
N = 100;
for i= 1:N
X(i) = repair(4,3,1,2); % Crash time
end, disp([mean(X) std(X)])
1.5652 1.1066
if $c = 2$, $\delta = .01$, $k^* = ?$
for N = 50000, $\bar{X} \approx 1.5545$, $S \approx 1.070$;
another run: for N = 50000, $\bar{X} \approx 1.5494$, $S \approx 1.078$;
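The stopping rule and the two examples can be sketched in Python as follows. The sampler (normal draws with mean 1.5 and standard deviation 1), the seed, and the `min_runs` guard against stopping on an early, unrepresentative $S_k$ are all assumptions made for illustration and are not part of the slides. The formula $k^* \approx c^2 S_k^2/\delta^2$ gives $k^* \approx 400$ for Example a) and, with $S = 1.1066$ from Example b), roughly 49,000, which is in line with the $N = 50000$ runs shown.

```python
import math
import random

def k_star(c, s, delta):
    """Approximate stopping index k* = c^2 * S^2 / delta^2."""
    return (c * s / delta) ** 2

def run_until_tolerance(sample, c, delta, min_runs=30):
    """Draw samples until c * S_k / sqrt(k) < delta; return (k, mean, S_k).
    min_runs is an assumed safeguard, not taken from the slides."""
    k, mean, var = 0, 0.0, 0.0
    while True:
        x = sample()                      # x plays the role of X_{k+1}
        new_mean = mean + (x - mean) / (k + 1)
        if k >= 1:                        # one-pass update of S^2
            var = (1 - 1 / k) * var + (k + 1) * (new_mean - mean) ** 2
        mean = new_mean
        k += 1
        s = math.sqrt(var)
        if k >= min_runs and c * s / math.sqrt(k) < delta:
            return k, mean, s

print("Example a) k* ~", round(k_star(2, 0.1, 0.01)))
print("Example b) k* ~", round(k_star(2, 1.1066, 0.01)))

rng = random.Random(1)                    # hypothetical stand-in for a simulation
k, mean, s = run_until_tolerance(lambda: rng.gauss(1.5, 1.0), c=2, delta=0.1)
print(f"stopped at k = {k}, mean = {mean:.3f}, S_k = {s:.3f}")
```

With $S \approx 1$, $c = 2$ and $\delta = .1$, the loop is expected to stop near $k^* \approx 400$; the exact index depends on the random draws.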
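• Computation of $S^2$: $\bar{X}_j = \frac{1}{j}\sum_{i=1}^{j} X_i$.
$S_j^2 = \frac{1}{j-1}\Big(\sum_{i=1}^{j} X_i^2 - j\bar{X}_j^2\Big)$ can be numerically unstable.
Consider iterative computation of $S_j$:
starting with $\bar{X}_0 = S_0 = S_1 = 0$, then
$$\bar{X}_{j+1} = \bar{X}_j + \frac{1}{j+1}(X_{j+1} - \bar{X}_j),$$
$$S_{j+1}^2 = \Big(\frac{j-1}{j}\Big) S_j^2 + (j+1)(\bar{X}_{j+1} - \bar{X}_j)^2.$$
Also
$$\frac{S_{j+1}^2}{j+1} = \Big(\frac{j-1}{j+1}\Big)\frac{S_j^2}{j} + (\bar{X}_{j+1} - \bar{X}_j)^2.$$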
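The recursion can be compared with the one-shot formula in a short Python sketch. The data set (a small spread around a large offset) is a hypothetical example picked to provoke cancellation, and `statistics.variance` serves as the reference value.

```python
import statistics

def running_mean_var(xs):
    """Return (X-bar_n, S_n^2) via the one-pass recursion for
    X-bar_{j+1} and S_{j+1}^2."""
    mean = 0.0   # X-bar_0
    var = 0.0    # S_1^2
    for j, x in enumerate(xs):   # j samples seen so far; x is X_{j+1}
        new_mean = mean + (x - mean) / (j + 1)
        if j >= 1:               # the variance update divides by j
            var = (1 - 1 / j) * var + (j + 1) * (new_mean - mean) ** 2
        mean = new_mean
    return mean, var

def one_shot_var(xs):
    """(sum of squares - j * mean^2) / (j - 1), the form flagged as unstable."""
    j = len(xs)
    xbar = sum(xs) / j
    return (sum(x * x for x in xs) - j * xbar * xbar) / (j - 1)

# Hypothetical data: spread of order 0.1 on top of an offset of 1e9.
data = [1e9 + d for d in (0.1, 0.2, 0.3, 0.4, 0.5)]
mean, var = running_mean_var(data)
print("recursion :", var)
print("one-shot  :", one_shot_var(data))
print("reference :", statistics.variance(data))
```

On data like this the recursion should stay close to the reference value of about 0.025, whereas the one-shot formula subtracts two numbers of order $10^{18}$ and can lose essentially all significant digits.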
Interval Estimates for the Population Mean
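• Assumption: for large $n$, $\sqrt{n}(\bar{X} - \theta)/S \sim \mathrm{Normal}(0, 1)$.
Given $\alpha$, $0 < \alpha < 1$, let $\alpha = P\{Z > z_\alpha\}$,
with $Z \sim \mathrm{Normal}(0, 1)$, so $z_\alpha = \Phi^{-1}(1 - \alpha)$, then
$$\begin{aligned}
1 - \alpha &= P\{-z_{\alpha/2} < Z < z_{\alpha/2}\} \\
1 - \alpha &\approx P\Big\{-z_{\alpha/2} < \sqrt{n}\,\frac{\bar{X} - \theta}{\sigma} < z_{\alpha/2}\Big\} \\
1 - \alpha &\approx P\Big\{-z_{\alpha/2} < \sqrt{n}\,\frac{\bar{X} - \theta}{S} < z_{\alpha/2}\Big\} \\
1 - \alpha &\approx P\Big\{\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}} < \theta < \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\Big\}
\end{aligned}$$
E.g. $1 - .05 \approx P\{\bar{X} - 1.96\, S/\sqrt{n} < \theta < \bar{X} + 1.96\, S/\sqrt{n}\}$
• Confidence Interval: given the sample mean $\bar{X}$ and
the sample standard deviation $S$, the interval
$$C_\alpha = \Big(\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\Big)$$
is a $100(1 - \alpha)\%$ (approximate) confidence interval for $\theta$.
Notes:
a) For repeated runs, $\theta \in C_\alpha$ about $100(1 - \alpha)\%$ of the time;
b) If $n$ is small, $t_{\alpha/2,n-1}$, from the t-distribution, can be used in place of $z_{\alpha/2}$.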
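Note a) can be illustrated with a small coverage experiment. This Python sketch assumes a hypothetical simulation whose outputs are Exponential(1) draws, so that $\theta = 1$ is known; the seed, $n = 200$, and 1000 repetitions are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def confidence_interval(xs, alpha=0.05):
    """Return (low, high) = X-bar -/+ z_{alpha/2} * S / sqrt(n)."""
    n = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    half_width = z * stdev(xs) / math.sqrt(n)   # stdev divides by n - 1
    xbar = mean(xs)
    return xbar - half_width, xbar + half_width

# Hypothetical stand-in for a simulation: Exponential(1) outputs, theta = 1.
rng = random.Random(2024)
theta, n, runs = 1.0, 200, 1000
hits = 0
for _ in range(runs):
    xs = [rng.expovariate(1.0) for _ in range(n)]
    low, high = confidence_interval(xs)
    hits += low < theta < high
coverage = hits / runs
print(f"fraction of intervals containing theta: {coverage:.3f}")
```

The printed fraction should be near .95, though not exactly, since the interval is only approximate for finite $n$.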
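• Bernoulli Case: if $X_i = 1$ or $0$ with $\bar{X} \approx \theta = p$, then
$S \approx \sigma = \sqrt{p(1 - p)}$, so with $p_n = \bar{X}$ a $100(1 - \alpha)\%$ (approximate) confidence interval for $p$ is
$$C_\alpha = \Big(p_n - z_{\alpha/2}\sqrt{p_n(1 - p_n)/n},\ p_n + z_{\alpha/2}\sqrt{p_n(1 - p_n)/n}\Big)$$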
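A minimal Python sketch of the Bernoulli interval follows. The function name `bernoulli_ci` and the optional `z` argument are illustrative assumptions; the inputs $p_n = .11$, $n = 100$ and the multiplier 2 are the values that appear in the `insrnc(365)` example in these notes.

```python
import math
from statistics import NormalDist

def bernoulli_ci(p_n, n, alpha=0.05, z=None):
    """Return (low, high) = p_n -/+ z * sqrt(p_n * (1 - p_n) / n).
    If z is not given, use z_{alpha/2} from the standard normal."""
    if z is None:
        z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(p_n * (1 - p_n) / n)
    return p_n - half_width, p_n + half_width

low, high = bernoulli_ci(0.11, 100, z=2.0)
print(f"interval = ({low:.4f}, {high:.4f}), half-width = {(high - low) / 2:.4f}")
```

The half-width here evaluates to about .0626, close to the value `2*std(X)/sqrt(N)` reports for the same run (the two differ slightly because `std` divides by $N - 1$).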
Examples
Repair Simulation: some Matlab results
N = 100;
for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
1.5414
N = 1000;
for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
1.5655
for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
1.4775
for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
1.5864 0.072857
for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
1.5497 0.067054
clear X, N=100;
for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
1.6578 0.27878
disp(std(X))
1.3939
clear X, N=10000;
for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
1.5482 0.021507
disp(std(X))
1.0753
clear X, N=100;
for i= 1:N,X(i)=insrnc(365);end,disp([mean(X) 2*std(X)/sqrt(N)])
0.11 0.062893
Note: $2\sqrt{.11(1 - .11)/100} \approx .0626$.