ANALYSIS of SIMULATED DATA

Sample Mean and Sample Variance

• Assume iid RVs $X_1, X_2, \ldots, X_n$, as estimates of some quantity of interest, are produced by $n$ runs of some simulation.

• Sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is used to estimate the population mean $\theta$ because
$$E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\, n\theta = \theta.$$

• Question: how good is $\bar{X}$? Answer: try to use $Var(\bar{X})$:
$$Var(\bar{X}) = E[(\theta - \bar{X})^2] = Var\Big(\frac{1}{n}\sum_{i=1}^{n} X_i\Big) = \frac{1}{n^2}\sum_{i=1}^{n} Var(X_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n},$$
and use the Central Limit Theorem:
$$P\{|\bar{X} - \theta| > c\,\sigma/\sqrt{n}\} \approx P\{|Z| > c\} = 2(1 - \Phi(c)).$$
E.g. if $c = 1.96$, $2(1 - \Phi(c)) \approx .05$; if $c = 2.58$, $2(1 - \Phi(c)) \approx .01$.
Problem: $\sigma^2$ is not known.

• Solution: estimate $\sigma^2$ using the sample variance.
Start with $\sigma^2 = E[(X - \theta)^2] \approx \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$? For statistical validity, check
$$E\Big[\sum_{i=1}^{n}(X_i - \bar{X})^2\Big] = E\Big[\sum_{i=1}^{n} X_i^2\Big] - nE[\bar{X}^2]$$
$$= nE[X^2] - nE[\bar{X}^2]$$
$$= n(Var(X) + E[X]^2) - n(Var(\bar{X}) + E[\bar{X}]^2)$$
$$= n(Var(X) + E[X]^2) - n(\sigma^2/n + \theta^2)$$
$$= (n - 1)\sigma^2.$$
So define the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$,
the sample standard deviation $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2} \approx \sigma$,
and the sample standard error $S/\sqrt{n}$.

• Simulation Stopping Method: given $\alpha$ (which fixes $c$) and error tolerance $\delta$,
let $S_j^2 = \frac{1}{j-1}\sum_{i=1}^{j}(X_i - \bar{X}_j)^2$.
Algorithm: sample $X_i$ for $i = 1, \ldots, k$ until $cS_k/\sqrt{k} < \delta$.
If $cS_k > \delta\sqrt{k}$, when should the stop occur? The stop should occur at $k^* \approx c^2 S_k^2/\delta^2$.

Example a) if $c = 2$, $\delta = .01$, $S_k = .1$, $k^* = ?$

Example b)

    N = 100;
    for i = 1:N
      X(i) = repair(4,3,1,2);   % Crash time
    end, disp([mean(X) std(X)])
        1.5652    1.1066

if $c = 2$, $\delta = .01$, $k^* = ?$
for N = 50000, $\bar{X} \approx 1.5545$, $S \approx 1.070$;
another run: for N = 50000, $\bar{X} \approx 1.5494$, $S \approx 1.078$.

• Computation of $S^2$: with $\bar{X}_j = \frac{1}{j}\sum_{i=1}^{j} X_i$, the direct formula
$$S_j^2 = \frac{1}{j-1}\Big(\sum_{i=1}^{j} X_i^2 - j\bar{X}_j^2\Big)$$
can be numerically unstable.
Consider iterative computation of $S_j$: starting with $\bar{X}_0 = S_0 = S_1 = 0$, then
$$\bar{X}_{j+1} = \bar{X}_j + \frac{1}{j+1}(X_{j+1} - \bar{X}_j),$$
$$S_{j+1}^2 = \Big(\frac{j-1}{j}\Big) S_j^2 + (j+1)(\bar{X}_{j+1} - \bar{X}_j)^2.$$
Also
$$\frac{S_{j+1}^2}{j+1} = \Big(\frac{j-1}{j+1}\Big)\frac{S_j^2}{j} + (\bar{X}_{j+1} - \bar{X}_j)^2.$$

Interval Estimates for the Population Mean

• Assumption: for large $n$, $\sqrt{n}(\bar{X} - \theta)/S \sim Normal(0, 1)$.
Given $\alpha$, $0 < \alpha < 1$, let $\alpha = P\{Z > z_\alpha\}$, with $Z \sim Normal(0, 1)$, so $z_\alpha = \Phi^{-1}(1 - \alpha)$; then
$$1 - \alpha = P\{-z_{\alpha/2} < Z < z_{\alpha/2}\}$$
$$1 - \alpha \approx P\Big\{-z_{\alpha/2} < \sqrt{n}\,\frac{\bar{X} - \theta}{\sigma} < z_{\alpha/2}\Big\}$$
$$1 - \alpha \approx P\Big\{-z_{\alpha/2} < \sqrt{n}\,\frac{\bar{X} - \theta}{S} < z_{\alpha/2}\Big\}$$
$$1 - \alpha \approx P\Big\{\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}} < \theta < \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\Big\}$$
E.g.
$$1 - .05 \approx P\Big\{\bar{X} - 1.96\frac{S}{\sqrt{n}} < \theta < \bar{X} + 1.96\frac{S}{\sqrt{n}}\Big\}$$

• Confidence Interval: given the sample mean $\bar{X}$ and the sample standard deviation $S$, the interval
$$C_\alpha = \Big(\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\Big)$$
is a $100(1 - \alpha)\%$ (approximate) confidence interval for $\theta$.
Notes:
a) For repeated runs, $\theta \in C_\alpha$ about $100(1 - \alpha)\%$ of the time;
b) If $n$ is small, $t_{\alpha/2,\,n-1}$, from the t-distribution, can be used in place of $z_{\alpha/2}$.

• Bernoulli $p$ Case: if $X_i = 1$ or $0$ with $\bar{X} \approx \theta = p$, then $S \approx \sigma = \sqrt{p(1 - p)}$, so with $p_n = \bar{X}$ a $100(1 - \alpha)\%$ (approximate) confidence interval for $p$ is
$$C_\alpha = \Big(p_n - z_{\alpha/2}\sqrt{p_n(1 - p_n)/n},\ p_n + z_{\alpha/2}\sqrt{p_n(1 - p_n)/n}\Big).$$

Examples

Repair Simulation: some Matlab results

    N = 100; for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
        1.5414
    N = 1000; for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
        1.5655
    for i= 1:N, X(i) = repair(4,3,1,2); end, disp(mean(X))
        1.4775
    for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
        1.5864    0.072857
    for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
        1.5497    0.067054
    clear X, N=100;
    for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
        1.6578    0.27878
    disp(std(X))
        1.3939
    clear X, N=10000;
    for i=1:N,X(i)=repair(4,3,1,2);end,disp([mean(X) 2*std(X)/sqrt(N)])
        1.5482    0.021507
    disp(std(X))
        1.0753
    clear X, N=100;
    for i= 1:N,X(i)=insrnc(365);end,disp([mean(X) 2*std(X)/sqrt(N)])
        0.11    0.062893

Note: $2\sqrt{.11(1 - .11)/100} \approx .0626$.
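The stopping method, the recursive update for the sample mean and sample variance, and the confidence interval can be combined in a few lines of Matlab. What follows is only a minimal sketch under stated assumptions: `draw` is a hypothetical stand-in sampler (exponential with mean 1), because the code for `repair` and `insrnc` is not shown in these notes, and the minimum run count `kmin` is an added safeguard that the notes do not specify.

```matlab
% Sketch: sequential stopping rule using the recursive mean/variance update.
% ASSUMPTION: draw() is a hypothetical sampler standing in for one simulation
% run (e.g. repair(4,3,1,2)); it returns an exponential RV with mean 1.
draw  = @() -log(rand);
c     = 1.96;    % z_{alpha/2} for alpha = .05
delta = 0.05;    % error tolerance
kmin  = 100;     % ASSUMPTION: minimum number of runs before the rule is tested

xbar = draw();   % Xbar_1 = X_1
s2   = 0;        % S_1^2 = 0
k    = 1;
while k < kmin || c*sqrt(s2/k) >= delta
    x       = draw();
    xbarNew = xbar + (x - xbar)/(k + 1);                  % Xbar_{k+1}
    s2      = (1 - 1/k)*s2 + (k + 1)*(xbarNew - xbar)^2;  % S_{k+1}^2
    xbar    = xbarNew;
    k       = k + 1;
end

half = c*sqrt(s2/k);   % half-width c*S_k/sqrt(k) of the interval
fprintf('k = %d, mean = %.4f, CI = (%.4f, %.4f)\n', ...
        k, xbar, xbar - half, xbar + half);
```

For this stand-in sampler $S \approx 1$, so the estimate $k^* \approx c^2 S_k^2/\delta^2$ suggests the loop should stop after roughly 1,500 runs; the exact count varies from run to run. Replacing `draw` with a call to the actual simulation leaves the rest unchanged.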