Simulation Wrap-up,
Statistics
COS 323
Today
•  Simulation Re-cap
•  Statistics
•  Variance and confidence intervals for
simulations
•  Simulation wrap-up
•  FYI: No class or office hours Thursday
Simulation wrap-up
Last time
•  Time-driven, event-driven
•  “Simulation” from differential equations
•  Cellular automata, microsimulation, agent-based simulation
–  see e.g.
http://www.microsimulation.org/IMA/What%20is%20microsimulation.htm
•  Example applications: SIR disease model,
population genetics
Simulation: Pros and Cons
•  Pros:
–  Building a model can be easier than with other approaches
–  Outcomes can be easy to understand
–  Cheap, safe
–  Good for comparisons
•  Cons:
–  Hard to debug
–  No guarantee of optimality
–  Hard to establish validity
–  Can’t produce absolute numbers
Simulation: Important Considerations
•  Are outcomes statistically significant? (Need
many simulation runs to assess this)
•  What should initial state be?
•  How long should the simulation run?
•  Is the model realistic?
•  How sensitive is the model to parameters,
initial conditions?
Statistics Overview
Random Variables
•  A random variable is any “probabilistic outcome”
–  e.g., a coin flip, height of someone randomly chosen from a
population
•  An R.V. takes on a value in a sample space
–  the space can be discrete, e.g., {H, T}
–  or continuous, e.g., height in (0, infinity)
•  R.V. denoted with capital letter (X), a realization with
lowercase letter (x)
–  e.g., X is a coin flip, x is the value (H or T) of that coin flip
Probability Mass Function
•  Describes probability for a discrete R.V.
•  e.g., [figure: an example probability mass function]
Probability Density Function
•  Describes probability for a continuous R.V.
•  e.g., [figure: an example probability density function]
[Population] Mean of a Random Variable
•  aka expected value, first moment
•  for discrete RV:
$E[X] = \mu = \sum_i x_i p_i$
•  for continuous RV:
$E[X] = \mu = \int_{-\infty}^{\infty} x \, p(x) \, dx$
[Population] Variance
$\sigma^2 = E[(X - \mu)^2]$
$\quad = E[X^2 - 2X\mu + \mu^2]$
$\quad = E[X^2] - \mu^2$
$\quad = E[X^2] - (E[X])^2$
•  for discrete RV:
$\sigma^2 = \sum_i p_i (x_i - \mu)^2$
•  for continuous RV:
$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 \, p(x) \, dx$
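A quick worked example (not from the original slides): for a fair coin flip encoded as X ∈ {0, 1} with $p_i = \frac{1}{2}$ each, $\mu = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}$, and $\sigma^2 = E[X^2] - \mu^2 = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$.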
Sample mean and sample variance
•  Suppose we have N independent
observations of X: x1, x2, …xN
•  Sample mean:
$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
$E[\bar{x}] = \mu$
•  Sample variance:
$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$
$E[s^2] = \sigma^2$
1/(N-1) and the sample variance
•  The N differences $x_i - \bar{x}$ are not independent:
$\sum_i (x_i - \bar{x}) = 0$
•  If you know N-1 of these values, you can deduce the last one
–  i.e., only N-1 degrees of freedom
•  Could treat the sample as the population and compute the population variance:
$\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$
–  BUT this underestimates the true population variance (especially
bad if the sample is small)
Sample variance using 1/(N-1) is unbiased
$E[s^2] = E\left[ \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]$
$\quad = \frac{1}{N-1} E\left[ \sum_{i=1}^{N} x_i^2 - N \bar{x}^2 \right]$
$\quad = \frac{1}{N-1} \left[ N(\sigma^2 + \mu^2) - N\left( \frac{\sigma^2}{N} + \mu^2 \right) \right]$
$\quad = \sigma^2$
Computing sample variance
•  Can compute as
$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$
•  Prefer:
$s^2 = \frac{\left( \sum_{i=1}^{N} x_i^2 \right) - N \bar{x}^2}{N-1}$
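The two forms can be sketched in Python (an illustration, not from the slides; numpy users get the same quantity from numpy.var(x, ddof=1)):

    # Sketch: the two sample-variance formulas above, for a plain list of numbers.
    def sample_variance_two_pass(x):
        """1/(N-1) * sum((x_i - xbar)^2): two passes over the data."""
        n = len(x)
        xbar = sum(x) / n
        return sum((xi - xbar) ** 2 for xi in x) / (n - 1)

    def sample_variance_one_pass(x):
        """(sum(x_i^2) - N*xbar^2) / (N-1): a single pass, accumulating running sums."""
        n, s, sq = 0, 0.0, 0.0
        for xi in x:
            n += 1
            s += xi
            sq += xi * xi
        xbar = s / n
        return (sq - n * xbar * xbar) / (n - 1)

One caveat worth noting (not on the slide): the one-pass form subtracts two large, nearly equal numbers, so it can lose precision when the mean is large relative to the spread.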
The Gaussian Distribution
$p(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$
$E[X] = \mu$
$Var[X] = \sigma^2$
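As a quick numerical sanity check (a sketch, not from the slides), the density formula can be evaluated directly and compared against scipy.stats.norm:

    import numpy as np
    from scipy.stats import norm

    def gaussian_pdf(x, mu, sigma):
        # 1/(sigma*sqrt(2*pi)) * exp(-0.5 * ((x - mu)/sigma)**2)
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

    x = np.linspace(-3, 3, 7)
    print(np.allclose(gaussian_pdf(x, 0.0, 1.0), norm.pdf(x, loc=0.0, scale=1.0)))  # True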
Why so important?
•  the distribution of a sum of many independent
observations of a random variable converges to Gaussian
•  in nature, events having variations resulting
from many small, independent effects tend to
have Gaussian distributions
–  demo: http://www.mongrav.org/math/falling-balls-probability.htm
–  e.g., measurement error
–  if effects are multiplicative, logarithm is often
normally distributed
Central Limit Theorem
•  Suppose we sample x1, x2, … xN from a
distribution with mean μ and variance σ2
•  Let $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
•  then $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{N}} \rightarrow N(0, 1)$
•  i.e., $\bar{x}$ is distributed normally with mean $\mu$,
variance $\sigma^2 / N$
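A minimal Monte Carlo illustration of the theorem (a sketch; the exponential source distribution and the sizes N = 50, runs = 10000 are arbitrary choices, not from the slides):

    import numpy as np

    rng = np.random.default_rng(0)
    N, runs = 50, 10000

    # Draw `runs` independent samples of size N from a decidedly non-Gaussian
    # distribution (exponential with mean 1, variance 1), then standardize
    # each sample mean: z = (xbar - mu) / (sigma / sqrt(N)).
    mu, sigma = 1.0, 1.0
    xbar = rng.exponential(scale=1.0, size=(runs, N)).mean(axis=1)
    z = (xbar - mu) / (sigma / np.sqrt(N))

    # If the theorem holds, z is approximately N(0, 1):
    print(z.mean(), z.var())  # should be near 0 and 1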
Important Properties of Normal Distribution
1. Family of normal distributions closed under linear
transformations:
if $X \sim N(\mu, \sigma^2)$ then
$(aX + b) \sim N(a\mu + b,\ a^2\sigma^2)$
2. Linear combination of normals is also normal:
if $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ then
$aX_1 + bX_2 \sim N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
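A quick Monte Carlo sanity check of property 2 (a sketch; the particular constants are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    a, b = 2.0, -3.0
    mu1, s1, mu2, s2 = 1.0, 0.5, -2.0, 1.5

    x1 = rng.normal(mu1, s1, size=100_000)
    x2 = rng.normal(mu2, s2, size=100_000)
    y = a * x1 + b * x2

    # Predicted: mean a*mu1 + b*mu2, variance a^2*s1^2 + b^2*s2^2
    print(y.mean(), a * mu1 + b * mu2)
    print(y.var(), a**2 * s1**2 + b**2 * s2**2)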
Important Properties of Normal Distribution
3. Of all distributions with a given mean and variance, the
normal has maximum entropy
Information theory: Entropy measures “uninformativeness”
Principle of maximum entropy: choose to represent
the world with as uninformative a distribution as
possible, subject to “testable information”
If we know x is in [a, b], then the uniform distribution on [a,
b] has the greatest entropy
If we know the distribution has mean µ and variance σ², the
normal distribution N(µ, σ²) has the greatest entropy
Important Properties of Normal Distribution
4. If errors are normally distributed, a least-squares fit
yields the maximum likelihood estimator
Finding the least-squares x s.t. Ax ≈ b finds the value of x
that maximizes the likelihood of the data under a
Gaussian noise model
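A small sketch of that connection (the setup is assumed for illustration, not from the slides): if b = Ax plus Gaussian noise, the least-squares solution is also the maximum-likelihood estimate of x.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(100, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    b = A @ x_true + rng.normal(scale=0.1, size=100)  # Gaussian measurement error

    # Least-squares fit of Ax ~ b; under the Gaussian-error model this
    # maximizes the likelihood of the observed b.
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(x_hat)  # close to x_true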
Important Properties of Normal Distribution
5. Many derived random variables have
analytically-known densities
e.g., sample mean, sample variance
6. Sample mean and variance of n independent,
identically distributed samples are independent;
the sample mean is a normally-distributed
random variable:
$\bar{X}_n \sim N(\mu,\ \sigma^2/n)$
Distribution of Sample Variance
(For Gaussian R.V. X)
$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$
define $U = \frac{(n-1) s^2}{\sigma^2}$
then U has a χ2 distribution with (n-1) d.o.f.
$p(x) = \left[ 2^{n/2} \, \Gamma\!\left( \frac{n}{2} \right) \right]^{-1} x^{\frac{n}{2} - 1} e^{-x/2}, \quad x \geq 0$
$E[U] = n - 1, \quad Var[U] = 2(n - 1)$
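A Monte Carlo check of these facts (a sketch; n = 10, σ² = 1, and 100000 runs are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(3)
    n, runs, sigma2 = 10, 100_000, 1.0

    x = rng.normal(0.0, np.sqrt(sigma2), size=(runs, n))
    s2 = x.var(axis=1, ddof=1)        # sample variance with the 1/(n-1) factor
    U = (n - 1) * s2 / sigma2

    # Expect E[U] = n - 1 = 9 and Var[U] = 2(n - 1) = 18
    print(U.mean(), U.var())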
The Chi-Squared Distribution
What if we don’t know true variance?
•  Sample mean is normally distributed R.V.
$\bar{X}_n \sim N(\mu,\ \sigma^2/n)$
•  Taking advantage of this presumes we know $\sigma^2$
•  $\frac{\bar{x} - \mu}{s_n / \sqrt{n}}$ has a t distribution with (n-1) d.o.f.
[Student’s] t-distribution
Forming a confidence interval
•  e.g., given that I observed a sample mean of ____,
I’m 99% confident that the true mean lies between
____ and ____.
•  Know that $\frac{\bar{x} - \mu}{s_n / \sqrt{n}}$ has a t distribution
•  Choose q1, q2 such that a Student t with (n-1) d.o.f. has
99% probability of lying between q1, q2
Confidence interval for the mean
if $P\left\{ q_1 < \frac{\bar{x}_n - \mu}{s_n / \sqrt{n}} < q_2 \right\} = 0.99$
then $P\left\{ \bar{x}_n - q_2 \frac{s_n}{\sqrt{n}} < \mu < \bar{x}_n - q_1 \frac{s_n}{\sqrt{n}} \right\} = 0.99$
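A sketch of that computation using scipy (the 99% level matches the slide; the seven observations are placeholders, not real data):

    import numpy as np
    from scipy import stats

    x = np.array([4.1, 3.8, 5.0, 4.4, 4.7, 3.9, 4.2])  # placeholder data
    n = len(x)
    xbar, s = x.mean(), x.std(ddof=1)

    # By symmetry q1 = -q2; a 99% interval leaves 0.5% in each tail.
    q2 = stats.t.ppf(0.995, df=n - 1)
    half = q2 * s / np.sqrt(n)
    print(f"99% CI for mu: [{xbar - half:.3f}, {xbar + half:.3f}]")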
Interpreting Simulation Outcomes
•  How long will customers have to wait, on
average?
–  e.g., for given # tellers, arrival rate, service time
distribution, etc.
•  Simulate bank for N customers
•  Let xi be the wait time of customer i
•  Is mean(x) a good estimate for µ?
•  How to compute a 95% confidence interval
for µ ?
–  Problem: xi are not independent!
Replications
•  Run simulation to get M observations
•  Repeat simulation N times (different random
numbers each time)
•  Treat the sample mean of different runs as
approximately uncorrelated
$s^2 = \frac{1}{n-1} \sum_i (X_i - \bar{X})^2$ (where $X_i$ is the sample mean of run i)
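A sketch of the whole recipe (simulate_bank is a hypothetical stand-in for any stochastic simulation that returns one performance measure per run):

    import numpy as np
    from scipy import stats

    def simulate_bank(seed):
        """Hypothetical placeholder: run one simulation (e.g., M customers)
        and return that run's average wait time."""
        rng = np.random.default_rng(seed)
        return rng.exponential(scale=2.0, size=500).mean()  # stand-in for a real model

    N = 20  # number of independent replications
    run_means = np.array([simulate_bank(seed) for seed in range(N)])

    grand_mean = run_means.mean()
    s = run_means.std(ddof=1)  # s^2 = 1/(N-1) * sum((X_i - Xbar)^2)
    half = stats.t.ppf(0.975, df=N - 1) * s / np.sqrt(N)
    print(f"95% CI: [{grand_mean - half:.3f}, {grand_mean + half:.3f}]")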
Batch Means
•  Run simulation for N (large)
•  Divide xi into k consecutive batches of size b
•  If b large enough, mean(batch1) approx.
uncorrelated with mean(batch2), etc.
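A sketch of batch means (the AR(1) series is an assumed stand-in for correlated simulation output, and the batch size b is a tuning choice):

    import numpy as np

    rng = np.random.default_rng(4)

    # Stand-in for one long run: an AR(1) series, so consecutive
    # observations are correlated, as successive wait times would be.
    N, phi = 100_000, 0.9
    x = np.empty(N)
    x[0] = 0.0
    for i in range(1, N):
        x[i] = phi * x[i - 1] + rng.normal()

    b = 1000                 # batch size: large enough that batch means decorrelate
    k = N // b               # number of batches
    batch_means = x[: k * b].reshape(k, b).mean(axis=1)

    # Treat the k batch means as approximately independent observations.
    print(batch_means.mean(), batch_means.std(ddof=1) / np.sqrt(k))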
Other approaches
•  Use an estimate of the autocorrelation between the xi
to derive a better estimate of the variance that can
be used for a confidence interval
•  Regenerative method: Take advantage of
“regeneration points” or cycles in behavior
–  e.g., points when bank is empty of customers
Simulation Wrap-up
Finally…
Implications
•  Who designed it all?
•  How should we behave?
•  What if we start running too many of our own
simulations?
Software
•  http://en.wikipedia.org/wiki/List_of_computer_simulation_software