Download Monte Carlo analysis I: random sampling - CMU (ECE)

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
18-660: Numerical Methods for
Engineering Design and Optimization
Xin Li
Department of ECE
Carnegie Mellon University
Pittsburgh, PA 15213
Slide 1
Overview

Monte Carlo Analysis
Random variable
 Probability distribution
 Random sampling

Slide 2
Random Variables

A random variable is a real-valued function of the outcome of
the experiment
+1.0 (Experiment 1)
x (random variable) =
−0.3 (Experiment 2)
...
−2.4 (Experiment N)
We get different results from different experiments (i.e.,
the output is random)
Slide 3
Probability Distribution


A continuous random variable x is defined by its probability
distribution function
Probability density function (PDF)

pdfx(tx) denotes the probability per unit
length near x = tx
b
∫ pdf (τ )⋅ dτ
x
x
x
= P(a ≤ x ≤ b )
PDF
0
x
a

Cumulative distribution function (CDF)

cdfx(tx) equals the probability of x ≤ tx
cdf x (t x ) =
tx
∫ pdf (τ )⋅ dτ
x
x
x
= P(x ≤ t x )
CDF
0
x
−∞
Slide 4
Expectation

Given a random variable x and a function f(x), the expectation
of f(x) is the weighted average of the possible values of f(x)
E [ f (x )] =
+∞
∫ f (τ )⋅ pdf (τ )⋅ dτ
x
x
x
x
−∞

A useful equation for expected value calculation
E [ f ( x ) + g ( x )] = E [ f ( x )] + [g ( x )]
Slide 5
Mean, Variance and Standard Deviation

Mean
+∞
E [x ] = ∫ τ x ⋅ pdf x (τ x ) ⋅ dτ x
−∞

Variance
[
] ∫ [τ
VAR[x ] = E ( x − E [x ]) =
2
+∞
x − E ( x )] ⋅ pdf x (τ x ) ⋅ dτ x
2
−∞

Standard deviation
STD[x ] = VAR[x ]
VAR[x] is always positive!
Slide 6
Mean, Variance and Standard Deviation

Mean measures the “average position” of x
PDF1
0

PDF2
Small mean
0
x
Large mean
x
Variance measures the “spread” of the distribution
PDF1
PDF2
Δ
0
Small variance
x
0
Large variance
x
Slide 7
Moments and Central Moments

k-th order moment
[ ]
+∞
E x = ∫ τ xk ⋅ pdf x (τ x ) ⋅ dτ x
k
−∞


Mean is the first order moment
k-th order central moments
[
] ∫ [τ
E ( x − E [x ]) =
k
+∞
x − E ( x )] ⋅ pdf x (τ x ) ⋅ dτ x
k
−∞

Variance is the second order central moment
Slide 8
Normal Distribution
A random variable x is Normal if
pdf x (t ) =
1
σ 2π
⋅e
0.35
2σ 2
0.3
μ: mean
 σ: standard deviation
 Denoted as N(μ, σ2)

0.4
− (t − µ )2
0.25
PDF

0.2
0.15
0.1
0.05
0
-5
0
x
5
Standard Normal distribution

If μ = 0 and σ = 1, it is called standard Normal distribution
Why is Normal distribution important to us?
Slide 9
Normal Distribution


Many physical variations are Normal
Central limit theorem: the variation caused by a large number
of independent random factors is “almost” Normal
600
1000
1500
400
200
# of Samples
# of Samples
# of Samples
800
600
400
1000
500
200
0
-0.5
0
y
y = x1
0.5
0
-1
-0.5
0
y
0.5
y = x1 + x2
1
0
-4
-2
0
y
2
4
y = x1 + x2 + x3 + x4 + x5
Assume that all xi’s are independent and have the same uniform distribution
Slide 10
Multiple Random Variables


Two continuous random variables x and y are defined by their
joint probability distribution
Joint probability density function
∫ ∫ pdf (τ
d b
x, y
x
,τ y )⋅ dτ x ⋅ dτ y = P(a ≤ x ≤ b, c ≤ y ≤ d )
c a

Joint cumulative distribution function
cdf x , y (t x , t y ) = P (x ≤ t x , y ≤ t y ) =
tx t y
∫ ∫ pdf (τ
x, y
x
, τ y ) ⋅ dτ x ⋅ dτ y
− ∞− ∞
Applicable to more than two random variables
Slide 11
Joint Probability Distribution

Example: bivariate Normal distribution
Joint probability density function
Joint cumulative distribution function
Slide 12
Marginal Distribution Function

Marginal probability density function
pdf x (t x ) =
+∞
∫ pdf (t ,τ )⋅ dτ
x, y
x
y
y
−∞
pdf y (t y ) =
+∞
∫ pdf (τ
x, y
x
, t y )⋅ dτ x
−∞

Marginal cumulative distribution function
cdf x (t x ) = P( x ≤ t x , y ≤ +∞ ) = lim cdf x , y (t x , t y )
t y → +∞
cdf y (t y ) = P (x ≤ +∞, y ≤ t y ) = lim cdf x , y (t x , t y )
t x → +∞
Slide 13
Marginal Distribution Function

Example: bivariate Normal distribution
Marginal
PDF for x
Marginal
PDF for y
Slide 14
Covariance and Correlation

Covariance
COV [x, y ] = E [( x − E [x ]) ⋅ ( y − E [ y ])]


If COV[x,y] = 0, then x and y are uncorrelated
Covariance matrix
COV [x, x ] COV [x, y ]
Σ=

[
]
[
]
,
,
COV
y
x
COV
y
y


Σ is always symmetric
 Diagonal components are corresponding to variance values
 Σ is diagonal if x and y are uncorrelated

Slide 15
Covariance and Correlation
Correlation (normalized covariance)
COR[x, y ] =

COV [x, y ]
STD[x ]⋅ STD[ y ]
Correlation between two random variables can be visualized
by scatter plot
4
2
y

Zero
correlation
0
-2
-4
-4
-2
0
x
2
4
Slide 16
Covariance and Correlation
Example: correlated random variables
4
4
2
2
0
0
y
y

Positive
correlation
-2
-4
-4
-2
0
x
2
-2
4
-4
-4
Negative
correlation
-2
0
x
2
4
Slide 17
Monte Carlo Analysis

Problem definition

Find probability distribution and/or moments of
f (X )
Function of
interest

Random variable with
known distribution
In general, the distribution and/or moments of f cannot be
calculated analytically, because
f(X) is nonlinear
 f(X) may not have closed-form expression (we can only
numerically calculate f for a given X value)

Slide 18
Monte Carlo Analysis

Monte Carlo analysis for f(X)
Randomly select M samples for X
 Evaluate function f(X) at each sampling point
 Estimate distribution of f using these M samples

Random samples
{X(1), X(2), ...}
Evaluate
f(X)
Samples of f(X)
Distribution of
f(X)
Slide 19
Monte Carlo Analysis Example

Example: estimate the probability distribution of
y = exp( x )

x ~ N(0,1) (standard Normal distribution)
Slide 20
Monte Carlo Analysis Example

Step 1: draw random samples for x
Samples
1
2
3
4
5
6
...
x
-0.4326
-1.6656
0.1253
0.2877
-1.1465
1.1909
...
M random samples for x

Step 2: calculate y at each sampling point
Samples
1
2
3
4
5
6
...
y
0.6488
0.1891
1.1335
1.3333
0.3178
3.2901
...
M random samples for y
Slide 21
Monte Carlo Analysis Result
Monte Carlo result is typically represented by a histogram

A big table of data is not intuitive
500
Number of Samples

400
300
200
100
0
0
5
10
y
15
20
Histogram of y based on 1000 random samples
QUESTION: how accurate is Monte Carlo analysis?
Slide 22
Monte Carlo Analysis Accuracy

Monte Carlo analysis is not deterministic
We cannot get identical results when running MC twice
 The analysis error is not deterministic


Monte Carlo accuracy depends on the number of samples
Examples: histogram of y
500
6000
20
400
5000
15
10
5
0
0
2
4
6
y
100 samples
8
10
Number of Samples
25
Number of Samples
Number of Samples

300
200
100
0
0
5
10
y
15
1000 samples
20
4000
3000
2000
1000
0
0
5
10
y
15
20
10000 samples
Slide 23
Monte Carlo Analysis Accuracy

Example: bivariate Normal distribution

x and y are independent and jointly standard Normal
Slide 24
Monte Carlo Analysis Accuracy

Statistical methods exist to analyze Monte Carlo accuracy

Example: Monte Carlo accuracy analysis
x ~ N (0,1)
Standard Normal distribution
Estimate the mean value μx by Monte Carlo analysis
 Our question: how accurate is the estimated μx (dependent on
the number of Monte Carlo samples)?

Slide 25
Monte Carlo Analysis Accuracy

Monte Carlo analysis for the mean value μx
Randomly draw M sampling points {x(1),x(2),...,x(M)}
 Estimate μx by the following equation

x (1) + x (2 ) +  + x ( M )
µx =
M

Called an estimator
Assumptions in our accuracy analysis
Each x(i) is random and satisfies standard Normal distribution –
it is randomly created for x ~ N(0,1)
 All x(i)’s are mutually independent – samples from a good
random number generator should be independent


μx is a function of {x(1),x(2),...,x(M)}, which is a random variable
Slide 26
Monte Carlo Analysis Accuracy

Mean of μx
{ } { }
{ }
 x (1) + x (2 ) +  + x ( M )  E x (1) + E x (2 ) +  + E x ( M )
E{µ x } = E 
=0
=
M
M


E{x(i)} = 0

Variance of μx
{ }
Eµ
2
x
{ } { }
{
}
 x (1) + x (2 ) +  + x ( M )  2  E x (1)2 + E x (2 )2 +  + E x ( M )2
1
= E 
=
 =
2
M
M
M

 
x(i)’s are independent
E{x(i)2} = 1
Slide 27
Monte Carlo Analysis Accuracy

E{μx} = 0
ux is an unbiased estimator
 Otherwise, if the estimator mean is not equal to the actual
mean, it is called a biased estimator


E{μx2} = 1/M
Variance decreases as M increases
 Distributions of μx for different M values

0.4
1.5
4
0.3
3
0.2
PDF
PDF
PDF
1
2
0.5
0.1
0
-5
1
0
µx
1 sample
5
0
-5
0
µx
5
0
-5
10 samples
μx is a Normal distribution N(0, 1/M)
0
µx
5
100 samples
Slide 28
Monte Carlo Analysis Accuracy

“Average” estimation accuracy is better when using larger M

In this μx example

If we require that ±3 sigma of μx is within [-0.1, 0.1]
3
≤ 0.1
M

M ≥ 900
If we require that ±3 sigma of μx is within [-0.01, 0.01]
3
M
≤ 0.01
M ≥ 90000
Slide 29
Monte Carlo Analysis Accuracy



Accuracy is improved by 10x if the number of samples is
increased by 100x
1K ~ 10K sampling points are typically required to achieve
reasonable accuracy
However, even if you use 10K sampling points, an accurate
result is not guaranteed!

Monte Carlo analysis is random, and you can be unlucky (e.g.,
going beyond ±3 sigma range)
Slide 30
Summary

Monte Carlo analysis
Random variable
 Probability distribution
 Random sampling

Slide 31
Related documents