Download Monte Carlo analysis I: random sampling - CMU (ECE)

18-660: Numerical Methods for Engineering Design and Optimization Xin Li Department of ECE Carnegie Mellon University Pittsburgh, PA 15213 Slide 1 Overview  Monte Carlo Analysis Random variable  Probability distribution  Random sampling  Slide 2 Random Variables  A random variable is a real-valued function of the outcome of the experiment +1.0 (Experiment 1) x (random variable) = −0.3 (Experiment 2) ... −2.4 (Experiment N) We get different results from different experiments (i.e., the output is random) Slide 3 Probability Distribution   A continuous random variable x is defined by its probability distribution function Probability density function (PDF)  pdfx(tx) denotes the probability per unit length near x = tx b ∫ pdf (τ )⋅ dτ x x x = P(a ≤ x ≤ b ) PDF 0 x a  Cumulative distribution function (CDF)  cdfx(tx) equals the probability of x ≤ tx cdf x (t x ) = tx ∫ pdf (τ )⋅ dτ x x x = P(x ≤ t x ) CDF 0 x −∞ Slide 4 Expectation  Given a random variable x and a function f(x), the expectation of f(x) is the weighted average of the possible values of f(x) E [ f (x )] = +∞ ∫ f (τ )⋅ pdf (τ )⋅ dτ x x x x −∞  A useful equation for expected value calculation E [ f ( x ) + g ( x )] = E [ f ( x )] + [g ( x )] Slide 5 Mean, Variance and Standard Deviation  Mean +∞ E [x ] = ∫ τ x ⋅ pdf x (τ x ) ⋅ dτ x −∞  Variance [ ] ∫ [τ VAR[x ] = E ( x − E [x ]) = 2 +∞ x − E ( x )] ⋅ pdf x (τ x ) ⋅ dτ x 2 −∞  Standard deviation STD[x ] = VAR[x ] VAR[x] is always positive! Slide 6 Mean, Variance and Standard Deviation  Mean measures the “average position” of x PDF1 0  PDF2 Small mean 0 x Large mean x Variance measures the “spread” of the distribution PDF1 PDF2 Δ 0 Small variance x 0 Large variance x Slide 7 Moments and Central Moments  k-th order moment [ ] +∞ E x = ∫ τ xk ⋅ pdf x (τ x ) ⋅ dτ x k −∞   Mean is the first order moment k-th order central moments [ ] ∫ [τ E ( x − E [x ]) = k +∞ x − E ( x )] ⋅ pdf x (τ x ) ⋅ dτ x k −∞  Variance is the second order central moment Slide 8 Normal Distribution A random variable x is Normal if pdf x (t ) = 1 σ 2π ⋅e 0.35 2σ 2 0.3 μ: mean  σ: standard deviation  Denoted as N(μ, σ2)  0.4 − (t − µ )2 0.25 PDF  0.2 0.15 0.1 0.05 0 -5 0 x 5 Standard Normal distribution  If μ = 0 and σ = 1, it is called standard Normal distribution Why is Normal distribution important to us? Slide 9 Normal Distribution   Many physical variations are Normal Central limit theorem: the variation caused by a large number of independent random factors is “almost” Normal 600 1000 1500 400 200 # of Samples # of Samples # of Samples 800 600 400 1000 500 200 0 -0.5 0 y y = x1 0.5 0 -1 -0.5 0 y 0.5 y = x1 + x2 1 0 -4 -2 0 y 2 4 y = x1 + x2 + x3 + x4 + x5 Assume that all xi’s are independent and have the same uniform distribution Slide 10 Multiple Random Variables   Two continuous random variables x and y are defined by their joint probability distribution Joint probability density function ∫ ∫ pdf (τ d b x, y x ,τ y )⋅ dτ x ⋅ dτ y = P(a ≤ x ≤ b, c ≤ y ≤ d ) c a  Joint cumulative distribution function cdf x , y (t x , t y ) = P (x ≤ t x , y ≤ t y ) = tx t y ∫ ∫ pdf (τ x, y x , τ y ) ⋅ dτ x ⋅ dτ y − ∞− ∞ Applicable to more than two random variables Slide 11 Joint Probability Distribution  Example: bivariate Normal distribution Joint probability density function Joint cumulative distribution function Slide 12 Marginal Distribution Function  Marginal probability density function pdf x (t x ) = +∞ ∫ pdf (t ,τ )⋅ dτ x, y x y y −∞ pdf y (t y ) = +∞ ∫ pdf (τ x, y x , t y )⋅ dτ x −∞  Marginal cumulative distribution function cdf x (t x ) = P( x ≤ t x , y ≤ +∞ ) = lim cdf x , y (t x , t y ) t y → +∞ cdf y (t y ) = P (x ≤ +∞, y ≤ t y ) = lim cdf x , y (t x , t y ) t x → +∞ Slide 13 Marginal Distribution Function  Example: bivariate Normal distribution Marginal PDF for x Marginal PDF for y Slide 14 Covariance and Correlation  Covariance COV [x, y ] = E [( x − E [x ]) ⋅ ( y − E [ y ])]   If COV[x,y] = 0, then x and y are uncorrelated Covariance matrix COV [x, x ] COV [x, y ] Σ=  [ ] [ ] , , COV y x COV y y   Σ is always symmetric  Diagonal components are corresponding to variance values  Σ is diagonal if x and y are uncorrelated  Slide 15 Covariance and Correlation Correlation (normalized covariance) COR[x, y ] =  COV [x, y ] STD[x ]⋅ STD[ y ] Correlation between two random variables can be visualized by scatter plot 4 2 y  Zero correlation 0 -2 -4 -4 -2 0 x 2 4 Slide 16 Covariance and Correlation Example: correlated random variables 4 4 2 2 0 0 y y  Positive correlation -2 -4 -4 -2 0 x 2 -2 4 -4 -4 Negative correlation -2 0 x 2 4 Slide 17 Monte Carlo Analysis  Problem definition  Find probability distribution and/or moments of f (X ) Function of interest  Random variable with known distribution In general, the distribution and/or moments of f cannot be calculated analytically, because f(X) is nonlinear  f(X) may not have closed-form expression (we can only numerically calculate f for a given X value)  Slide 18 Monte Carlo Analysis  Monte Carlo analysis for f(X) Randomly select M samples for X  Evaluate function f(X) at each sampling point  Estimate distribution of f using these M samples  Random samples {X(1), X(2), ...} Evaluate f(X) Samples of f(X) Distribution of f(X) Slide 19 Monte Carlo Analysis Example  Example: estimate the probability distribution of y = exp( x )  x ~ N(0,1) (standard Normal distribution) Slide 20 Monte Carlo Analysis Example  Step 1: draw random samples for x Samples 1 2 3 4 5 6 ... x -0.4326 -1.6656 0.1253 0.2877 -1.1465 1.1909 ... M random samples for x  Step 2: calculate y at each sampling point Samples 1 2 3 4 5 6 ... y 0.6488 0.1891 1.1335 1.3333 0.3178 3.2901 ... M random samples for y Slide 21 Monte Carlo Analysis Result Monte Carlo result is typically represented by a histogram  A big table of data is not intuitive 500 Number of Samples  400 300 200 100 0 0 5 10 y 15 20 Histogram of y based on 1000 random samples QUESTION: how accurate is Monte Carlo analysis? Slide 22 Monte Carlo Analysis Accuracy  Monte Carlo analysis is not deterministic We cannot get identical results when running MC twice  The analysis error is not deterministic   Monte Carlo accuracy depends on the number of samples Examples: histogram of y 500 6000 20 400 5000 15 10 5 0 0 2 4 6 y 100 samples 8 10 Number of Samples 25 Number of Samples Number of Samples  300 200 100 0 0 5 10 y 15 1000 samples 20 4000 3000 2000 1000 0 0 5 10 y 15 20 10000 samples Slide 23 Monte Carlo Analysis Accuracy  Example: bivariate Normal distribution  x and y are independent and jointly standard Normal Slide 24 Monte Carlo Analysis Accuracy  Statistical methods exist to analyze Monte Carlo accuracy  Example: Monte Carlo accuracy analysis x ~ N (0,1) Standard Normal distribution Estimate the mean value μx by Monte Carlo analysis  Our question: how accurate is the estimated μx (dependent on the number of Monte Carlo samples)?  Slide 25 Monte Carlo Analysis Accuracy  Monte Carlo analysis for the mean value μx Randomly draw M sampling points {x(1),x(2),...,x(M)}  Estimate μx by the following equation  x (1) + x (2 ) +  + x ( M ) µx = M  Called an estimator Assumptions in our accuracy analysis Each x(i) is random and satisfies standard Normal distribution – it is randomly created for x ~ N(0,1)  All x(i)’s are mutually independent – samples from a good random number generator should be independent   μx is a function of {x(1),x(2),...,x(M)}, which is a random variable Slide 26 Monte Carlo Analysis Accuracy  Mean of μx { } { } { }  x (1) + x (2 ) +  + x ( M )  E x (1) + E x (2 ) +  + E x ( M ) E{µ x } = E  =0 = M M   E{x(i)} = 0  Variance of μx { } Eµ 2 x { } { } { }  x (1) + x (2 ) +  + x ( M )  2  E x (1)2 + E x (2 )2 +  + E x ( M )2 1 = E  =  = 2 M M M    x(i)’s are independent E{x(i)2} = 1 Slide 27 Monte Carlo Analysis Accuracy  E{μx} = 0 ux is an unbiased estimator  Otherwise, if the estimator mean is not equal to the actual mean, it is called a biased estimator   E{μx2} = 1/M Variance decreases as M increases  Distributions of μx for different M values  0.4 1.5 4 0.3 3 0.2 PDF PDF PDF 1 2 0.5 0.1 0 -5 1 0 µx 1 sample 5 0 -5 0 µx 5 0 -5 10 samples μx is a Normal distribution N(0, 1/M) 0 µx 5 100 samples Slide 28 Monte Carlo Analysis Accuracy  “Average” estimation accuracy is better when using larger M  In this μx example  If we require that ±3 sigma of μx is within [-0.1, 0.1] 3 ≤ 0.1 M  M ≥ 900 If we require that ±3 sigma of μx is within [-0.01, 0.01] 3 M ≤ 0.01 M ≥ 90000 Slide 29 Monte Carlo Analysis Accuracy    Accuracy is improved by 10x if the number of samples is increased by 100x 1K ~ 10K sampling points are typically required to achieve reasonable accuracy However, even if you use 10K sampling points, an accurate result is not guaranteed!  Monte Carlo analysis is random, and you can be unlucky (e.g., going beyond ±3 sigma range) Slide 30 Summary  Monte Carlo analysis Random variable  Probability distribution  Random sampling  Slide 31

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Monte Carlo analysis I: random sampling - CMU (ECE)