MP/BME 574 Lecture 19 (Mar. 24): Random number generators and Monte Carlo methods

Learning Objectives:
   Random number generators
   Introduction to Monte Carlo methods
   Iterative methods and "blind" deconvolution

Assignment:
1. Read "An iterative technique for the rectification of observed distributions," by Lucy, available on the website in pdf.
2. Read Park and Miller, "Random number generators: good ones are hard to find."

I. Random number generators
a. The most common algorithm is Lehmer's algorithm.
   i. Iterative method:
         z_{n+1} = a z_n \bmod m,   u_{n+1} = z_{n+1}/m,   z_0 = "seed",
      where m is a large prime number and the period is at most (m - 1). (If m is not prime, the algorithm can collapse to zero.)
b. "Pseudorandom" number generator
   i. Using the same seed produces the same series.
   ii. The sequence repeats with period m - 1. For example, with m = 13 and a = 6:

      >> z(1) = 1; m = 13; a = 6;
      >> for i = 1:20,
             z(i+1) = mod(a*z(i),m);
             % u(i+1) = z(i+1)/m;
         end
      >> z
      z =
          1   6  10   8   9   2  12   7   3   5   4  11   1   6  10   8   9   2  12   7   3

      (A version using the Park and Miller choice of a and m is sketched at the end of this section.)
   iii. Matlab e.g.
      1. "rand()" or just "rand" returns a single random number.
      2. rand(n) returns an n-by-n matrix of random numbers.
      3. The same sequence is generated until the "state" is reset, i.e. a new seed is introduced. From "help rand":

      RAND Uniformly distributed random numbers.
         RAND(N) is an N-by-N matrix with random entries, chosen from a uniform distribution on the interval (0.0,1.0).
         RAND(M,N) and RAND([M,N]) are M-by-N matrices with random entries.
         RAND(M,N,P,...) or RAND([M,N,P,...]) generate random arrays.
         RAND with no arguments is a scalar whose value changes each time it is referenced.
         RAND(SIZE(A)) is the same size as A.
         RAND produces pseudo-random numbers. The sequence of numbers generated is determined by the state of the generator. Since MATLAB resets the state at start-up, the sequence of numbers generated will be the same unless the state is changed.
         S = RAND('state') is a 35-element vector containing the current state of the uniform generator. RAND('state',S) resets the state to S. RAND('state',0) resets the generator to its initial state. RAND('state',J), for integer J, resets the generator to its J-th state. RAND('state',sum(100*clock)) resets it to a different state each time.
         This generator can generate all the floating point numbers in the closed interval [2^(-53), 1-2^(-53)]. Theoretically, it can generate over 2^1492 values before repeating itself.

c. hist(g, Nbins), where g is a vector of pseudorandom numbers and Nbins is the number of bins to use in grouping counts:

      freq = hist(g,Nbins)
      >> g = rand([1024, 1]);
      >> freq = hist(g,64)
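As a concrete follow-up to the Lehmer iteration in I.a and the assigned Park and Miller reading, the following is a minimal MATLAB sketch using their "minimal standard" constants a = 7^5 = 16807 and m = 2^31 - 1. The seed and the number of deviates drawn are arbitrary illustrative choices.

      % Minimal-standard Lehmer generator (Park and Miller constants).
      % Note: a*z stays below 2^53 here, so mod() is exact in MATLAB doubles.
      a = 16807; m = 2^31 - 1;
      z = 1;                      % seed z_0, any integer in 1..m-1
      N = 10;                     % number of deviates to draw (illustrative)
      u = zeros(1,N);
      for n = 1:N
          z = mod(a*z, m);        % z_{n+1} = a*z_n mod m
          u(n) = z/m;             % uniform deviate u_{n+1} in (0,1)
      end
      disp(u)

With these constants the full period is m - 1 = 2,147,483,646, compared with 12 for the toy example above.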
II. Random samples from probability distribution functions
a. "Monte Carlo" methods
b. Gaussian noise example
   i. A random (or pseudorandom) number generator should produce a uniform distribution.
   ii. This implies that all outcomes are equally probable.
   iii. How can we generate random samples of any given pdf? Referring to Figure 1 (rejection sampling of f(x)):
      1. Draw x_i using rand and evaluate y(x_i) = f(x_i).
      2. Draw y_i = c*rand, i.e. y_i uniform on [0, c], where c \ge \max f(x).
      3. Is y_i \le y(x_i)? If yes, return x_i (case "o" in Figure 1); if no, reject it (case "x" in Figure 1).
c. The samples in each bin of Figure 1 are binomially distributed. For the bin centered at x_i with width \Delta x,
         P_B(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x},
   where n is the number of trial points falling into the interval \Delta x, x is the number of successes (accepted points), and p = \Pr(\mathrm{success}) = f(x_i)/c.
d. The mean and the variance therefore depend on the number of trials. If the N total trials populate the bins equally, n \approx N/N_{bins}, and
         \mu = np = n f(x_i)/c,
         \sigma^2 = np(1-p) = n \frac{f(x_i)}{c}\left(1 - \frac{f(x_i)}{c}\right),
         SNR = \mu/\sigma = \sqrt{n f(x_i)/c} \Big/ \sqrt{1 - f(x_i)/c}.
   Therefore, we approach (improve) our estimate of the function f(x_i) with root-n dependence.

      % Zero-mean Gaussian noise generator by rejection sampling.
      % Recall that rand() returns a random number between [0,1], so scaling is required.
      clear g; mu = 0; bins = 100; sigma = 10;
      c = 1/(sqrt(2*pi)*sigma).*exp(-(0-mu).^2/(2*sigma^2));   % peak of the Gaussian pdf
      c = c + 0.01;                                            % ensure c >= max f(x)
      index = 256*256*100;                                     % number of trials
      range = 60;
      binwidth = range/bins;
      for i = 1:index
          x = range*(rand-0.5);                                % candidate sample, uniform on [-30, 30]
          f_x = 1/(sqrt(2*pi)*sigma).*exp(-(x-mu).^2/(2*sigma^2));
          r_x = f_x/c;
          u = rand;
          if (r_x > u)                                         % accept with probability f(x)/c
              g(i) = x;
          end
      end
      g = g(find(g));                                          % keep only the accepted samples
      counts = length(g);
      xx = -30:0.01:30;
      G = counts.*binwidth.*1/(sqrt(2*pi)*sigma).*exp(-(xx-mu).^2/(2*sigma^2));
      figure; hist(g,100); hold
      plot(xx,G,'-k','Linewidth',2)

III. Applications of Monte Carlo methods for solving cases where a point response function, p(x), is not known or not easily measured.
a. Generalized Monte Carlo algorithm:
   1. Generate a random number.
   2. "Guess" within some constraints or boundary on the problem, i.e. map f(x) to the coordinate space (Figure 1).
   3. Cost function: "Is the point inside the relevant coordinate space?"
   4. If yes, store the value.
   5. Repeat.

IV. Blind deconvolution
a. In the blind deconvolution problem, both f and h need to be estimated simultaneously. If nothing is known about either function, this is not possible.
b. Therefore, a strategy is to combine the robust convergence properties of iterative techniques with a priori assumptions about the form of the data, including statistical models of uncertainty in the measurements.
   i. General assumptions about the physical boundary conditions and uncertainty in the data,
      1. e.g. non-negativity and compact support.
   ii. Statistical models of variation in the measured data:
      1. e.g. Poisson or Gaussian distributed.
      2. This leads to estimates of expected values for the measured data for Maximum Likelihood (ML) optimization.
   iii. The physical parameters constrain the solution, while the ML approach provides a criterion for evaluating convergence.
c. Maximum Likelihood:
   i. Consider a data estimation problem in which the uncertainty in the measured data is assumed to be governed by a Gaussian probability density function (pdf):
         \Pr(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(y_i - \hat{y}_i)^2 / (2\sigma^2)}.
      It is acceptable to evaluate the log-likelihood, since log(Pr) is a monotonically increasing function of Pr. Therefore, we maximize the total probability by maximizing
         \ln \Pr = \sum_i \left[ \ln\frac{1}{\sqrt{2\pi\sigma^2}} - \frac{(y_i - \hat{y}_i)^2}{2\sigma^2} \right].
      Therefore, the log-likelihood of the measured data is maximized for a model in which \sum_i (y_i - \hat{y}_i)^2 / (2\sigma^2) is minimized.
d. Now consider the iterative ML approach to the blind deconvolution problem:
         \hat{f}_0(n1,n2) = g(n1,n2),
         \hat{f}_{k+1}(n1,n2) = \hat{f}_k(n1,n2) + \lambda \left[ g(n1,n2) - \hat{f}_k(n1,n2) * \hat{h}_k(n1,n2) \right];
   then at each k,
         \hat{g}_k(n1,n2) = \hat{f}_k(n1,n2) * \hat{h}_k(n1,n2),   and   LSE = \sum_{n1} \sum_{n2} \left[ g - \hat{g}_k \right]^2
   is minimized and used to optimize the convergence. The conditions for convergence are similar to those of the iterative procedure when h(n1,n2) is known, except that convergence to the inverse filter is no longer guaranteed and is sensitive to noise, the choice of \lambda, and the initial guess \hat{f}_0(n1,n2).
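To make the update in IV.d concrete, here is a minimal MATLAB sketch of the additive iteration \hat{f}_{k+1} = \hat{f}_k + \lambda [g - \hat{f}_k * \hat{h}], written for the simpler case in which the kernel \hat{h} is held fixed (in true blind deconvolution \hat{h}_k would also be re-estimated at each pass). The test object, kernel width, \lambda, and iteration count are illustrative assumptions, not values from the lecture.

      % Iterative (additive) deconvolution sketch with a known, fixed kernel h;
      % non-negativity is enforced and LSE is monitored at each step.
      [xk, yk] = meshgrid(-4:4);
      h = exp(-(xk.^2 + yk.^2)/(2*2^2)); h = h/sum(h(:));   % assumed Gaussian blur kernel
      f_true = zeros(64); f_true(24:40, 24:40) = 1;         % simple test object
      g = conv2(f_true, h, 'same');                         % simulated measurement g = f * h
      f_hat = g;                                            % initial guess f_0 = g
      lambda = 1;                                           % relaxation parameter (illustrative)
      nIter = 50;
      LSE = zeros(1, nIter);
      for k = 1:nIter
          g_hat = conv2(f_hat, h, 'same');                  % current model of the data, g_k
          f_hat = f_hat + lambda*(g - g_hat);               % additive correction toward g
          f_hat(f_hat < 0) = 0;                             % non-negativity constraint
          LSE(k) = sum((g(:) - g_hat(:)).^2);               % least-squares error
      end
      figure; semilogy(LSE); xlabel('iteration k'); ylabel('LSE');

Monitoring LSE mirrors the convergence criterion above; with \lambda too large, a noisy g, or a poor initial guess, the iteration can stagnate or diverge, consistent with the sensitivity noted in IV.d.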
e. Statistical model of the convolution process:
   i. This derives from the ML concept applied to a statistical model of the convolution process. In this approach, x is a random variable, and \hat{h}(x) represents an estimate of a probability density function, h(x), that models the missing or unknown data.
   ii. Intuitively, h(x) is the superposition of multiple random processes used as probes (i.e. individual photons, or molecules of dye) used to measure the response of the system. The physical system must adhere to mass balance and, for finite counting statistics, non-negativity.
   iii. For example, consider
         \phi(x) = \int \psi(\xi) \, p(x|\xi) \, d\xi,
      where \phi(x) is our measured image data, \psi(\xi) is the desired corrected image, and p(x|\xi) is a conditional probability density function (kernel) that relates the expected value of the data to the measured data, e.g.
         p(x|\xi) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\xi)^2/(2\sigma^2)},
      assuming the photon counts in our x-ray image are approximately Gaussian distributed about their expected value. For this example, then,
         \phi(x) = \int \psi(\xi) \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\xi)^2/(2\sigma^2)} d\xi = \psi(x) * p(x)
      becomes our familiar convolution process.
f. Expectation maximization
   i. "Expectation" in the sense that we use a statistical model of the data measurement process to estimate the missing data.
   ii. "Maximization" in the maximum-likelihood sense, where iteration is used within appropriate physical constraints to maximize the likelihood of the probability model for the image.
   iii. Consider an "inverse" conditional probability function given by Bayes' theorem:
         Q(\xi|x) = \frac{\psi(\xi) \, p(x|\xi)}{\phi(x)}.
      Then it is possible to estimate the value of the inverse probability function iteratively from the current guesses \phi_k(x), \psi_k(\xi) at the k-th iteration of the measured image and deconvolved image, respectively. Our iterative estimate of the inverse filter is then
         Q_k(\xi|x) = \frac{\psi_k(\xi) \, p(x|\xi)}{\phi_k(x)},
      where
         \phi_k(x) = \int \psi_k(\xi) \, p(x|\xi) \, d\xi,   and   \psi_k(\xi) = \int \phi(x) \, Q_{k-1}(\xi|x) \, dx.
      Putting this all together, starting with the last result and substituting, the iterative estimate of the image is
         \psi_{k+1}(\xi) = \int \phi(x) \frac{\psi_k(\xi) \, p(x|\xi)}{\phi_k(x)} dx = \psi_k(\xi) \int \frac{\phi(x)}{\phi_k(x)} p(x|\xi) \, dx.
      This is guaranteed to converge if \psi_k(\xi) and \phi_k(x) are non-negative and their respective areas are conserved, because \phi_k(x) will then approach \phi(x) and \psi_{k+1}(\xi) will in turn approach \psi_k(\xi). Note that the model has remained general: as long as it follows the requirements of a probability density function (pdf), its form can depend on the desired application. This is not to say that the algorithm is guaranteed to converge to the global maximum-likelihood result, although in practice the algorithm is very robust in applications with sufficient SNR.
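The multiplicative update in IV.f.iii is the iteration described in the assigned Lucy paper (the Richardson-Lucy iteration). As an illustration, here is a minimal discrete MATLAB sketch of \psi_{k+1} = \psi_k \cdot \mathrm{corr}(\phi/\phi_k, p), assuming a known kernel p(x|\xi) normalized to unit area; the test object, kernel, and iteration count are illustrative assumptions, not values from the lecture.

      % Lucy-type multiplicative update: psi_{k+1} = psi_k .* corr( phi./phi_k , p ),
      % with phi_k = conv(psi_k, p). The kernel p is assumed known and normalized,
      % so non-negativity and (approximate) area conservation hold at each iteration.
      [xk, yk] = meshgrid(-4:4);
      p = exp(-(xk.^2 + yk.^2)/(2*1.5^2)); p = p/sum(p(:));   % normalized Gaussian kernel p(x|xi)
      psi_true = zeros(64); psi_true(28:36, 20:44) = 1;        % simple test object psi(xi)
      phi = conv2(psi_true, p, 'same');                        % measured (blurred) image phi(x)
      psi = phi;                                               % non-negative initial guess psi_0
      for k = 1:100
          phi_k = conv2(psi, p, 'same');                       % phi_k(x) = integral of psi_k(xi) p(x|xi)
          ratio = phi ./ max(phi_k, eps);                      % phi(x)/phi_k(x), guarded against division by zero
          psi = psi .* conv2(ratio, rot90(p,2), 'same');       % correlation with p (rot90 flips the kernel)
      end
      figure; imagesc(psi); axis image; colormap gray; title('\psi_k after 100 iterations');

For comparison, MATLAB's Image Processing Toolbox function deconvlucy implements this same class of iteration.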