Bayesian Analysis for Extreme Events
Pao-Shin Chu and Xin Zhao
Department of Meteorology, School of Ocean & Earth Science & Technology, University of Hawaii at Manoa

Why Bayesian inference?
• A rigorous way to make probability statements about the parameters of interest.
• An ability to update these statements as new information is received.
• Recognition that parameters change over time rather than being forever fixed.
• An efficient way to provide a coherent and rational framework for reducing uncertainties by incorporating diverse information sources (e.g., subjective beliefs, historical records, model simulations).
• An example: annual rates of US hurricanes (Elsner and Bossak, 2002).
• Uncertainty modeling and learning from data (Berliner, 2003).

Some applications of Bayesian analysis for climate research
• Change-point analysis for extreme events (e.g., tropical cyclones, heavy rainfall, summer heat waves). Why change-point analysis?
• Tropical cyclone prediction (Chu and Zhao, 2007, J. Climate; Lu, Chu, and Chen, 2010, Weather & Forecasting, accepted).
• Clustering of typhoon tracks in the WNP (Chu et al., 2010, Regional typhoon activity as revealed by track patterns and climate change, in Hurricanes and Climate Change, Elsner et al., Eds., Springer, in press).

Other examples
• Predicting climate variations (e.g., ENSO).
• Quantifying uncertainties in projections of future climate change.

Bayes' theorem
Here θ is the parameter and y is the data. Classical statistics treats θ as a constant; Bayesian inference treats θ as a random quantity. P(y|θ) is the likelihood function and π(θ) is the prior probability distribution:

P(\theta \mid y) = \frac{P(y \mid \theta)\,\pi(\theta)}{\int_{\theta} P(y \mid \theta)\,\pi(\theta)\,d\theta}

Change-point analysis for tropical cyclones
• Given the Poisson intensity parameter λ (i.e., the mean seasonal TC rate), the probability mass function (PMF) of h tropical cyclones occurring in T years is

P(h \mid \lambda, T) = \frac{e^{-\lambda T}(\lambda T)^{h}}{h!}, \qquad h = 0, 1, 2, \ldots, \quad \lambda > 0, \ T > 0.

• λ is regarded as a random variable, not a constant.
• The gamma density is the conjugate prior (and posterior) for λ. A functional choice for λ is the gamma distribution

f(\lambda \mid h', T') = \frac{T'^{\,h'}\,\lambda^{h'-1}}{\Gamma(h')}\, e^{-\lambda T'}, \qquad \lambda > 0, \ h' > 0, \ T' > 0,

where h' and T' are prior parameters.
• The PMF of h tropical cyclones in T years, when the Poisson intensity λ is codified as a gamma density with prior parameters h' and T', is a negative binomial distribution (Epstein, 1985):

P(h \mid h', T', T) = \int_{0}^{\infty} P(h \mid \lambda, T)\, f(\lambda \mid h', T')\, d\lambda
= \frac{\Gamma(h + h')}{\Gamma(h')\, h!} \left(\frac{T}{T + T'}\right)^{h} \left(\frac{T'}{T + T'}\right)^{h'}, \qquad h = 0, 1, \ldots

(A numerical sketch of this predictive distribution is given after the hypothesis model below.)

A hierarchical Bayesian tropical cyclone model (adapted from Elsner and Jagger, 2004): the prior parameters h' and T' determine the rates λi, which in turn generate the counts hi, i = 1, 2, ..., n.

Hypothesis model for change-point analysis (consider three hypotheses: H0, H1, H2)

(1) Hypothesis H0, "no change in the rate" of the typhoon series:

h_i \sim \mathrm{Poisson}(h_i \mid \lambda_0, T), \quad i = 1, 2, \ldots, n, \qquad \lambda_0 \sim \mathrm{gamma}(h_0', T_0'),

where the prior knowledge of the parameters h0' and T0' is given, and T = 1 (annual counts).

(2) Hypothesis H1, "a single change in the rate" of the typhoon series:

h_i \sim \mathrm{Poisson}(h_i \mid \lambda_{11}, T), \quad i = 1, 2, \ldots, \tau - 1,
h_i \sim \mathrm{Poisson}(h_i \mid \lambda_{12}, T), \quad i = \tau, \ldots, n,
\lambda_{11} \sim \mathrm{gamma}(h_{11}', T_{11}'), \qquad \lambda_{12} \sim \mathrm{gamma}(h_{12}', T_{12}'),

where the prior knowledge of the parameters h11', T11', h12', T12' is given. There are two epochs in this model, and τ is defined as the first year of the second epoch, i.e., the change-point. Hypothesis H2, "two changes in the rate", with three epochs separated by change-points τ1 < τ2 and rates λ21, λ22, λ23, is defined analogously.
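The gamma-Poisson predictive model above can be checked numerically. Below is a minimal sketch, not code from the talk: the function name neg_binom_pmf and the prior values h' = 12, T' = 4 are illustrative assumptions, and the comparison with scipy.stats.nbinom is only a sanity check of the closed form.

```python
# A minimal sketch (assumed, not from the talk) of the gamma-Poisson predictive
# model: the PMF of h tropical cyclones in T years when the Poisson rate has a
# gamma(h', T') prior is the negative binomial distribution given above.
import numpy as np
from scipy.special import gammaln
from scipy.stats import nbinom


def neg_binom_pmf(h, h_prime, T_prime, T=1.0):
    """P(h | h', T', T) from the closed-form negative binomial expression."""
    log_p = (gammaln(h + h_prime) - gammaln(h_prime) - gammaln(h + 1)
             + h * np.log(T / (T + T_prime))
             + h_prime * np.log(T_prime / (T + T_prime)))
    return np.exp(log_p)


# Illustrative prior: roughly 12 cyclones "observed" over 4 prior years.
h_prime, T_prime = 12.0, 4.0
counts = np.arange(0, 11)
pmf = neg_binom_pmf(counts, h_prime, T_prime)

# Sanity check against scipy's negative binomial parameterization
# (n = h', p = T' / (T + T') with T = 1).
assert np.allclose(pmf, nbinom.pmf(counts, h_prime, T_prime / (1.0 + T_prime)))
print(np.round(pmf, 3))
```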
Markov chain Monte Carlo (MCMC) approach
• Standard Monte Carlo methods produce a set of independent simulated values according to some probability distribution.
• MCMC methods produce chains in which each simulated value is mildly dependent on the preceding value. The basic principle is that once the chain has run sufficiently long, it finds its way to the desired posterior distribution.
• One of the most widely used MCMC algorithms for producing chain values is the Gibbs sampler. The idea is that if each of the parameters to be estimated can be expressed conditional on all of the others, then by cycling through these conditional statements we eventually reach the true joint distribution of interest.
• Gibbs sampler: for Θ = [θ1, θ2, ..., θp], we can generate a value from the conditional distribution of one component of Θ given the values of all the other components; this involves successive draws from the conditional posterior densities P(θk | h, θ1, ..., θk-1, θk+1, ..., θp) for k from 1 to p.

Bayesian inference under each hypothesis

(1) Bayesian inference under hypothesis H0. There is only one parameter, λ0, under this hypothesis. Since the gamma is the conjugate prior for the Poisson, the conditional posterior density for λ0 is

\lambda_0 \mid h, H_0 \sim \mathrm{gamma}\!\left(h_0' + \sum_{i=1}^{n} h_i,\; T_0' + n\right).

(2) Bayesian inference under hypothesis H1. Under this hypothesis there are three parameters, λ11, λ12, and τ:

\lambda_{11} \mid h, \tau, H_1 \sim \mathrm{gamma}\!\left(h_{11}' + \sum_{i=1}^{\tau-1} h_i,\; T_{11}' + \tau - 1\right),
\lambda_{12} \mid h, \tau, H_1 \sim \mathrm{gamma}\!\left(h_{12}' + \sum_{i=\tau}^{n} h_i,\; T_{12}' + n - \tau + 1\right),
P(\tau \mid h, H_1, \lambda_{11}, \lambda_{12}) \propto e^{(\tau-1)(\lambda_{12} - \lambda_{11})} \left(\frac{\lambda_{11}}{\lambda_{12}}\right)^{\sum_{i=1}^{\tau-1} h_i}.

With the prior knowledge, we can apply the Gibbs sampler to draw samples from the posterior distribution of the model parameters under each respective hypothesis (a sketch of such a sampler for H1 is given below).
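As a concrete illustration of the Gibbs sampling step under H1, here is a minimal sketch that alternates between the two conditional gamma draws and the discrete conditional for τ given above. The synthetic count series, the vague prior values, and the restriction of τ to 2, ..., n (so that both epochs are non-empty) are illustrative assumptions, not settings from the talk.

```python
# A minimal Gibbs-sampler sketch for the single change-point hypothesis H1.
# Synthetic data, vague priors, and the range of tau are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic annual counts with a rate shift after year 30 (illustration only).
h = np.concatenate([rng.poisson(2.0, 30), rng.poisson(4.0, 20)])
n = len(h)

# Vague gamma priors (shape h', rate T') on the two epoch rates.
h11p, T11p = 1.0, 0.5
h12p, T12p = 1.0, 0.5

n_iter, burn = 5000, 1000
lam11, lam12, tau = 1.0, 1.0, n // 2
cum = np.concatenate([[0.0], np.cumsum(h)])   # cum[k] = h_1 + ... + h_k
samples = []

for it in range(n_iter):
    # 1) Conditional gamma posteriors of the two epoch rates (conjugacy);
    #    numpy's gamma uses a scale parameter, i.e. scale = 1 / rate.
    s1 = cum[tau - 1]            # counts in years 1, ..., tau-1
    s2 = cum[n] - cum[tau - 1]   # counts in years tau, ..., n
    lam11 = rng.gamma(h11p + s1, 1.0 / (T11p + tau - 1))
    lam12 = rng.gamma(h12p + s2, 1.0 / (T12p + n - tau + 1))

    # 2) Discrete conditional of the change-point tau (first year of epoch 2):
    #    log P(tau|.) = (tau-1)(lam12-lam11)
    #                   + (log lam11 - log lam12) * sum_{i<tau} h_i + const
    taus = np.arange(2, n + 1)   # assumed range: both epochs non-empty
    logw = (taus - 1) * (lam12 - lam11) \
           + (np.log(lam11) - np.log(lam12)) * cum[taus - 1]
    w = np.exp(logw - logw.max())
    tau = int(rng.choice(taus, p=w / w.sum()))

    if it >= burn:
        samples.append((lam11, lam12, tau))

lam11_s, lam12_s, tau_s = map(np.array, zip(*samples))
print("posterior mean rates:", lam11_s.mean(), lam12_s.mean())
print("most probable change-point year index:", np.bincount(tau_s).argmax())
```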
Hypothesis analysis
The full parameter vector across the three hypotheses (the three rates λ21, λ22, λ23 and the two change-points τ1, τ2 belong to H2) is

\theta = [\lambda_0, \lambda_{11}, \lambda_{12}, \lambda_{21}, \lambda_{22}, \lambda_{23}, \tau_1, \tau_2]',

and the posterior probability of each hypothesis is

P(H \mid h) = \frac{P(h \mid H)\, P(H)}{\sum_{H} P(h \mid H)\, P(H)}.

• Under a uniform prior over the hypothesis space, P(H | h) ∝ P(h | H).
• The marginal likelihood P(h | H) = ∫ P(h | θ, H) P(θ | H) dθ can be approximated by Monte Carlo:

P(h \mid H) \approx \frac{1}{N} \sum_{i=1}^{N} P(h \mid \theta^{[i]}, H), \qquad \theta^{[i]} \sim P(\theta \mid H).

Annual major hurricane count series for the ENP
• P(H2 | h) = 0.784, P(H1 | h) = 0.195, P(H0 | h) = 0.021
• τ1 = 1982 and τ2 = 1999; three epochs

Why RJMCMC?
• Because the parameter spaces of different hypotheses typically differ from each other, a simulation has to be run independently for each candidate hypothesis.
• If the hypotheses have large dimension, this MCMC approach is not efficient.
• Green (1995): reversible jump sampling for moving between spaces of differing dimensions.
• A trans-dimensional Markov chain simulation in which the dimension of the parameter space can change from one iteration to the next.
• Useful for model or hypothesis selection problems.

Four different gamma models (four epochs with three change-points)

Prior specification
With the time series h = [h1, h2, ..., hn]', we run L independent iterations. Within the j-th iteration, 1 ≤ j ≤ L, we randomly pick two different points from 1 to n, say k0 and k1 (k0 < k1). We then calculate the sample mean of this batch of samples {hi, k0 ≤ i ≤ k1}, obtaining a realization of the Poisson rate for this iteration,

\lambda^{[j]} = \frac{1}{k_1 - k_0 + 1} \sum_{i=k_0}^{k_1} h_i.

In the end, we obtain a set of samples {λ^{[j]}, 1 ≤ j ≤ L} and set

T' = \frac{m}{s^2} \quad \text{and} \quad h' = m\, T', \qquad \text{where } m = \frac{1}{L}\sum_{j=1}^{L} \lambda^{[j]} \ \text{and} \ s^2 = \frac{1}{L-1}\sum_{j=1}^{L} \left(\lambda^{[j]} - m\right)^2.

(A sketch of this procedure is given after the summary.)

Extreme rainfall events in Hawaii

Summary
• Why Bayesian analysis
• Applications for climate research (extreme events and climate change)
• Change-point analysis
  - Mathematical model of rare-event count series
  - Hypothesis model
  - Bayesian inference under each hypothesis
  - Major hurricane series in the eastern North Pacific
• Recent advance: RJMCMC
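Closing illustration: a minimal sketch of the prior-specification procedure described before the summary, which moment-matches a gamma(h', T') prior from random-batch rate samples. The function name gamma_prior_from_counts, the choice L = 2000, and the synthetic input series are illustrative assumptions.

```python
# A minimal sketch of the prior-specification procedure described above:
# sample L random sub-periods of the count series, take their batch means as
# Poisson-rate realizations, and moment-match a gamma(h', T') prior via
# T' = m / s^2 and h' = m * T'.
import numpy as np


def gamma_prior_from_counts(h, L=2000, rng=None):
    """Return (h_prime, T_prime) for a gamma prior on the Poisson rate."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.asarray(h, dtype=float)
    n = len(h)
    lam = np.empty(L)
    for j in range(L):
        # Pick two different points k0 < k1 and average the batch {h_i, k0 <= i <= k1}.
        k0, k1 = np.sort(rng.choice(n, size=2, replace=False))
        lam[j] = h[k0:k1 + 1].mean()
    m = lam.mean()
    s2 = lam.var(ddof=1)
    T_prime = m / s2
    h_prime = m * T_prime
    return h_prime, T_prime


# Example with a synthetic count series (illustration only).
counts = np.random.default_rng(1).poisson(3.0, 60)
print(gamma_prior_from_counts(counts))
```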