Bayesian Analysis for Extreme Events
Pao-Shin Chu and Xin Zhao
Department of Meteorology
School of Ocean and Earth Science and Technology, University of Hawaii at Manoa
Why Bayesian inference?
• A rigorous way to make probability statements
about the parameters of interest.
• An ability to update these statements as new
information is received.
• Recognition that parameters are changing over time
rather than forever fixed.
• An efficient way to provide a coherent and rational framework for reducing uncertainties by incorporating diverse information sources (e.g., subjective beliefs, historical records, model simulations). An example: annual rates of US hurricanes (Elsner and Bossak, 2002).
• Uncertainty modeling and learning from data (Berliner, 2003)
Some applications of Bayesian analysis for climate research
• Change-point analysis for extreme events (e.g., tropical cyclones, heavy rainfall, summer heat waves)
Why change-point analysis?
• Tropical cyclone prediction (Chu and Zhao, 2007, J. Climate; Lu, Chu, and Chen, 2010, Weather & Forecasting, accepted)
• Clustering of typhoon tracks in the WNP (Chu et al., 2010, Regional typhoon activity as revealed by track patterns and climate change, in Hurricanes and Climate Change, Elsner et al., Eds., Springer, in press)
Other Examples
• Predicting climate variations (e.g., ENSO)
• Quantifying uncertainties in projections of future climate change
Bayes’ theorem
θ: parameter
Classical statistics: θ is a constant.
Bayesian inference: θ is a random quantity with posterior distribution P(θ | y).
y: data
P(y | θ): likelihood function
π(θ): prior probability distribution

$$P(\theta \mid y) = \frac{P(y \mid \theta)\,\pi(\theta)}{\int_{\theta} P(y \mid \theta)\,\pi(\theta)\,d\theta}$$
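As an illustration not in the original slides, here is a minimal numerical sketch of Bayes' theorem for a Poisson rate on a discrete grid (the prior shape and the observed count y = 4 are hypothetical choices):

```python
import numpy as np

# Minimal sketch: Bayes' theorem on a discrete grid for a Poisson rate lambda.
lam_grid = np.linspace(0.01, 10, 1000)        # candidate parameter values
prior = np.exp(-lam_grid / 2.0)               # assumed exponential-shaped prior (illustrative)
prior /= prior.sum()                          # normalize the prior over the grid

y = 4                                         # hypothetical observed count
likelihood = lam_grid**y * np.exp(-lam_grid)  # Poisson likelihood, constant factor 1/y! dropped

posterior = likelihood * prior                # numerator of Bayes' theorem
posterior /= posterior.sum()                  # divide by the normalizing constant (denominator)

print("posterior mean of lambda:", np.sum(lam_grid * posterior))
```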
Change-point analysis for tropical cyclones
• Given the Poisson intensity parameter λ (i.e., the mean seasonal TC rate), the probability mass function (PMF) of h tropical cyclones occurring in T years is

$$P(h \mid \lambda, T) = \frac{(\lambda T)^{h}}{h!} \exp(-\lambda T)$$

• where h = 0, 1, 2, ... and λ > 0, T > 0. Here λ is regarded as a random variable, not a constant.
• The gamma density is the conjugate prior (and posterior) for λ. A functional choice for λ is the gamma distribution

$$f(\lambda \mid h', T') = \frac{T'^{\,h'}\,\lambda^{h'-1}}{\Gamma(h')} \exp(-T'\lambda)$$

• where λ > 0, h′ > 0, T′ > 0; h′ and T′ are the prior parameters.
• The PMF of h tropical cyclones in T years, when the Poisson intensity λ is codified as a gamma density with prior parameters T′ and h′, is a negative binomial distribution (Epstein, 1985)

$$P(h \mid h', T', T) = \int_{0}^{\infty} P(h \mid \lambda, T)\, f(\lambda \mid h', T')\, d\lambda
= \frac{\Gamma(h + h')}{\Gamma(h')\, h!} \left(\frac{T'}{T + T'}\right)^{h'} \left(\frac{T}{T + T'}\right)^{h}$$

• where h = 0, 1, ..., h′ > 0, T′ > 0, T > 0.
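To make the Poisson-gamma predictive concrete, here is a minimal Python sketch (the prior values h′ = 5, T′ = 10 are illustrative, not from the slides) that checks the closed-form negative binomial against a Monte Carlo gamma-Poisson mixture:

```python
import numpy as np
from scipy.special import gammaln

def neg_binom_predictive(h, h_prime, T_prime, T):
    """Closed-form predictive P(h | h', T', T) given above (Epstein, 1985)."""
    log_p = (gammaln(h + h_prime) - gammaln(h_prime) - gammaln(h + 1)
             + h_prime * np.log(T_prime / (T + T_prime))
             + h * np.log(T / (T + T_prime)))
    return np.exp(log_p)

# Illustrative (hypothetical) prior parameters and a 1-year window
h_prime, T_prime, T = 5.0, 10.0, 1.0

# Monte Carlo check: lambda ~ gamma(shape=h', rate=T'), then h ~ Poisson(lambda * T)
rng = np.random.default_rng(0)
lam = rng.gamma(shape=h_prime, scale=1.0 / T_prime, size=200_000)
counts = rng.poisson(lam * T)

for h in range(5):
    mc = np.mean(counts == h)
    exact = neg_binom_predictive(h, h_prime, T_prime, T)
    print(f"h={h}: Monte Carlo {mc:.4f}  closed form {exact:.4f}")
```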
A hierarchical Bayesian tropical cyclone model
[Graphical model: prior parameters h′ and T′ determine the rate λ_i, which generates the annual count h_i, for i = 1, 2, ..., n; adapted from Elsner and Jagger (2004)]
Hypothesis model for change-point analysis (consider three hypotheses: H0, H1, H2)

(1) Hypothesis H0: "no change of the rate" of the typhoon series:

$$h_i \sim \mathrm{Poisson}(h_i \mid \lambda_0, T), \quad i = 1, 2, \ldots, n$$
$$\lambda_0 \sim \mathrm{gamma}(h_0', T_0')$$

where the prior knowledge of the parameters $h_0'$ and $T_0'$ is given. T = 1.
(2) Hypothesis H1: "a single change of the rate" of the typhoon series:

$$h_i \sim \mathrm{Poisson}(h_i \mid \lambda_{11}, T), \quad i = 1, 2, \ldots, \tau - 1$$
$$h_i \sim \mathrm{Poisson}(h_i \mid \lambda_{12}, T), \quad i = \tau, \ldots, n$$

with $\tau \in \{2, 3, \ldots, n\}$, and

$$\lambda_{11} \sim \mathrm{gamma}(h_{11}', T_{11}'), \qquad \lambda_{12} \sim \mathrm{gamma}(h_{12}', T_{12}')$$

where the prior knowledge of the parameters $h_{11}'$, $T_{11}'$, $h_{12}'$, $T_{12}'$ is given. There are two epochs in this model and $\tau$ is defined as the first year of the second epoch, i.e., the change-point.
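As an illustrative aside (the parameter values are hypothetical, not from the slides), a short sketch that simulates an annual count series under H1 with a single change-point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical settings: n years, change-point tau, and two epoch rates
# drawn from their gamma priors (shape = h', rate = T').
n, tau = 40, 25
lam_11 = rng.gamma(shape=3.0, scale=1.0 / 1.5)   # epoch-1 rate, prior gamma(h11'=3, T11'=1.5)
lam_12 = rng.gamma(shape=6.0, scale=1.0 / 1.5)   # epoch-2 rate, prior gamma(h12'=6, T12'=1.5)

# Counts: Poisson(lam_11) for years 1..tau-1, Poisson(lam_12) for years tau..n (T = 1)
h = np.concatenate([rng.poisson(lam_11, tau - 1), rng.poisson(lam_12, n - tau + 1)])
print(h)
```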
Markov Chain Monte Carlo (MCMC) approach
• Standard Monte Carlo methods produce a set of
independent simulated values according to some
probability distribution.
• MCMC methods produce chains in which each of the simulated values is mildly dependent on the preceding value. The basic principle is that once the chain has run sufficiently long, it converges to the desired posterior distribution.
• One of the most widely used MCMC algorithms is the
Gibbs sampler for producing chain values. The idea is
that if it is possible to express each of the coefficients to
be estimated as conditioned on all of the others, then by
cycling through these conditional statements, we can
eventually reach the true joint distribution of interest.
Θ = [θ1, θ2, ..., θp]
• Gibbs sampler: generate a value from the conditional distribution of one component of θ given the current values of all the other components; this involves successively drawing from the conditional posterior densities P(θk | h, θ1, ..., θk-1, θk+1, ..., θp) for k from 1 to p.
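A schematic sketch of this cycling idea; the full-conditional samplers passed in are hypothetical placeholders, not part of the slides:

```python
import numpy as np

def gibbs_sampler(sample_conditionals, theta_init, n_iter=5000):
    """Generic Gibbs skeleton: cycle through full-conditional samplers.

    sample_conditionals is a list of hypothetical functions, one per component of theta;
    each takes the current state vector and returns a new draw for its own component.
    """
    theta = np.array(theta_init, dtype=float)
    chain = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        for k, draw_k in enumerate(sample_conditionals):
            theta[k] = draw_k(theta)      # draw theta_k | h, all other components
        chain[t] = theta
    return chain
```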
Bayesian inference under each hypothesis
(1) Bayesian inference under hypothesis H0

There is only one parameter under this hypothesis. Since the gamma is the conjugate prior for the Poisson, the conditional posterior density for λ0 is

$$\lambda_0 \mid \mathbf{h}, H_0 \sim \mathrm{gamma}\!\left(h_0' + \sum_{i=1}^{n} h_i,\; T_0' + n\right)$$
i 1
(2) Bayesian inference under hypothesis H1

Under this hypothesis, there are 3 parameters: λ11, λ12, and τ.

$$\lambda_{11} \mid \mathbf{h}, \tau, H_1 \sim \mathrm{gamma}\!\left(h_{11}' + \sum_{i=1}^{\tau-1} h_i,\; T_{11}' + \tau - 1\right)$$

$$\lambda_{12} \mid \mathbf{h}, \tau, H_1 \sim \mathrm{gamma}\!\left(h_{12}' + \sum_{i=\tau}^{n} h_i,\; T_{12}' + n - \tau + 1\right)$$

$$P(\tau \mid \mathbf{h}, H_1, \lambda_{11}, \lambda_{12}) \propto e^{-(\tau-1)(\lambda_{11} - \lambda_{12})} \left(\frac{\lambda_{11}}{\lambda_{12}}\right)^{\sum_{i=1}^{\tau-1} h_i}$$
With the prior knowledge, we can apply the Gibbs sampler to draw
samples from the posterior distribution of the model parameters under
each respective hypothesis.
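A minimal sketch of such a Gibbs sampler for H1, implementing the three conditionals above (the prior values passed in and the starting change-point are illustrative assumptions):

```python
import numpy as np

def gibbs_h1(h, h11p, T11p, h12p, T12p, n_iter=10_000, seed=0):
    """Gibbs sampler for the single change-point model H1 (sketch).

    h: array of annual counts; h11p, T11p, h12p, T12p are the gamma prior parameters.
    Returns posterior draws of (lam11, lam12, tau).
    """
    rng = np.random.default_rng(seed)
    n = len(h)
    cum = np.concatenate([[0], np.cumsum(h)])   # cum[k] = sum of the first k counts
    tau = max(n // 2, 2)                        # arbitrary starting change-point
    draws = np.empty((n_iter, 3))
    for t in range(n_iter):
        s1 = cum[tau - 1]                       # counts in epoch 1 (years 1..tau-1)
        s2 = cum[n] - s1                        # counts in epoch 2 (years tau..n)
        lam11 = rng.gamma(h11p + s1, 1.0 / (T11p + tau - 1))
        lam12 = rng.gamma(h12p + s2, 1.0 / (T12p + n - tau + 1))
        # Conditional for tau over 2..n, proportional to
        # exp(-(tau-1)(lam11-lam12)) * (lam11/lam12)**(sum of counts before tau)
        taus = np.arange(2, n + 1)
        logw = (-(taus - 1) * (lam11 - lam12)
                + cum[taus - 1] * (np.log(lam11) - np.log(lam12)))
        w = np.exp(logw - logw.max())
        tau = rng.choice(taus, p=w / w.sum())
        draws[t] = (lam11, lam12, tau)
    return draws
```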
Hypothesis Analysis
$$\boldsymbol{\theta} = [\lambda_0, \lambda_{11}, \lambda_{12}, \lambda_{21}, \lambda_{22}, \lambda_{23}, \tau, \tau_1, \tau_2]'$$

$$P(H \mid \mathbf{h}) = \frac{P(\mathbf{h} \mid H)\, P(H)}{\sum_{H} P(\mathbf{h} \mid H)\, P(H)}$$

• Under a uniform prior assumption for the hypothesis space, P(H | h) ∝ P(h | H).
$$P(\mathbf{h} \mid H) = \int_{\boldsymbol{\theta}} P(\mathbf{h} \mid \boldsymbol{\theta}, H)\, P(\boldsymbol{\theta} \mid H)\, d\boldsymbol{\theta}
\approx \frac{1}{N} \sum_{i=1}^{N} P(\mathbf{h} \mid \boldsymbol{\theta}^{[i]}, H), \qquad \boldsymbol{\theta}^{[i]} \sim P(\boldsymbol{\theta} \mid H), \quad N \to \infty$$
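A hedged sketch of this prior-predictive Monte Carlo estimate for the simplest case, H0 (the prior parameters passed in are illustrative assumptions):

```python
import numpy as np
from scipy.stats import poisson

def log_marginal_h0(h, h0p, T0p, N=100_000, seed=0):
    """Monte Carlo estimate of log P(h | H0): average the likelihood over prior draws of lambda_0."""
    rng = np.random.default_rng(seed)
    lam = rng.gamma(shape=h0p, scale=1.0 / T0p, size=N)               # theta^[i] ~ P(theta | H0)
    loglik = poisson.logpmf(np.asarray(h)[:, None], lam).sum(axis=0)  # log P(h | lambda^[i])
    # log of (1/N) * sum_i exp(loglik_i), computed stably
    return np.logaddexp.reduce(loglik) - np.log(N)
```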
Annual major hurricane count series for the ENP
• P(H2 | h) = 0.784
• P(H1 | h) = 0.195
• P(H0 | h) = 0.021
• τ1 = 1982 and τ2 = 1999, i.e., 3 epochs
Why RJMCMC?
• Because parameter spaces within different
hypotheses are typically different from
each other, a simulation has to be run
independently for each of the candidate
hypotheses.
• If the hypotheses have large dimension,
the MCMC approach is not efficient.
• Green (1995)
Reversible jump sampling for moving
between spaces of differing dimensions
• A trans-dimensional Markov chain
simulation in which the dimension of the
parameter space can change from one
iteration to the next
• Useful for model or hypothesis selection
problems
[Figure: 4 different gamma models (4 epochs with 3 change-points)]
Prior specification

With the time series h = [h1, h2, ..., hn]', we run L independent iterations. Within the j-th iteration, 1 ≤ j ≤ L, we randomly pick two different points from 1 to n, say k0 and k1 (k0 < k1). We then calculate the sample mean of this batch of samples {h_i, k0 ≤ i ≤ k1}, obtaining a realization of the Poisson rate for this iteration,

$$\lambda^{[j]} = \frac{1}{k_1 - k_0 + 1} \sum_{i=k_0}^{k_1} h_i.$$

In the end, we obtain a set of samples {λ^[j], 1 ≤ j ≤ L}, and set

$$T' = m_{\lambda} / s_{\lambda}^{2} \quad \text{and} \quad h' = m_{\lambda} \cdot T'$$

where

$$m_{\lambda} = \frac{1}{L} \sum_{j=1}^{L} \lambda^{[j]} \quad \text{and} \quad s_{\lambda}^{2} = \frac{1}{L-1} \sum_{j=1}^{L} \left(\lambda^{[j]} - m_{\lambda}\right)^{2}.$$
Extreme rainfall events in Hawaii
Summary
• Why Bayesian analysis
• Applications for climate research (extreme events
and climate change)
• Change-point analysis
  - Mathematical model of rare-event count series
  - Hypothesis model
  - Bayesian inference under each hypothesis
  - Major hurricane series in the eastern North Pacific
• Recent advance (RJMCMC)