Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
A hierarchical model for extremes of ozone, NO and NO2 Emma Eastoe Department of Mathematics and Statistics Lancaster University, UK September 2009 Outline 1 Aims Data Questions of interest 2 Background - univariate extremes Stationary With covariates 3 Hierarchical model The model Simulation Inference 4 Results Margins Joint distribution of NO and NO2 5 Comments Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 2 / 21 The data Daily maxima of hourly ozone, NO and NO2 concentrations at a site in central Reading, UK NO and NO2 are primary pollutants e.g. released during combustion of fossil fuels Ozone is a secondary pollutant, i.e. formed in the atmosphere NO and NO2 are precursors for ozone Temperature, duration/intensity of sunlight (photochemical reaction) and wind speed (dispersion) are also important factors in determining concentrations Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 3 / 21 2001 150 NO2 100 √ √ 8 NO2 8 10 1998 1999 2000 Time 2001 O3 10 12 14 2001 O3 10 12 14 1999 2000 Time 0 5 10 15 20 25 √ NO Emma Eastoe (Lancaster University) 6 4 4 4 6 6 √ 1998 8 1999 2000 Time 12 1998 0 0 50 200 50 NO 400 O3 100 150 600 200 800 Data in pictures 0 5 10 15 20 25 √ NO Hierarchical model for extremes 4 6 8 10 √ NO2 12 September 2009 4 / 21 Notation Y : d-dimensional random variable, generally observe a sequence Yi of length n X : associated n × p matrix of covariates (may be fixed or time-varying) Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 5 / 21 General questions for MV extremes The (univariate) marginal probability that one of the components is large The probability that all components of the vector are simultaneously large Conditional on a subset A of the components being large, the probability distribution of the d − |A| remaining components The probability that some function of (a subset of) the components is large. Generally, ‘is large’ means exceeds some high level. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 6 / 21 Stationary univariate extremes Stationary sequence of random variables {Yt } Asymptotically motivated peaks over threshold model (e.g. Davison and Smith, 1990) For high enough level u model the size and rate of (local maxima of) threshold exceedances Size is modelled by a generalised Pareto distribution (GPD), for y > 0, ξy −1/ξ Pr[Yt > u + y |Yt > u] = 1 + , ψu + where ψu > 0 is threshold dependent. Model the rate φu = Pr[Yt > u] as a Bernoulli random variable. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 7 / 21 Univariate extremes with covariates Model (functions of the) parameters as functions of covariates, xt We consider linear functions only log ψu (xt ) = ψ ′ xt , ξ(xt ) = ξ ′ xt and logitφu (xt ) = φ′ xt where ψ, ξ and φ are vectors of regression coefficients Inference straightforward through maximum likelihood, which can be split into two parts - one for (ψ, ξ) and one for φ Other recent work has looked at modelling covariates using additive (Chavez-Demoulin and Davison, 2005) or local likelihood (Davison and Ramesh, 2000 and Hall and Tajvidi, 2000) models. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 8 / 21 Preprocessing First model aspects of the underlying process and use these to preprocess the data, producing an approximately stationary sequence In the absence of a better model, we suggest the following λ(xt ) Yt −1 = µ(xt ) + σ(xt )Zt λ(xt ) where σ(xt ) > 0 and Zt is a assumed to be stationary Model the extremes of Zt using a threshold uz and the GPD(ψu (xt ), ξ(xt )) and rate φu (xt ) model Functions of location, scale and Box-Cox parameters modelled as linear functions of covariates (see Eastoe and Tawn, 2009) Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 9 / 21 Model description Suppose we can order the d-dimensional random variable in some meaningful way Yt = (Y1t , . . . , Ydt ) Then model Yit conditionally on Covariates Xt , and The set Si = {Yjt : j < i}, which is empty for i = 1 In our example, we model NO conditional on covariates, NO2 conditional on NO and covariates and ozone conditional on both NO, NO2 and covariates Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 10 / 21 Model description 2 At each level we model the sequence {Yit } using the pre-processing approach, allowing a model on both tails of each component: 1 Estimate the parameters µi (xt , Sit ), σi (xt , Sit ) and λi (xt , Sit ) 2 Transform Yit to Zit 3 4 Obtain upper and lower thresholds of Zit , denoted uil (lower) and uiu (upper) Estimate the GPD and rate parameters using the data above (below) the thresholds uiu (uil ) : φli (xt , Sit ), φui (xt , Sit ), etc. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 11 / 21 Simulation To estimate extreme probabilities, we use simulation Denote simulated data by (Y∗ , X∗ ) Since daily data, simulation of N-years of data requires L = 365N observations To simulate covariates for day t ∈ {1, . . . , 365} in any given year, since we have no model, resample across the observed years from values observed on day t only. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 12 / 21 Simulation 2 Simulate Yt in the order Y1t , . . . , Ydt . For each i , conditional on ∗ , . . . , Y∗ ∗ (Y1t (i −1)t , Xt ), 1 Simulate a Ut ∼ Uniform(0, 1) 2 If Ut > φui (x∗t , Sit∗ ), then simulate Zit∗ from the upper tail model 3 If Ut < φli (x∗t , Sit∗ ), then simulate Zit∗ from the lower tail model 4 5 If φli (x∗t , Sit∗ ) ≤ Ut ≤ φui (x∗t , Sit∗ ), then resample Zit∗ from {Zit : uil ≤ Zit ≤ uiu } Back transform to the original scale Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 13 / 21 Inference Adopt a Bayesian framework : posterior distributions, better able to quantify uncertainty Requires MCMC : Gibbs updates for location and rate parameters, Metropolis-Hastings random walk for the rest Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 14 / 21 4 2 0 0 0 5 5 Wind 10 15 20 Temp 10 15 20 25 30 Sun 6 8 10 12 14 Covariates Xt 1998 1999 2000 Time 2001 1998 1999 2000 Time 2001 1998 1999 2000 Time 2001 Also include first order interactions and indicators for Weekend Year Season Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 15 / 21 Preprocessing results 50 Z3t -4 -3 -2 -1 0 1 Y3t 100 150 200 2 3 Recalling that Y1t is NO, Y2t is NO2 and Y3t is ozone, and using 90% thresholds: 2 3 1998 1999 2000 2001 Time Z3t -4 -3 -2 -1 0 1 Z3t -4 -3 -2 -1 0 1 2 3 1998 1999 2000 2001 Time -3 -2 -1 0 1 2 3 4 Z1t Emma Eastoe (Lancaster University) -4 Hierarchical model for extremes -2 0 Z2t 2 4 September 2009 16 / 21 95% 99% CR 0 200 400 Empirical 600 800 0 95% 99% CR 0 0 0 50 500 Model 1000 95% 99% CR Model 50 100 150 200 250 300 Model 100 150 200 250 1500 Goodness of fit 50 100 Empirical 150 50 100 150 Empirical 200 QQ plots with 95% (99%) credibility regions. Simulated data sets of the same length as the observed data for each of the samples from the posterior. These were ordered and the median (quantiles) found. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 17 / 21 Prediction Estimated 5-, 10- and 100-year return levels, with 95% credibility regions. NO NO2 O3 5-year 706 (596,910) 156 (146,175) 218 (197,247) 10-year 819 (670,1143) 167 (152,194) 234 (208,271) 100-year 1346 (914,3381) 210 (174,364) 289 (239,387) 5-year levels exceeded only 2 or 3 times for each series Only 10-year level for NO2 exceeded (once) None of 100-year levels exceeded Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 18 / 21 Conditional joint distribution 250 Conditional on ozone achieving it’s 10-year marginal return level. This condition deceases the probability that both NO and NO2 exceed their marginal threshold level, but increases the probability that at least one does. 0 50 100 NO2 150 200 E1 0 Emma Eastoe (Lancaster University) 100 200 300 NO 400 Hierarchical model for extremes 500 September 2009 19 / 21 Comments Presented an approach for the prediction of the probabilities of joint and marginal extreme events for a multivariate, non-stationary data set Proposed method is a hierarchical extension to the pre-processing method previously described in the univariate case Drawback - relies on the assumption that we can extrapolate the response-covariate regression model Method also relies on finding a sensible ‘hierarchy’ Future work - how well does the model capture levels of asymptotic (in)dependence? Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 20 / 21 References Chavez-Demoulin, V. and Davison, A.C. Generalized additive modelling of sample extremes. Appl. Statist. 54, 207-222. Davison, A.C. and Smith, R.L. (1990) Models for exceedances over high Thresholds, with discussion. J. Roy. Statist. Soc. B., 52, 393-442. Davison, A.C. and Ramesh, N.I. (2000) Local likelihood smoothing of extremes. J. R. Statist. Soc. B, 62, 191-208. Eastoe, E.F. (2008) A hierarchical model for non-stationary multivariate extremes: a case study of surface-level ozone and NOX data in the UK. Environmetrics, 20:428-444. Eastoe, E.F. and Tawn, J.A. (2009) Modelling non-stationary extremes with application to surface-level ozone. Appl. Statist. 58, 25-45. Hall, P. and Tajvidi, N. (2000) Nonparametric analysis of temporal trend when fitting parametric models to extreme-value data. Statist. Sci., 2, 153-167. Emma Eastoe (Lancaster University) Hierarchical model for extremes September 2009 21 / 21