Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Mathematical optimization wikipedia , lookup
Computer simulation wikipedia , lookup
Numerical weather prediction wikipedia , lookup
General circulation model wikipedia , lookup
History of numerical weather prediction wikipedia , lookup
Multiple-criteria decision analysis wikipedia , lookup
A survey of goodness-of-fit tests for point process models for earthquake occurrences Rick Paik Schoenberg, UCLA Statistics • Some point process models in seismology • Pixel-based methods • Numerical summaries • Error diagrams • Residuals: rescaling, thinning, superposition • Comparative methods, tessellation residuals • Example 1. Some point process models in seismology. Point process: random (s-finite) collection of points in some space, S. N(A) = # of points in the set A. S = [0 , T] x X. Simple: No two points at the same time (with probability one). Conditional intensity: l(t,x) = limDt,Dx -> 0 E{N([t, t+Dt) x Bx,Dx) | Ht} / [DtDx]. Ht = history of N for all times before t, Bx,Dx = ball around x of size Dx. * A simple point process is uniquely characterized by l(t,x). (Fishman & Snyder 1976) Poisson process: l(t,x) doesn’t depend on Ht. N(A1), N(A2), … , N(Ak) are independent for disjoint Ai, and each Poisson. Some cluster models of clustering: a) Neyman-Scott process: clusters of points whose centers are formed from a stationary Poisson process. Typically each cluster consists of a fixed integer k of points which are placed uniformly and independently within a ball of radius r around each cluster’s center. b) Cox-Matern process: cluster sizes are random: independent and identically distributed Poisson random variables. c) Thomas process: cluster sizes are Poisson, and the points in each cluster are distributed independently and isotropically according to a Gaussian distribution. d) Hawkes (self-exciting) process: parents are formed from a stationary Poisson process, and each produces a cluster of offspring points, and each of them produces a cluster of further offspring points, etc. l(t, x) = m + ∑ g(t-ti, ||x-xi||). ti < t Aftershock activity typically follows the modified Omori law (Utsu 1971): g(t) = K/(t+c)p. Commonly used in seismology: • Stationary (homogeneous) Poisson process: l(t,x) = m. • Inhomogeneous Poisson process: l(t, x) = f(t, x). (deterministic) • ETAS (Epidemic-Type Aftershock Sequence, Ogata 1988, 1998): l(t, x) = m(x) + ∑ g(t - ti, ||x - xi||, mi), ti < t where g(t, x, m) = K exp{am} (t+c)p (x2 + d)q 2. Pixel-based methods. Compare N(Ai) with ∫A l(t, x) dt dx, on pixels Ai. (Baddeley, Turner, Møller, Hazelton, 2005) Problems: * If pixels are large, lose power. * If pixels are small, residuals are mostly ~ 0,1. * Smoothing reveals only gross features. (Baddeley, Turner, Møller, Hazelton, 2005) 3. Numerical summaries. a) Likelihood statistics (LR, AIC, BIC). Log-likelihood = ∑ logl(ti,xi) - ∫ l(t,x) dt dx. b) Second-order statistics. * K-function, L-function (Ripley, 1977) * Weighted K-function (Baddeley, Møller and Waagepetersen 2002, Veen and Schoenberg 2005) * Other weighted 2nd-order statistics: R/S statistic, correlation integral, fractal dimension (Adelfio and Schoenberg, 2009) QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture. QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture. QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture. Weighted K-function Usual K-function: K(h) ~ ∑∑i≠j I(|xi - xj| ≤ h), Weight each pair of points according to the estimated intensity at the points: Kw(h) ~ ∑∑i≠j wi wj I(|xi - xj| ≤ h), ^ where wi = l(ti , xi)-1. (asympt. normal, under certain regularity conditions.) Lw(h) = centered version = √[Kw(h)/π] - h, for R2 Model: l(x,y;a) = a m(x,y) + (1- a)n. h (km) 3. Numerical summaries. a) Likelihood statistics (LR, AIC, BIC). Log-likelihood = ∑ logl(ti,xi) - ∫ l(t,x) dt dx. b) Second-order statistics. * K-function, L-function (Ripley, 1977) * Weighted K-function (Baddeley, Møller and Waagepetersen 2002, Veen and Schoenberg 2005) * Other weighted 2nd-order statistics: R/S statistic, correlation integral, fractal dimension (Adelfio and Schoenberg, 2009) c) Other test statistics (mostly vs. stationary Poisson). TTT, Khamaladze (Andersen et al. 1993) Cramèr-von Mises, K-S test (Heinrich 1991) Higher moment and spectral tests (Davies 1977) Problems: -- Overly simplistic. -- Stationary Poisson not a good null hypothesis (Stark 1997) 4. Error Diagrams Plot (normalized) number of alarms vs. (normalized) number of false negatives (failures to predict). (Molchan 1990; Molchan 1997; Zaliapin & Molchan 2004; Kagan 2009). Similar to ROC curves (Swets 1973). Problems: -- Must focus near axes. [consider relative to given model (Kagan 2009)] -- Does not suggest where model fits poorly. 5. Residuals: rescaling, thinning, superposing Rescaling. (Meyer 1971; Merzbach & Nualart 1986; Nair 1990; Schoenberg 1999; Vere-Jones and Schoenberg 2004): Suppose N is simple. Rescale one coordinate: move each point {ti, xi} to {ti , ∫oxi l(ti,x) dx} [or to {∫oti l(t,xi) dt), xi }]. Then the resulting process is stationary Poisson. Problems: * Irregular boundary, plotting. * Points in transformed space hard to interpret. * For highly clustered processes: boundary effects, loss of power. Thinning. (Westcott 1976): Suppose N is simple, stationary, & ergodic. Thinning: Suppose inf l(ti ,xi) = b. Keep each point (ti ,xi) with probability b / l(ti ,xi) . Can repeat many times --> many stationary Poisson processes (but not quite ind.!) Superposition. (Palm 1943): Suppose N is simple & stationary. Then Mk --> stationary Poisson. Superposition: Suppose sup l(t , x) = c. Superpose N with a simulated Poisson process of rate c - l(t , x) . As with thinning, can repeat many times to generate many (non-independent) stationary Poisson processes. Problems with thinning and superposition: Thinning: Low power. If b = inf l(ti ,xi) is small, will end up with very few points. Superposition: Low power if c = sup l(ti ,xi) is large: most of the residual points will be simulated. 6. Comparative methods, tessellation. -- Can consider difference (between competing models) between residuals over each pixel. Problem: Hard to interpret. If difference = 3, is this because model A overestimated by 3? Or because model B underestimated by 3? Or because model A overestimated by 1 and model B underestimated by 2? Also, when aggregating over pixels, it is possible that a model will predict the correct number of earthquakes, but at the wrong locations and times. -- Better: consider difference between log-likelihoods, in each pixel. (Wong & Schoenberg 2009). Problem: pixel choice is arbitrary, and unequal # of pts per pixel….. Problem: pixel choice is arbitrary, and unequal # of pts per pixel….. -- Alternative: use the Voronoi tessellation of the points as cells. Cell i = {All locations closer to point (xi,yi) than to any other point (xj,yj) }. Now 1 point per cell. If l is locally constant, then cell area ~ Gamma (Hinde and Miles 1980) 8. Example: using focal mechanisms in ETAS In ETAS (Ogata 1998), l(t,x,m) = f(m)[m(x) + ∑i g(t-ti, x-xi, mi)], where f(m) is exponential, m(x) is estimated by kernel smoothing, and i.e. the spatial triggering component, in polar coordinates, has the form: g(r, q) = (r2 + d)q . Looking at inter-event distances in Southern California, as a function of the direction qi of the principal axis of the prior event, suggests: g(r, q; qi) = g1(r) g2(q - qi | r), where g1 is the tapered Pareto distribution, and g2 is the wrapped exponential. ETAS: no use of focal mechanisms. Summary of principal direction of motion in an earthquake, as well as resulting stress changes and tension/pressure axes. tapered Pareto / wrapped exp. biv. normal (Ogata 1998) Cauchy/ ellipsoidal (Kagan 1996) Thinned residuals: Data Cauchy/ ellipsoidal (Kagan 1996) tapered Pareto / wrapped exp. biv. normal (Ogata 1998) Tapered pareto / wrapped exp. Cauchy / ellipsoidal