Statistical Methods for Alerting Algorithms in Biosurveillance

Howard S. Burkom
The Johns Hopkins University Applied Physics Laboratory
National Security Technology Department

Washington Statistical Society Seminar, February 3, 2006
National Center for Health Statistics, Hyattsville, MD

ESSENCE Biosurveillance Systems
• ESSENCE: An Electronic Surveillance System for the Early Notification of Community-based Epidemics
• Monitoring health care data from ~800 military treatment facilities since Sept. 2001
• Evaluating data sources
  – Civilian physician visits
  – OTC pharmacy sales
  – Prescription sales
  – Nurse hotline/EMS data
  – Absentee rate data
• Developing & implementing alerting algorithms

Outline of Talk
• Prospective Syndromic Surveillance: introduction, challenges
• Algorithm Evaluation Approaches
• Statistical Quality Control in Health Surveillance
• Data Modeling and Process Control
• Regression Modeling Approach
• Generalized Exponential Smoothing
• Comparison Study
• Summary & Research Directions

Required Disciplines: Medical/Epidemiological
• filtering/classifying clinical records => syndromes
• interpretation/response to system output
• coding/chief complaint interpretation

Required Disciplines: Informatics (Information Technology)
• surveillance system architecture
• data ingestion/cleaning
• interface between health monitors and system

Required Disciplines: Analytics
• Statistical hypothesis tests
• Data mining/automated learning
• Adaptation of methodology to background data behavior

Essential Task Interaction in Volatile Data Background
• Medical/Epidemiological
• Information Technology
• Analytical

The Multivariate Temporal Surveillance Problem
Varying Nature of the Data:
• Scale, trend, day-of-week, seasonal behavior depending on grouping
Multivariate Nature of Problem:
• Many locations
• Multiple syndromes
• Stratification by age, gender, other covariates
Surveillance Challenges:
• Defining anomalous behavior(s)
  – Hypothesis tests: both appropriate and timely
• Avoiding excessive alerting due to multiple testing
  – Correlation among data streams
  – Varying noise backgrounds
• Communication with/among users at different levels
• Data reduction and visualization

Data issues affecting monitoring
  – Statistical properties
    • Scale and random dispersion
  – Periodic effects
    • Day-of-week effects, seasonality
  (Callout: the two items above are most suitable for modeling without data-specific information)
  – Delayed (often variably) availability in monitoring system
  – Trends: long/short term: many causes, incl. changes in:
    • Population distribution or demographic composition
    • Data provider participation
    • Consumer health care behavior
    • Coding or billing practices
  – Prolonged data drop-outs, sometimes with catch-ups
  – Outliers unrelated to infectious disease levels
    • Often due to problems in data chain
    • Inclement weather
    • Media reports (example: the “Clinton effect”)

Forming the Outcome Variable: Binning by Diagnosis Code
Rash Syndrome Grouping of Diagnosis Codes
www.bt.cdc.gov/surveillance/syndromedef/word/syndromedefinitions.doc

Rash ICD-9-CM Code List

ICD9CM   ICD9DESCR                                 Consensus
050.0    SMALL POX, VARIOLA MAJOR                  1
050.1    SMALL POX, ALASTRIM                       1
050.2    SMALL POX, MODIFIED                       1
050.9    SMALLPOX NOS                              1
051.0    COWPOX                                    1
051.1    PSEUDOCOWPOX                              1
052.7    VARICELLA COMPLICAT NEC                   1
052.8    VARICELLA W/UNSPECIFIED C                 1
052.9    VARICELLA NOS                             1
057.8    EXANTHEMATA VIRAL OTHER S                 1
057.9    EXANTHEM VIRAL, UNSPECIFI                 1
695.0    ERYTHEMA TOXIC                            1
695.1    ERYTHEMA MULTIFORME                       1
695.2    ERYTHEMA NODOSUM                          1
695.89   ERYTHEMATOUS CONDITIONS O                 1
695.9    ERYTHEMATOUS CONDITION N                  1
692.9    DERMATITIS UNSPECIFIED CA                 2
782.1    RASH/OTHER NONSPEC SKIN E                 2
026.0    SPIRILLARY FEVER                          3
026.1    STREPTOBACILLARY FEVER                    3
026.9    RAT-BITE FEVER UNSPECIFIED                3
051.2    DERMATITIS PUSTULAR, CONT                 3
051.9    PARAVACCINIA NOS                          3
053.20   HERPES ZOSTER DERMATITIS E                3
053.79   HERPES ZOSTER WITH OTHER SPECIF COMPLIC   3
053.8    H.Z. W/ UNSPEC. COMPLICATION              3
053.9    HERPES ZOSTER NOS W/O COM                 3
054.0    ECZEMA HERPETICUM                         3
054.79   HERPES SIMPLEX W/OTH.SPEC                 3
054.8    HERPES SIMPLEX, W/UNS.COM                 3

Chief Complaint Query
[Figure: chief complaint query]

Example with Detection Statistic Plot
[Figure: simulated data with dynamic detection statistic; labels: injected cases presumed attributable to outbreak event, threshold]

Comparing Alerting Algorithms
Criteria:
• Sensitivity
  – Probability of detecting an outbreak signal
  – Depends on effect of outbreak in data
• Specificity (1 – false alert rate)
  – Probability(no alert | no outbreak)
  – May be difficult to prove no outbreak exists
• Timeliness
  – Once the effects of an outbreak appear in the data, how soon is an alert expected?

Modeling the Signal as Epicurve of Primary Cases
• Need “data epicurve”: time series of attributable counts above background
• Plausible to assume proportional to epidemic curve of infected
• Sartwell lognormal model gives idealized shape for a given disease type
[Figure: Observed vs Modeled Incubation Period Distribution, Sverdlovsk 1979 Outbreak; x-axis: Days after Exposure; y-axis: Number of Cases; series: observed, modeled]
Sartwell, PE. The distribution of incubation periods of infectious disease. Am J Hyg 1950; 51:310-318

Signal Modeling: Realizations of Smallpox Epicurve
[Figure: “maximum likelihood” epicurve and random realizations; each symptomatic case a random draw]

Assessing Algorithm Performance
Summary processing: measure dependence of sensitivity or timeliness on false alert rate (ROC or AMOC curves or key sample values at practical rates)
• Sensitivity/Specificity as a function of threshold: Receiver Operating Characteristic (ROC)
• Timeliness/Specificity as a function of threshold: Activity Monitor Operating Characteristic (AMOC)
[Figures: ROC and AMOC curves; x-axis in both: False Alert Rate (1 – specificity)]

Detection Performance Comparison
[Figure: Fever_Labbaji, lognormal signal; Detection Probability vs Background Recurrence (days); series: EWMA, EARS C2, EARS C3 (CUSUM)]

Quality Control Charts and Health Surveillance
Benneyan JC, Statistical Quality Control Methods in Infection Control and Hospital Epidemiology, Infection Control and Hospital Epidemiology, Vol. 19, (3)194-214
  Part I: Introduction and Basic Theory
  Part II: Chart use, statistical properties, and research issues
• 1998 survey article gives 135 references
• Many applications: monitoring surgical wound infections, treatment effectiveness, general nosocomial infection rate, …
Monitoring process for “special causes” of variation
• Organize data into fixed-size groups of observations
• Look for out-of-control conditions by monitoring mean, standard deviation, …
• General 2-phase procedure:
  Phase I: Determine mean m, standard deviation s of process from historical “in-control” data; control limits often set to m ± 3s
  Phase II: Apply control limits prospectively to monitor process graphically

Adaptation of Traditional Process Control to Early Outbreak Detection
On adapting statistical quality control to biosurveillance:
Woodall, W.H. (2000). “Controversies and Contradictions in Statistical Process Control”, Journal of Quality Technology 32, pp. 341-378.
• “Researchers rarely…put their narrow contributions into the context of an overall SPC strategy. There is a role for theory, but theory is not the primary ingredient in most successful applications.”

Woodall, W.H. (2006, in press). “The Use of Control Charts in Health Care Monitoring and Public Health Surveillance”
• “In industrial quality control it has been beneficial to carefully distinguish between the Phase I analysis of historical data and the Phase II monitoring stage”
• “It is recommended that a clearer distinction be made in health-related SPC between Phase I and Phase II…”

Does infectious disease surveillance require an “ongoing Phase I” strategy to maintain robust performance?

Statistical Process Control in Advanced Disease Surveillance
Key application issues:
• Background data characteristics change over time
  – Hospital/clinic visits, consumer purchases not governed by physical science, engineering
  – But monitoring requires robust performance: algorithms must be adaptive
• Target signal: effect of infectious disease outbreak
  – Transient signal, not a mean shift
  – May be sudden or gradual

The Challenge of Data Modeling for Daily Health Surveillance
• Conventional scientific application of regression
  – Do covariates such as age, gender affect treatment? Does treatment success differ among sites if we control for covariates?
  – Studies use static data sets with exploratory analysis
• In surveillance, we model to predict data levels in the absence of the signal of interest
  – Need reliable estimates of expected levels to recognize abnormal levels
  – Data sets are dynamic: covariate relationships change

The Challenge of Data Modeling for Daily Health Surveillance, cont’d
Modeling to generate expected data levels
  – Predictive accuracy matters, not just strength of association or overall goodness-of-fit
  – For a gradual outbreak, recent data can “train” model to predict abnormal levels
Alerting decisions based on model residuals
  Residual = observed value – modeled value
Conventional approach:
  – assume residuals fit a known distribution (normal, Poisson, …)
  – hypothesis test for membership in that distribution
For surveillance, can also apply control-chart methods to residuals

Monitoring Data Series with Systematic Features
• Problem: How to account for short-term trends, cyclic data features in alerting decisions?
• Approaches
  – Data Modeling
    • Regression: GLM, ARIMA, others & combinations
  – Signal Processing
    • LMS filters and wavelets
  – Exponential Smoothing: generalizes EWMA

Example: OTC Purchasing Behavior Influenced by Many Factors

Loglinear Regression Example: Tracking Daily Sales of Flu Remedies
Log(Y) = b0 + b1-6 d + b7 t + b8-9 h + b10 w + b11 p + e
where
  Y       = daily count of anti-flu sales
  b1-6 d  = day of week (6 indicators)
  b7 t    = linear trend
  b8-9 h  = harmonic (seasonal)
  b10 w   = weather (temp.)
  b11 p   = sales promotion (indicator)
  e       = deviation (Poisson dist.)
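
The loglinear model above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the talk's implementation: the data are simulated, the weather and promotion terms are omitted, and the fit is ordinary least squares on log(count + 1) rather than a Poisson regression. The names `design_row`, `fit_loglinear`, `predict`, and `true_mean` are hypothetical.

```python
import math
import random

K = 2 * math.pi / 365.25  # annual harmonic frequency


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def design_row(day):
    """Intercept, 6 day-of-week indicators, linear trend (in years), annual harmonic pair."""
    dow = day % 7
    indicators = [1.0 if dow == j else 0.0 for j in range(1, 7)]
    return [1.0] + indicators + [day / 365.25, math.cos(K * day), math.sin(K * day)]


def fit_loglinear(counts):
    """Least-squares fit of log(count + 1) via the normal equations (assumed estimator)."""
    rows = [design_row(d) for d in range(len(counts))]
    logs = [math.log(c + 1.0) for c in counts]  # +1 guards against log(0)
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, logs)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, day):
    """Expected count on a given day under the fitted model."""
    return math.exp(sum(c * x for c, x in zip(coef, design_row(day)))) - 1.0


def true_mean(day):
    """Hypothetical background: weekend dip, slow growth, winter peak."""
    weekend = -0.4 if day % 7 in (5, 6) else 0.0
    return math.exp(4.0 + weekend + 0.1 * day / 365.25 + 0.3 * math.cos(K * day))


rng = random.Random(0)
counts = [round(true_mean(d) * math.exp(rng.gauss(0.0, 0.05))) for d in range(730)]
coef = fit_loglinear(counts)
residual = counts[729] - predict(coef, 729)  # observed value minus modeled value
```

The residual on the last line is the quantity an alerting rule would test. A production model would more plausibly use a Poisson GLM and real covariates.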
Recent Surveillance Method Based on Loglinear Regression
Modeling emergency department visit patterns for infectious disease complaints: results and application to disease surveillance
Judith C Brillman, Tom Burr, David Forslund, Edward Joyce, Rick Picard and Edith Umland
BMC Medical Informatics and Decision Making 2005, 5:4, pp 1-14
http://www.biomedcentral.com/content/pdf/1472-6947-5-4.pdf

Modeling visit counts on day d:
Let S(d) = log( visits(day d) + 1 ), the “started log”
S(d) = [Σi ci × Ii(d)] + [c8 + c9 × d] + [c10 × cos(kd) + c11 × sin(kd)], k = 2π / 365.25
  c1-c7: day-of-week effects
  c9: long-term trend
  c10-c11: seasonal harmonic terms
Training period: 3036 days ~ 8.33 years
Test period: 1 year
[Figure: Brillman et al., Figure 1]

EWMA Monitoring
• Exponential Weighted Moving Average
• Average with most weight on recent Xk:
  Sk = w Xk + (1-w) Sk-1, where 0 < w < 1
• Test statistic: Sk compared to expectation from sliding baseline
  Basic idea: monitor (Sk – mk) / sk

Exponential Weighted Moving Average
[Figure: Daily Count and Smoothed series, 02/25/94 to 04/01/94]
• Added sensitivity for gradual events
• Larger w means less smoothing

EWMA Concept & Smoothing Constant
Brown, R.G. and Meyer, R.F. (1961), "The Fundamental Theorem of Exponential Smoothing," Operations Research, 9, 673-685.
• Exponential smoothing represents “an elementary model of how a person learns”:
  \[ \hat{x}_k = \hat{x}_{k-1} + w\,(x_k - \hat{x}_{k-1}), \quad 0 < w < 1 \]
• For the smoothed value \( S_k = w X_k + (1-w) S_{k-1} \), the variance of \( S_k \) is
  \[ \sigma_S^2 = \frac{w}{2-w}\,\sigma_X^2 \]
• So a smaller w is preferred because it gives a more stable Sk; values between 0.1 and 0.3 often used
• But Chatfield: changes in global behavior will result in a larger optimal w

Generalized Exponential Smoothing
Holt-Winters Method: modeling level, trend, and seasonality
http://www.statistics.gov.uk/iosmethodology/downloads/Annex_B_The_Holt-Winters_forecasting_method.pdf
Forecast Function:
  \[ \hat{y}_{n+k|n} = (m_n + k\,b_n)\,c_{n-s+k} \]
where:
  m_j = level at time j, b_j = trend at time j, c_j = periodic multiplier at time j
  s = periodic interval
  k = number of steps ahead
and m_j, b_j, c_j are updated by exponential smoothing

Holt-Winters Updating Equations
Updating Equations, multiplicative method:
  Level at time t:
  \[ m_t = \alpha\,\frac{y_t}{c_{t-s}} + (1-\alpha)\,(m_{t-1} + b_{t-1}), \quad 0 < \alpha < 1 \]
  Slope at time t:
  \[ b_t = \beta\,(m_t - m_{t-1}) + (1-\beta)\,b_{t-1}, \quad 0 < \beta < 1 \]
  Periodic multiplier at time t:
  \[ c_t = \gamma\,\frac{y_t}{m_t} + (1-\gamma)\,c_{t-s}, \quad 0 < \gamma < 1 \]
Initial values m_0, b_0, c_0, …, c_{s-1} should be calculated from available data

Forecasting Local Linearity: Automatic vs Nonautomatic Methods
Chatfield, C. (1978), "The Holt-Winters Forecasting Procedure," Applied Statistics, 27, 264-279.
Chatfield, C. and Yar, M. (1988), "Holt-Winters Forecasting: Some Practical Issues," The Statistician, 37, 129-140.
• “Modern thinking favors local linearity rather than global linear regression in time…”
• “Local linearity is also implicit in ARIMA modelling…”
  – Simple EWMA ~ ARIMA(0,1,1)
  – EWMA + trend ~ ARIMA(0,2,2)
  – Multiplicative Holt-Winters has no ARIMA equivalent
• “Practical considerations rule out [Box-Jenkins] if there are insufficient observations or …expertise available”
  – “Box-Jenkins… requires the user to identify an appropriate… [ARIMA] model”
For a “fair” comparison of H-W to B-J, both should be automatic or both nonautomatic.
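
As a rough illustration of the multiplicative Holt-Winters updating equations given earlier, the following sketch produces one-step-ahead forecasts. It assumes a simple initialization from the first period only (level = first-period mean, trend = 0, multipliers = first-period values divided by that mean); this is one possible choice, not the one used in the talk. The function name and the sample series are hypothetical.

```python
def holt_winters_multiplicative(y, s, alpha, beta, gamma):
    """One-step-ahead forecasts for y[s:], multiplicative Holt-Winters.

    y must be positive; s is the periodic interval (7 for day-of-week effects).
    Initialization from the first period is an assumed, simple choice.
    """
    if len(y) < s:
        raise ValueError("need at least one full period of data")
    level = sum(y[:s]) / s                   # m: initial level
    trend = 0.0                              # b: initial slope
    season = [v / level for v in y[:s]]      # c_0 .. c_{s-1}
    forecasts = []
    for t in range(s, len(y)):
        c_old = season[t - s]
        # Forecast of y[t] made at time t-1: (m + 1*b) * c_{t-s}
        forecasts.append((level + trend) * c_old)
        # Level update: deseasonalized observation blended with projected level
        new_level = alpha * y[t] / c_old + (1 - alpha) * (level + trend)
        # Slope update: latest level change blended with previous slope
        trend = beta * (new_level - level) + (1 - beta) * trend
        # Periodic multiplier update for this position in the cycle
        season.append(gamma * y[t] / new_level + (1 - gamma) * c_old)
        level = new_level
    return forecasts


# Hypothetical noise-free weekly pattern repeated for 8 weeks
series = [120, 110, 100, 100, 90, 50, 30] * 8
forecasts = holt_winters_multiplicative(series, s=7, alpha=0.4, beta=0.1, gamma=0.2)
residuals = [obs - f for obs, f in zip(series[7:], forecasts)]
```

On a purely periodic series like this one the forecasts reproduce the data, so the residuals are essentially zero; real data would leave residuals that can then be monitored.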
Assertion: The simplicity of H-W permits easier classification, requiring less historic data.
Can an automatic B-J give robust forecasting over a range of input series types?

Regression vs Holt-Winters
Ongoing study with Galit Shmueli, U. of MD, and Sean Murphy, JHU/APL
• 30 time series, 700 days’ data
  – 5 cities
  – 3 data types
  – 2 syndromes
    • Respiratory: seasonal & day-of-week behavior
    • Gastrointestinal: day-of-week effects
[Figure: Results for Data Set 1, with DOW and seasonal variation; HW RMSE = 57.401, regression RMSE = 61.1454. Top panel: Counts vs Days for raw data, Holt-Winters, and regression. Bottom panel: Residuals vs Days for Holt-Winters and regression.]

Temporal Aggregation for Adaptive Alerting
Data stream(s) to monitor in time, divided into three intervals:
Baseline interval: used to get some estimate of normal data behavior
• Mean, variance
• Regression coefficients
• Expected covariate distrib.
  -- spatial
  -- age category
  -- % of claims/syndrome
Guardband: avoids contamination of baseline with outbreak signal
Test interval:
• Counts to be tested for anomaly
• Nominally 1 day
• Longer to reduce noise, test for epicurve shape
• Will shorten as data acquisition improves

Candidate Methods
1. Global loglinear regression of Brillman et al.
2. Holt-Winters exponential smoothing
   Fixed sets of smoothing parameters for data:
   – with both day-of-week & seasonal behavior
   – with only day-of-week behavior
3. Adaptive Regression
   Log(Y) = b0 + b1-6 d + b7 t + b8 hol + b9 posthol + e
   56-day baseline, 2-day guardband
   b1-6 = day-of-week indicator coefficients
   b7 = centered ramp coefficient
   b8 = coefficient for holiday indicator
   b9 = coefficient for post-holiday indicator
1-day-ahead and 7-day-ahead predictions

Respiratory Visit Count Data
[Figure: series shown: Data, Holt-Winters, Regression, Adaptive Regr.]
All series display this autocorrelation; good test for published regression model

GI Visit Count Data
[Figure: series shown: Data, Holt-Winters, Regression, Adaptive Regr.]
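
The baseline/guardband/test layout and the adaptive regression candidate can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the holiday and post-holiday terms are omitted, the fit is least squares on log(count + 1), and the data, `DOW_MULT`, and function names are hypothetical. Only the 56-day baseline and 2-day guardband come from the slide.

```python
import math

BASELINE = 56   # days used to estimate normal behavior
GUARDBAND = 2   # days skipped between baseline and test day


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def design_row(day, center):
    """Intercept, 6 day-of-week indicators, centered ramp (holiday terms omitted)."""
    dow = day % 7
    return [1.0] + [1.0 if dow == j else 0.0 for j in range(1, 7)] + [day - center]


def adaptive_forecast(counts, test_day):
    """Predict the count on test_day from the baseline that ends before the guardband."""
    end = test_day - GUARDBAND      # first day excluded from the baseline
    start = end - BASELINE
    if start < 0:
        raise ValueError("not enough history for a full baseline")
    center = (start + end - 1) / 2.0
    rows = [design_row(d, center) for d in range(start, end)]
    logs = [math.log(counts[d] + 1.0) for d in range(start, end)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(rows, logs)) for i in range(p)]
    coef = solve(xtx, xty)
    row = design_row(test_day, center)
    return math.exp(sum(c * x for c, x in zip(coef, row))) - 1.0


# Hypothetical background: weekly pattern with slow growth, plus a spike on day 100
DOW_MULT = [1.0, 1.05, 1.0, 0.95, 0.9, 0.6, 0.55]
counts = [round(100 * DOW_MULT[d % 7] * math.exp(0.002 * d)) for d in range(120)]
spiked = counts[:]
spiked[100] += 60
predicted = adaptive_forecast(spiked, 100)
residual = spiked[100] - predicted
```

Because the baseline window moves with the test day, the coefficients are re-estimated daily, which is what makes the regression adaptive. The guardband keeps the two days before the test day out of the fit.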
Stratified Residual Comparisons
[Figure: series shown: Data, Holt-Winters, Regression, Adaptive Regr.]

Mean Residual Comparison
• When mean residuals favor regression, difference is small, and this difference results from largest residuals
• If the holiday terms in adaptive regression are removed, H-W means uniformly smaller

Median Residual Comparison
[Figure: median residual comparison]

Residual Autocorrelation Comparison
[Figure: series shown: Data, Holt-Winters, Regression, Adaptive Regr.]

Residual Autocorrelation Comparison: 1-Day Ahead Predictions
[Figure: residual autocorrelation, 1-day-ahead predictions]

Residual Autocorrelation Comparison: 7-Day Ahead Predictions
[Figure: residual autocorrelation, 7-day-ahead predictions]

Summary
• Data-adaptive methods are required for robust prospective surveillance
• Appropriate algorithm selection requires an automated data classification methodology, often with little data history
• Statistical expertise is required to manage practical issues to maintain required detection performance as datasets evolve:
  – stationarity (causes rooted in population behavior, evolving informatics, others)
  – late reporting
  – data dropouts

Research Directions
• Classification of time series for automatic forecasting
  – Easier for Holt-Winters than for Box-Jenkins?
  – Determining reliable discriminants:
    • Autocorrelation coefficients
    • Simple means/medians
    • Goodness-of-fit measures
  – How little startup data history required?
• Most effective alerting algorithm using residuals, given signal of interest
  – Apply control chart to residuals?
  – Need to detect both sudden, gradual signals
  – Detection performance constraints:
    • Minimum detection sensitivity
    • Maximum background alert rate
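
The last research question, applying a control chart to forecast residuals, can be illustrated with a small sketch. It assumes an EWMA of the residuals, standardized against the mean and standard deviation of a sliding baseline separated from the current day by a guardband, with the EWMA standard deviation taken as sqrt(w / (2 - w)) times the baseline standard deviation. The window lengths, smoothing constant, threshold, function name, and residual series are all hypothetical choices, not values from the talk.

```python
import math


def ewma_residual_alerts(residuals, w=0.4, baseline=28, guard=2, threshold=3.0):
    """Return the day indices where the standardized EWMA of residuals exceeds threshold."""
    alerts = []
    smoothed = 0.0
    factor = math.sqrt(w / (2.0 - w))   # asymptotic sd of EWMA relative to its input
    for k, r in enumerate(residuals):
        smoothed = w * r + (1 - w) * smoothed
        start = k - guard - baseline
        if start < 0:
            continue                     # not enough history for a full baseline yet
        base = residuals[start:k - guard]
        mean = sum(base) / len(base)
        sd = math.sqrt(sum((x - mean) ** 2 for x in base) / (len(base) - 1))
        if sd > 0 and (smoothed - mean) / (factor * sd) > threshold:
            alerts.append(k)
    return alerts


# Hypothetical residual series: small bounded fluctuation, then a 5-day outbreak bump
clean = [1.0, -1.0, 0.5, -0.5] * 20
outbreak = clean[:]
for day, extra in zip(range(60, 65), [3.0, 5.0, 7.0, 5.0, 3.0]):
    outbreak[day] += extra

alerts = ewma_residual_alerts(outbreak)
```

Only upward excursions are flagged here, since the signal of interest is an excess of cases. A smaller `w` would trade sensitivity to sudden spikes for sensitivity to gradual rises.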