Epi-on-the-Island
Time Series Regression (TSR)
6-10 July 2015
Thursday 2: TSRs for
infectious diseases
Ben Armstrong, LSHTM, based strongly on:
Chisato Imai* et al Env Res In press 2015
* Nagasaki University
Standard TSR
(sorry - terminology change!)
The most common time series regression (TSR) model is a Poisson model:
$$Y_t \sim \mathrm{Poisson}(\mu_t)$$
$$\log \mu_t = \zeta_0 + \zeta x_t + \sum_p \eta_p z_{p,t} + f(t)$$
where $f(t)$: smooth function of time $t$ to control for seasonality
and long term trends.
$x_t$: time-varying variable of interest (e.g. temperature).
$\zeta_0, \zeta, \eta_p$: regression coefficients.
$z_{p,t}$: other risk factors.
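To make the model above concrete, here is a minimal sketch of fitting it in plain Python. Everything in it is an assumption for illustration: the weekly temperature series is synthetic, f(t) is represented by a single annual sine/cosine pair rather than a spline, the helper names (`solve`, `fit_poisson`) are hypothetical, and the "counts" are noise-free expected values so that the fit can be checked against the known coefficients. In practice one would use a GLM routine from a statistics package.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_poisson(X, y, n_iter=50):
    """Fit log(mu_t) = X_t . beta by Newton-Raphson on the Poisson likelihood.

    Assumes column 0 of X is the intercept.
    """
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(n_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        # Score vector X'(y - mu) and information matrix X' diag(mu) X
        score = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                 for j in range(p)]
        info = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta

# Synthetic weekly data (illustrative only, not the Tokyo influenza data)
weeks = range(156)
temp = [15 + 8 * math.cos(2 * math.pi * t / 52 + 0.5) + 3 * math.sin(0.9 * t)
        for t in weeks]
# Columns: intercept (zeta_0), x_t (temperature), annual harmonic standing in for f(t)
X = [[1.0, temp[t], math.sin(2 * math.pi * t / 52), math.cos(2 * math.pi * t / 52)]
     for t in weeks]
true_beta = [3.0, -0.05, 0.5, 0.8]
# Noise-free expected counts, so the fit should recover true_beta
y = [math.exp(sum(b * x for b, x in zip(true_beta, row))) for row in X]

beta_hat = fit_poisson(X, y)
print("temperature coefficient:", beta_hat[1])
```

With real counts the estimates would of course differ from the generating values, and standard errors would need scaling for overdispersion.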
Issues in using TSR for
infectious disease
1. Strong autocorrelation
"True contagion" due to transmission between individuals.
2. Immune and susceptible population
The numbers of immune and susceptible individuals change over time.
3. Unusual lag structures and association patterns
E.g. intermediaries (such as vectors) on the causal pathway.
4. Control for seasonality and long term trends
Time splines may impede detection of longer lag effects of exposures.
5. Overdispersion
Very large dispersion parameter.
Time series susceptible-infectious-recovery (TSSIR) model*
$$Y_t = \beta_t \, Y_{t-1}^{\alpha} \left(\frac{S_{t-1}}{N_{t-1}}\right)^{\gamma} \varepsilon_t$$
$$S_t = N_t - \sum_{i=0}^{m} Y_{t-i}\,\kappa_i$$
where $N$: the total population size,
$S$: the number of susceptible individuals,
$\beta_t$: pathogen transmissibility at time $t$,
$m$: the maximum duration of immunity,
$\alpha, \gamma$: mixing parameters, and
$\kappa$: the immunity decay function
Taking logs leaves something like a TSR:
$$\log Y_t \cong \log \beta_t + \alpha \log Y_{t-1} - \frac{\gamma}{N_t} \sum_{i=1}^{m} Y_{t-i}\,\kappa_i + \log \varepsilon_t$$
* Koelle K, Pascual M. Am Nat 2004;163(6):901-13.
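One way to see where the approximate log form comes from is sketched below. It assumes that the immune fraction of the population is small, so that log(1 - u) is close to -u, and that N changes slowly between adjacent time points; neither assumption is stated on the slide.

```latex
\begin{align*}
\log Y_t &= \log\beta_t + \alpha\log Y_{t-1}
            + \gamma\log\frac{S_{t-1}}{N_{t-1}} + \log\varepsilon_t \\
\log\frac{S_{t-1}}{N_{t-1}}
         &= \log\Bigl(1 - \frac{1}{N_{t-1}}\sum_{i=0}^{m} Y_{t-1-i}\,\kappa_i\Bigr)
          \approx -\frac{1}{N_{t-1}}\sum_{i=0}^{m} Y_{t-1-i}\,\kappa_i
\end{align*}
```

Substituting the second line into the first, relabelling the lag index and treating N at t-1 as approximately N at t gives a sum of past cases weighted by gamma/N, which is the TSR-like form. The exact index range of the sum depends on how kappa is indexed.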
1. Strong autocorrelation
Usual TSR approach:
- Lagged model residuals
Alternative suggested by TSSIR:
$$Y_t = \beta_t \, Y_{t-1}^{\alpha} \left(\frac{S_{t-1}}{N_{t-1}}\right)^{\gamma} \varepsilon_t$$
$$\log Y_t \cong \log \beta_t + \alpha \log Y_{t-1} - \frac{\gamma}{N_t} \sum_{i=1}^{m} Y_{t-i}\,\kappa_i + \log \varepsilon_t$$
• i.e. use lagged log(Y) as an x-variable
1. Strong autocorrelation (continued)
[Figure: Tokyo influenza data and model residuals, panels A-D]
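The TSSIR-motivated alternative to lagged residuals, a lagged log count used as a covariate, could be built along these lines. This is a sketch only: the function name is hypothetical, and the offset added before taking logs (needed when a count can be zero) is an assumption whose value is a modelling choice.

```python
import math

def lagged_log_counts(counts, lag=1, offset=1.0):
    """Return log(Y_{t-lag} + offset) for each t.

    Entries for which the lagged count does not exist are None,
    so those time points can be dropped before model fitting.
    """
    out = []
    for t in range(len(counts)):
        if t < lag:
            out.append(None)
        else:
            out.append(math.log(counts[t - lag] + offset))
    return out

# Hypothetical weekly counts
ac_term = lagged_log_counts([0, 3, 9, 20], lag=1, offset=1.0)
```

The resulting series would enter the regression as one more column alongside the exposure and the time smooth.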
2. Immune & susceptible population
A: The problem, an ad hoc solution, and idea
Issue: disease -> immunity -> susceptibles ▼ -> disease ▼
E.g. norovirus (Lopman 2009): ~1 year duration of immunity
→ "Immune population factor": $x_{ipf} = \sum_{l \in \text{last year}} Y_{t-l}$
A simplified form of Koelle's TSSIR model, with $\kappa_i = 1$ for lags within the last year and 0 otherwise:
$$\sum_{i=0}^{m} Y_{t-i}\,\kappa_i = \sum_{l \in \text{last year}} Y_{t-l}$$
2. Immune & susceptible population
B: Options
• Rely on the time smooth to account for changes in susceptibles
• Use some (weighted) sum of past cases as an x-variable (e.g. cases so far in season)
• Consider only the timing of onset or peak of the epidemic (e.g. does cold weather precipitate an early flu peak?) [~ survival analysis with time-dependent covariates]
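Two simple versions of the "sum of past cases" covariate can be sketched as follows: a sum over the previous year, and cases so far in the current season. Both function names are hypothetical. Excluding the current time point from the sum, the 52-week window, and the way seasons are labelled are all assumptions made for illustration.

```python
def immune_population_factor(counts, window=52):
    """Sum of cases over the previous `window` time points (lags 1..window).

    Near the start of the series fewer past points are available,
    so the sum covers only those that exist.
    """
    out = []
    running = 0
    for t in range(len(counts)):
        out.append(running)
        running += counts[t]
        if t - window >= 0:
            running -= counts[t - window]
    return out

def cases_so_far_in_season(counts, season_ids):
    """Cumulative cases within each season, excluding the current time point."""
    out = []
    total = 0
    previous = None
    for y, season in zip(counts, season_ids):
        if season != previous:
            total = 0
            previous = season
        out.append(total)
        total += y
    return out

# Hypothetical data: four weeks with a two-week window, and two short seasons
ipf = immune_population_factor([1, 2, 3, 4], window=2)
cum = cases_so_far_in_season([5, 1, 2, 7, 3], [0, 0, 0, 1, 1])
```

A weighted version would replace the plain sums by sums with decaying weights, playing the role of kappa in the TSSIR model.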
3. Unusual lag structures and
association patterns
Example:
Lyme disease ~ rain at 2-year lag (Subak 2003)
• Be informed by prior knowledge.
- Biological plausibility and preceding studies can inform the duration of delayed effects and the nature of association patterns (linear, U-, J-shaped)
• Be prepared to use flexible lag/shape functions.
- DLNMs (distributed lag non-linear models)
Unusual lag and shape (cholera)
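Flexible lag functions start from a matrix of lagged exposures. The sketch below builds that matrix and then applies the simplest constraint, averaging exposure within lag strata. It is not a DLNM implementation; the function names and the choice of strata are hypothetical, and a real analysis would typically use spline bases over both lag and exposure.

```python
def lag_matrix(x, max_lag):
    """Row t holds [x_t, x_{t-1}, ..., x_{t-max_lag}].

    Rows without a complete exposure history are None.
    """
    rows = []
    for t in range(len(x)):
        if t < max_lag:
            rows.append(None)
        else:
            rows.append([x[t - l] for l in range(max_lag + 1)])
    return rows

def lag_strata_means(row, strata):
    """Average the lagged exposures within each stratum.

    `strata` is a list of (first_lag, last_lag) pairs with inclusive bounds.
    """
    return [sum(row[a:b + 1]) / (b - a + 1) for a, b in strata]

# Hypothetical exposure series
lags = lag_matrix([1, 2, 3, 4, 5], max_lag=2)
strata_row = lag_strata_means(lags[4], [(0, 0), (1, 2)])
```

Each stratum mean becomes one regression column, so the number of lag coefficients stays small even when the maximum lag is long.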
4. Control for seasonality and long term trends
• Choose a suitable time unit for the data, or respect restrictions (week or month rather than day)
- The time unit should be biologically plausible (e.g. given the incubation period) for the hypothesized lag of association
- Attention to detail: 52 or 53 weeks per year
• Avoid letting the time spline deplete the precision with which longer lag effects of exposures are estimated
- Use separate functions for seasonal and long term patterns when studying long lag effects (Fourier terms for seasonality and simple functions for long term trend)?
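Fourier terms for the seasonal pattern can be generated as sine/cosine pairs, as in this sketch. The function name is hypothetical, and the period of 365.25/7 weeks is one way of handling the 52-versus-53-weeks issue, offered as an assumption rather than as the authors' choice.

```python
import math

def fourier_terms(times, period, n_harmonics):
    """Return one row per time point: [sin_1, cos_1, ..., sin_K, cos_K]."""
    rows = []
    for t in times:
        row = []
        for k in range(1, n_harmonics + 1):
            angle = 2.0 * math.pi * k * t / period
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows

# Weekly data: an average year is 365.25 / 7 weeks long
WEEKS_PER_YEAR = 365.25 / 7
seasonal = fourier_terms(range(104), WEEKS_PER_YEAR, n_harmonics=2)
```

These columns would be combined with a simple long term trend (for example linear or quadratic in time), leaving less flexibility to absorb slow exposure effects than a high-degree-of-freedom time spline.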
5. Overdispersion
• Consider different distribution models allowing for overdispersion
- Quasi-Poisson and negative binomial
- Zero-inflated Poisson and negative binomial: for an excessive number of zero counts
- Gaussian linear model for log(Yt): if outcome counts are consistently large enough
Choosing overdispersion model
Ver Hoef's (2007) method
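A sketch of two diagnostics that may help here: the Pearson dispersion statistic, and a binned comparison of squared residuals against fitted means. As we understand it, the second is in the spirit of Ver Hoef's suggestion: a roughly linear relation between variance and mean points towards quasi-Poisson, a quadratic one towards negative binomial. The function names and the equal-size binning are assumptions for illustration.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

def binned_variance_mean(y, mu, n_bins):
    """Sort by fitted mean, split into bins, and return
    (mean fitted value, mean squared residual) for each bin.

    Assumes len(y) >= n_bins; the last bin takes any remainder.
    """
    pairs = sorted(zip(mu, y))
    size = len(pairs) // n_bins
    out = []
    for b in range(n_bins):
        if b < n_bins - 1:
            chunk = pairs[b * size:(b + 1) * size]
        else:
            chunk = pairs[b * size:]
        mean_mu = sum(m for m, _ in chunk) / len(chunk)
        mean_sq = sum((v - m) ** 2 for m, v in chunk) / len(chunk)
        out.append((mean_mu, mean_sq))
    return out

# Hypothetical observed counts and fitted means
obs = [2, 4, 6, 12]
fitted = [2.0, 2.0, 8.0, 8.0]
phi = pearson_dispersion(obs, fitted, n_params=1)
bins = binned_variance_mean(obs, fitted, n_bins=2)
```

Plotting the second element of each pair against the first gives the variance-to-mean picture used to choose between the two families.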
Approaches other than TSR
• TSSIRs
• ARIMA etc (including fractional ARIMA)
• Wavelets
Summary of models as fitted to Tokyo influenza data

Model | Outcome (Y) | Predictor: AC term | Predictor: immune term | Distribution | Temperature effect estimate % (95% CI) [b] | Dispersion parameter
TSR with quasi-Poisson
standard TSR | count | - | - | QP | -5.8 (-10.9, -0.5) | 349.4
+ autocorrelations (AC) [d] | count | residual_{t-1} | - | QP | -3.7 (-6.9, -0.5) | 118.4
+ AC | count | Y_{t-1} | - | QP | -4.8 (-8.6, -0.8) | 188.7
+ AC | count | log(Y_{t-1} + 1) | - | QP | -5.5 (-7.5, -3.4) | 48.4
+ AC | count | log(Y_{t-1} + 0.5) | - | QP | -5.5 (-7.5, -3.4) | 49.0
+ AC + immune term | count | log(Y_{t-1} + 1) | Σ(cases*) | QP | -6.7 (-8.7, -4.6) | 47.1
up to onsets | count | log(Y_{t-1} + 1) | - | QP | -5.7 (-12.0, 1.0) | 9.0
up to peaks | count | log(Y_{t-1} + 1) | - | QP | -4.5 (-8.9, 0.1) | 67.7
TSR with different distribution models
negative binomial | count | log(Y_{t-1} + 1) | Σ(cases*) | NB | -6.6 (-10.0, -3.0) | na
linear regression | log(count+1) | log(Y_{t-1} + 1) | Σ(cases*) | Gaussian | -5.2 (-8.6, -1.6) | na
non-TSR models [e]
Onset: logistic regression | onset (1,0) | log(Y_{t-1} + 1) | - | Bernoulli | -19.5 (-49.9, 29.4) | na
Onset: cox regression | onset (1,0) | log(Y_{t-1} + 1) | - | CB | -16.0 (-46.7, 32.4) | na
Peak: logistic regression | peak (1,0) | log(Y_{t-1} + 1) | - | Bernoulli | -50.7 (-75.3, -1.7) | na
Peak: cox regression | peak (1,0) | log(Y_{t-1} + 1) | - | CB | -51.6 (-78.3, 7.7) | na
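Percentage effects like those reported for temperature are usually obtained from a log-link coefficient by exponentiating. The sketch below shows that conversion under stated assumptions: `beta` is the coefficient per unit of exposure, `se` is its standard error (already scaled for overdispersion if needed), and the interval is a Wald interval. The transcript does not say what exposure contrast the reported estimates refer to, so the `delta` argument and the function name are hypothetical.

```python
import math

def percent_change(beta, se, delta=1.0, z=1.96):
    """Convert a log-scale coefficient to a percent change for a
    `delta`-unit increase in exposure, with a Wald confidence interval.
    """
    def to_pct(b):
        return 100.0 * (math.exp(b * delta) - 1.0)
    estimate = to_pct(beta)
    bounds = (to_pct(beta - z * se), to_pct(beta + z * se))
    return estimate, min(bounds), max(bounds)

# Hypothetical coefficient and standard error
est, lower, upper = percent_change(beta=-0.05, se=0.02)
```

For a Gaussian model of log(count + 1) the same conversion is only approximate, because the added constant changes the interpretation of the coefficient.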
Conclusions and discussions
• TSR models can be used for studies of infectious disease and weather, but may require modification.
• TSR is not dominant in infectious disease time series epidemiology; TSSIRs and ARIMA are used as frequently.
• Future: TSSIR-TSR hybrid?