An Introductory Lecture to
Environmental Epidemiology
Part 2. Time Series Studies
Mark S. Goldberg
INRS-Institut Armand-Frappier,
University of Quebec, and
McGill University
July 2000
• The first example in Part 1 (the landfill site
study) dealt with analyses of airborne
pollutants using geographic regions to
represent presumptive levels of exposure.
Standard methods for ecological studies
(comparison of rates by region) as well as
more complex methods using higher quality
data (population-based case-control study)
were compared.
• The second study (the Harvard Six Cities
Study) used the city as the geographical
unit, and a complex longitudinal cohort
analysis compared adjusted mortality rates
between cities. Because of the relative
homogeneity of ambient fine particles, it is
likely that there was less exposure
misclassification than in the landfill site
study.
• The following example shows an analysis
of air pollution over a circumscribed
geographic region across time. The purpose
is to determine whether daily counts of
death increase if air pollution on the same
day or on previous days increases. Thus, the
study is used to investigate acute effects
rather than chronic effects.
Example: Time Series Studies
• Objective: To determine whether the daily
number of deaths increases when air
pollution increases on that day or on
preceding days.
• Method: Juxtapose a time series of deaths
with a time series of air pollution.
• The target population consists of all persons
living in a well-circumscribed geographical
area.
• The study is not entirely ecological in that there
are no comparisons by place. There are,
however, comparisons by time.
• There are no denominators: daily counts of
death, not rates, are analyzed.
• Exposures: Daily measurements from
fixed-site monitors
• Confounding factors: Any factor that
varies on short time scales and is associated
with daily mortality (e.g., weather patterns,
influenza epidemics). Smoking cannot be a
confounding variable unless patterns of
consumption change on the scale of days.
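The juxtaposition of the two series on the same day or on preceding days can be sketched as follows. This is a minimal illustration, not the code used in the study; the function name `pair_with_lag` and the toy numbers are hypothetical.

```python
def pair_with_lag(deaths, pollution, lag):
    """Pair deaths on day t with pollution measured on day t - lag.

    Days that have no pollution value `lag` days earlier are dropped.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if len(deaths) != len(pollution):
        raise ValueError("both series must cover the same days")
    return [(deaths[t], pollution[t - lag]) for t in range(lag, len(deaths))]


# Toy series (hypothetical numbers, one value per day).
deaths = [41, 39, 45, 50, 43]
pollution = [2.0, 3.5, 6.0, 4.0, 2.5]

same_day = pair_with_lag(deaths, pollution, lag=0)     # exposure on the day of death
one_day_lag = pair_with_lag(deaths, pollution, lag=1)  # exposure on the previous day
```

Each lag gives a slightly shorter set of pairs, because the first `lag` days have no earlier exposure to match.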
Time Series Plot of All
Nonaccidental Mortality
• The following graph shows the time series
of daily counts of nonaccidental deaths
among residents of Montreal, Quebec,
who died in the city during the 10-year
period 1984-1993. There are peaks in the
winter. The solid line is a running smooth
showing the slow increase in mortality over
time.
[Figure: number of deaths per day (y-axis) versus days from January 1, 1984 (x-axis).]
Time Series Plot of the
Coefficient of Haze
• The following graph shows the time series
of the coefficient of haze (a measure of
ambient carbon particles), by day. Daily
means from about 11 monitoring stations in
the city are combined in the plot. As with
mortality, there are peaks in the winter. The
solid line is a LOESS smooth showing the
secular trend in air pollution in Montreal
from 1984-1993.
[Figure: coefficient of haze, in linear meters (y-axis), versus number of days since January 1, 1984 (x-axis).]
• Analysis: Must account for
– non-independence of daily counts of
death (serial autocorrelation)
– overdispersion relative to a Poisson process
(variance > mean number of daily deaths)
• Method of Analysis: Poisson regression
(using quasi-likelihood; see Hastie and
Tibshirani, 1990).
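Both features can be checked with simple descriptive statistics before any model is fitted. The sketch below is an illustration only: the function names and the counts are hypothetical, and on raw counts the variance-to-mean ratio also reflects trend and seasonality, not just extra-Poisson variation.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 for Poisson counts, above 1 if overdispersed."""
    return pvariance(counts) / mean(counts)


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a positive lag (in days)."""
    centre = mean(series)
    denominator = sum((x - centre) ** 2 for x in series)
    numerator = sum(
        (series[t] - centre) * (series[t - lag] - centre)
        for t in range(lag, len(series))
    )
    return numerator / denominator


# Hypothetical daily counts with a slow drift, so neighbouring days resemble each other.
counts = [30, 32, 35, 39, 44, 48, 51, 49, 45, 40, 36, 33]
ratio = dispersion_index(counts)
lag_one = autocorrelation(counts, lag=1)
```

A ratio above 1 and a clearly positive lag-1 autocorrelation are the two departures from an independent Poisson process that the analysis has to handle.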
• Statistical model:
– $\log E(Y_i) = \alpha + f_1(\text{time}_i) + f_2(\text{meteorology}_i) + f_3(\text{pollution}_i) + \dots$
– Covariance corrected for non-Poisson
variation
• This is a complex statistical model that
makes use of quasi-likelihood within the
context of Generalized Additive Models
(Hastie & Tibshirani, 1990). The Yi variables
are counts of deaths that, because of
clustering in time, do not follow a Poisson
distribution (mean ≠ variance). The
quasi-likelihood method allows the estimation of
a dispersion parameter that corrects the
covariance for this non-Poisson variation.
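One common way to estimate a dispersion parameter under quasi-likelihood is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below assumes that estimator; the lecture does not say which one was used, and the function name, counts, fitted values, and standard error are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Estimate the dispersion parameter as Pearson chi-square / (n - p)."""
    n = len(observed)
    if n <= n_params:
        raise ValueError("need more observations than fitted parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / (n - n_params)


# Hypothetical observed daily counts and fitted means from a Poisson-type model.
observed = [38, 47, 41, 55, 33, 44]
fitted = [40.0, 42.0, 43.0, 45.0, 41.0, 42.0]
phi = pearson_dispersion(observed, fitted, n_params=2)

# With Var(Y) = phi * mean, Poisson standard errors are scaled by sqrt(phi).
naive_se = 0.010  # hypothetical standard error from a plain Poisson fit
corrected_se = naive_se * phi ** 0.5
```

In an additive model with smooth terms, `n_params` would be replaced by the effective degrees of freedom of the fit.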
• The second term is referred to as a temporal
filter. It is an arbitrary function of the data,
and its functional form is governed by the
data itself and by the amount of smoothing
that is required. This term is used to adjust
out the seasonal and sub-seasonal patterns
in the mortality time series. In addition, it
will remove serial autocorrelation, which
can be substantial, thereby making the Yi
independent.
• The smoothing parameter can be
determined by appealing to the Bartlett
statistic to determine whether the residual
mortality time series is consistent with a
white noise process (Priestley, 1981).
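A white noise check of this kind can be sketched with the cumulative periodogram: for white noise it rises roughly along a straight line, and the largest departure from that line is the test statistic. This is a simplified illustration, not the study's implementation. The function name is hypothetical, and the threshold of about 1.36 / sqrt(m) is a large-sample approximation to the 5% level that is assumed here.

```python
import cmath
import math
import random


def bartlett_statistic(residuals):
    """Largest gap between the normalised cumulative periodogram and the
    straight line expected for white noise.

    Returns (statistic, number_of_frequencies).
    """
    n = len(residuals)
    centre = sum(residuals) / n
    x = [r - centre for r in residuals]
    m = (n - 1) // 2  # Fourier frequencies strictly between 0 and 1/2
    power = []
    for k in range(1, m + 1):
        coef = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coef) ** 2)
    total = sum(power)
    if total == 0:
        raise ValueError("residuals are constant")
    running, distance = 0.0, 0.0
    for j, p in enumerate(power, start=1):
        running += p
        distance = max(distance, abs(running / total - j / m))
    return distance, m


# Simulated examples: seeded Gaussian noise and a purely periodic series.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
seasonal = [math.sin(2 * math.pi * t / 20) for t in range(200)]

d_noise, m = bartlett_statistic(noise)
d_seasonal, _ = bartlett_statistic(seasonal)
threshold = 1.36 / math.sqrt(m)  # assumed approximate 5% critical value
```

In practice the amount of smoothing in the temporal filter would be increased until the statistic for the residual series falls below the threshold.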
• The third term is also a smooth
nonparametric function of the data, and is
used to adjust for the effects of weather and
other variables that vary over short time
scales.
Filtered Mortality by Time
• The first goal of the statistical analysis is to
remove seasonal and sub-seasonal
variations in mortality that are unrelated to
air pollution. By judiciously choosing a
filter in the statistical model (the f1 term in
the model) using Bartlett’s test, we are able
to produce a residual time series that is
consistent with a white noise process.
[Figure: ratio of observed to fitted number of deaths (y-axis) versus time in days (x-axis).]
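The idea of dividing observed counts by a smooth fitted curve can be illustrated with a much cruder filter than the one in the lecture. The sketch below swaps in a centred running mean for the f1 term, purely as an assumption for illustration; the function names and counts are hypothetical.

```python
def running_mean(values, half_width):
    """Centred running mean; the window shrinks near the ends of the series."""
    n = len(values)
    smooth = []
    for t in range(n):
        window = values[max(0, t - half_width): min(n, t + half_width + 1)]
        smooth.append(sum(window) / len(window))
    return smooth


def observed_over_fitted(counts, half_width):
    """Ratio of each daily count to the smoothed (fitted) value for that day."""
    fitted = running_mean(counts, half_width)
    return [y / f for y, f in zip(counts, fitted)]


# Hypothetical daily counts with an upward trend.
counts = [30, 34, 31, 36, 40, 38, 45, 43, 47, 50]
ratios = observed_over_fitted(counts, half_width=2)
```

After filtering, the ratios scatter around 1 with no visible trend, which is the behaviour the filtered series is meant to show.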
• The regression then adjusts for the effects of
the weather variables (the f2 term) and the
resultant estimate for air pollution is
unbiased.
Exposure-response function for
nonaccidental mortality and COH
• The following graph shows the residual
effect of coefficient of haze on
nonaccidental mortality. This represents the
fourth term in the statistical model (the f3
term). The y-axis is on a natural logarithmic
scale. The solid line represents the
nonparametric smooth function and the
broken lines are pointwise 95% confidence
intervals.
[Figure: natural logarithm of daily counts of death (y-axis) versus coefficient of haze (x-axis).]
Nonaccidental Deaths by Age
Group
• The following graph shows the estimated
increase in daily mortality (in percent) for
an increase in air pollution equal to the
inter-quartile range for a variety of indices
of ambient air particles, by age group. The
upper and lower horizontal bars represent 95%
confidence limits; vertical bars that do not
cross zero indicate statistically significant effects.
[Figure: percent increase in daily mortality (y-axis, -2 to 4) by age group (< 65, > 65) for COH, extinction, predicted PM2.5, PM2.5, predicted sulfate, and Sutton sulfate.]
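The percent increase for an inter-quartile-range rise in a pollutant can be computed from a log-linear slope as 100 × (exp(β × IQR) − 1). The sketch below assumes the pollutant enters the model as a linear term on the log scale; the function name, slope, standard error, and pollutant values are hypothetical.

```python
import math
from statistics import quantiles


def percent_increase(beta, se, increment, z=1.96):
    """Percent change in mean daily deaths for a given rise in the pollutant.

    beta is a slope on the log scale (per unit of pollutant), se its standard error.
    Returns (estimate, lower 95% limit, upper 95% limit), all in percent.
    """
    def to_percent(b):
        return 100.0 * (math.exp(b * increment) - 1.0)

    return to_percent(beta), to_percent(beta - z * se), to_percent(beta + z * se)


# Hypothetical daily pollutant values; the IQR is the 75th minus the 25th percentile.
pollutant = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.8, 4.6, 5.5, 7.0]
q1, _, q3 = quantiles(pollutant, n=4)
iqr = q3 - q1

# Hypothetical slope and standard error.
estimate, lower, upper = percent_increase(beta=0.008, se=0.003, increment=iqr)
```

When the lower limit is above zero, the confidence interval excludes no effect, which corresponds to a vertical bar that does not cross zero in the plot.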
• The increase in mortality is small, on the
order of a few percent. However, all persons
are exposed, so the attributable risk could
in theory be high. For a discussion of
some of the more technical issues in
interpreting these results, see Schwartz,
1994; Goldberg et al., 2000.
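A rough calculation shows why a small relative increase can still matter when everyone is exposed. With the whole population exposed, the attributable fraction reduces to (RR − 1) / RR, where RR is the rate ratio. The inputs below (a 2% increase and 40 deaths per day) are hypothetical, and the result is read against a counterfactual in which pollution is lower by the same increment every day.

```python
def attributable_fraction(percent_increase):
    """Fraction of deaths attributable to exposure when everyone is exposed."""
    rate_ratio = 1.0 + percent_increase / 100.0
    return (rate_ratio - 1.0) / rate_ratio


# Hypothetical inputs: a 2% increase in mortality and 40 deaths per day on average.
fraction = attributable_fraction(2.0)
deaths_per_year = 40 * 365
attributable_deaths = fraction * deaths_per_year
```

Under these assumed inputs, the attributable fraction is just under 2%, which still amounts to a few hundred deaths per year in a city of this size.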
References
• Environmental epidemiology
• Hertz-Picciotto, I. “Environmental Epidemiology”,
in Rothman and Greenland: Modern
Epidemiology, Second edition, Lippincott-Raven
Publishers, 1998, Philadelphia, Chapter 28, pages
555-583.
• Generalized additive models
• Hastie, T., and Tibshirani, R. (1990). Generalized
Additive Models. London: Chapman and Hall. These
models are implemented in Splus (http://www.mathsoft.com).
• Statistical considerations in time series studies
• Priestley, M.B. Spectral Analysis and Time Series.
1981, Academic Press, New York.
• Schwartz, J. Nonparametric smoothing in the
analysis of air pollution and respiratory illness.
Can J Stat 1994; 22:471-87.
• Goldberg, M.S., Bailar, J.C. III, Burnett, R.,
Brook, J., Tamblyn, R., Bonvalot, Y., Ernst, P.,
Flegel, K.M., Singh, R., and Valois, M.-F. (2000).
Identifying subgroups of the general population
that may be susceptible to short-term increases
in particulate air pollution: A time series study
in Montreal, Quebec. Health Effects Institute,
Cambridge, MA (http://www.healtheffects.org).