Continuous-time microsimulation in longitudinal analysis
Frans Willekens
Netherlands Interdisciplinary Demographic Institute (NIDI)
ESF-QMSS2 Summer School “Projection methods for ethnicity and immigration status”, Leeds, 2 – 9 July 2009

What is microsimulation? A sample of a virtual population
• Real population vs virtual population
  – Virtual population is generated by a mathematical model
  – If model is realistic: virtual population ≈ real population
• Population dynamics
  – Model describes dynamics of a virtual (model) population
    • Macrosimulation: dynamics at population level
    • Microsimulation: dynamics at individual level (attributes and events, i.e. transitions)

Discrete-event simulation
• What is it?: “the operation of a system is represented as a chronological sequence of events. Each event occurs at an instant in time and marks a change of state in the system.” (Wikipedia)
• Key concept: event queue: the set of pending events organized as a priority queue, sorted by event time.

Types of observation
• Prospective observation of a real population: longitudinal observation
  – In discrete time: panel study
  – In continuous time: follow-up study (event recorded at time of occurrence)
• Random sample (survey vs census)
  – Cross-sectional
  – Longitudinal: individual life histories

Longitudinal data
Sequences of events, sequences of states (lifepaths, trajectories, pathways)
• Transition data: transition models or multistate survival analysis or multistate event history analysis
  – Discrete time
    • Transition probabilities
    • Probability models (e.g. logistic regression)
    • Transition accounts
  – Continuous time
    • Transition rates
    • Rate models (e.g. exponential model; Gompertz model; Cox model)
    • Movement accounts
• Sequence analysis: Abbott: represent trajectory as a character string and compare sequences

Why continuous time? When exact dates are important
• Some events trigger other events. Dates are important to determine causal links.
• Duration analysis: duration measured precisely or approximately
  – Birth intervals
  – Employment and unemployment spells
  – Poverty spells
  – Duration of recovery in studies of health intervention
• To resolve the problem of interval censoring
  – Time to the ‘event’ of interest is often not known exactly but is only known to have occurred within a defined interval.

What is continuous time?
• Precise date (month, day, second)
  – Month is often an adequate approximation => discrete time converges to continuous time
• Transition models: dependent variable
  – Probability of event (in time interval): transition probabilities
  – Time to event (waiting time): transition rates

Time to event (waiting time) models in microsimulation
• Examples of simulation models with events in continuous time (time to event)
  – Socsim (Berkeley)
  – Lifepaths (Statistics Canada)
  – Pensim (US Dept. of Labor)
“Choice of continuous time is desirable from a theoretical point of view.” (Zaidi and Rake, 2001)

Time to event (waiting time) models in microsimulation
Time to event is generated by transition rate model
• Exponential model: (piecewise) constant transition (hazard) rate
• Gompertz model: transition rate changes exponentially with duration
• Weibull model: transition rate is a power function of duration
• Cox semiparametric model
• Specialized models, e.g. Coale-McNeil model

Time to event is generated by transition rate model
How?
Inverse distribution function or Quantile function

Quantile functions
• Exponential distribution (constant hazard rate)
  – Distribution function: $F(t) = 1 - \exp[-\mu t]$, with $\mu$ the transition rate
  – Quantile function: $G(\alpha) = -\dfrac{\ln[1 - \alpha]}{\mu}$, with $\alpha$ a probability
• Cox model
  – Distribution function: $F(t \mid Z) = 1 - \exp\left[-H_0(t)\,\exp(\beta' Z)\right]$
  – Quantile function: $G(\alpha \mid Z) = H_0^{-1}\left[-\dfrac{\ln(1 - \alpha)}{\exp(\beta' Z)}\right]$
  – Parameterize baseline hazard

Two- or three-stage method
• Stage 1: draw a random number (probability) from a uniform distribution
• Stage 2: determine the waiting time from the probability using the quantile function
• Stage 3:
  – in case of multiple (competing) events: event with lowest waiting time wins
  – in case of competing risks (same event, multiple destinations): draw a random number from a uniform distribution to determine the destination

Illustration
• If the transition rate is 0.2, what is the median waiting time to the event?
  $F(t) = 1 - \exp[-\mu t] = 1 - \exp[-0.2\,t] = 0.5$
  $G(\alpha) = -\dfrac{\ln[1 - \alpha]}{\mu} = -\dfrac{\ln[1 - 0.5]}{0.2} = 3.466$ years
  The expected waiting time is $\dfrac{1}{\mu} = \dfrac{1}{0.2} = 5$ years

Illustration
Exponential model with $\mu = 0.2$ and 1,000 draws
[Figure: survival probability (Surv_sample, Surv_true; axis 0 to 1) and hazard rate (Haz_sample by duration, Haz_sample mean, Haz_true; axis 0 to 0.4) plotted against duration (0 to 20)]

Table 1 Number of occurrences, given $\mu = 0.2$
Random samples of 1000 transitions
Number of subjects by number of occurrences within a year

Number of occurrences        Random     Random     Random     Expected
within a year                sample 1   sample 2   sample 3   values
0                            829        797        828        819
1                            152        189        153        164
2                            18         12         17         16
3                            1          2          2          1
4                            0          0          0          0
5                            0          0          0          0
Total                        1000       1000       1000       1000
Total number of occurrences
within a year                191        219        193        200

Table 1 Times to transition
Random samples of 1000 transitions and expected values

Occurrence                   Random     Random     Random     Expected
                             sample 1   sample 2   sample 3   values
1                            0.504      0.478      0.483      0.483
2                            0.672      0.705      0.700
3                            0.960      0.740      0.596
4                            -          -          -
5                            -          -          -

Multiple origins and multiple destinations
[Figure: State probabilities; number of subjects (0 to 1000) in each state (Healthy, Disabled, Dead) by time (0 to 10)]

Lifepaths during 10-year period
Sample of 1,000 subjects; $\mu = 0.2$

Pathway   Number   Name     Mean age at transition
1         325      HD       4.24D
2         217      H
3         161      H+       4.03+
4         150      HD+      2.68D 5.67+
5         84       HDH      3.32D 6.79H
6         40       HDHD     2.36D 4.88H 7.25D
7         11       HDH+     1.96D 4.15H 5.77+
8         7        HDHDH    1.49D 2.85H 5.74D 7.67H
9         3        HDHD+    1.64D 3.86H 4.97D 6.78+
10        2        HDHDHD   3.38D 3.92H 6.50D 8.04H 8.17D

Conclusion
• Microsimulation in continuous time is made simple by methods of survival analysis / event history analysis.
• The main tool is the inverse distribution function or quantile function.
• Duration and transition analysis in virtual populations is not different from that in real populations.
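
Sketch: the two- or three-stage method in code
The two- or three-stage method described in the slides can be sketched in Python for the exponential model. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the event labels, the death rate of 0.05 and the fixed seed are hypothetical choices made for the example. Only the rate 0.2 and the exponential quantile function come from the slides.

```python
import math
import random


def exponential_quantile(alpha, rate):
    """Waiting time t at which F(t) = 1 - exp(-rate * t) equals alpha."""
    return -math.log(1.0 - alpha) / rate


def draw_waiting_time(rate, rng):
    """Stage 1: draw a uniform probability. Stage 2: apply the quantile function."""
    alpha = rng.random()  # uniform on [0, 1), so 1 - alpha is never zero
    return exponential_quantile(alpha, rate)


def next_event(rates, rng):
    """Stage 3 for competing events: the event with the lowest waiting time wins.

    `rates` maps a hypothetical event label to its constant transition rate.
    """
    times = {event: draw_waiting_time(rate, rng) for event, rate in rates.items()}
    winner = min(times, key=times.get)
    return winner, times[winner]


rng = random.Random(2009)  # fixed seed (an assumption) so that runs are repeatable

# Median waiting time for a transition rate of 0.2, as in the illustration.
median = exponential_quantile(0.5, 0.2)
print(round(median, 3))  # 3.466

# 1,000 draws: the sample mean should lie close to the expected value 1 / 0.2 = 5.
draws = [draw_waiting_time(0.2, rng) for _ in range(1000)]
sample_mean = sum(draws) / len(draws)

# Two competing events; the rate 0.05 for death is an assumed value.
event, wait = next_event({"disability": 0.2, "death": 0.05}, rng)
```

Only the uniform draw is random. Replacing `exponential_quantile` with the quantile function of another rate model (Gompertz, Weibull, or a Cox model with a parameterized baseline hazard) changes stage 2 and leaves the other stages as they are.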