Download APPSIM - Modelling fertility and mortality

Document related concepts
no text concepts found
Transcript
National Centre for Social and Economic Modelling
• University of Canberra •
APPSIM - Modelling fertility and
mortality
Sophie Pennec and Bruce Bacon
APPSIM Working Paper No. 7
December 2007
About NATSEM
The National Centre for Social and Economic Modelling was
established on 1 January 1993, and supports its activities through
research grants, commissioned research and longer term contracts
for model maintenance and development with the federal
departments of Family and Community Services,
Employment and Workplace Relations, Treasury, and
Education, Science and Training.
NATSEM aims to be a key contributor to social and economic
policy debate and analysis by developing models of the highest
quality, undertaking independent and impartial research, and
supplying valued consultancy services.
Policy changes often have to be made without sufficient
information about either the current environment or the
consequences of change. NATSEM specialises in analysing data
and producing models so that decision makers have the best
possible quantitative information on which to base their decisions.
NATSEM has an international reputation as a centre of excellence
for analysing microdata and constructing microsimulation
models. Such data and models commence with the records of real
(but unidentifiable) Australians. Analysis typically begins by
looking at either the characteristics or the impact of a policy
change on an individual household, building up to the bigger
picture by looking at many individual cases through the use of
large datasets.
It must be emphasised that NATSEM does not have views on
policy. All opinions are the authors’ own and are not necessarily
shared by NATSEM.
Director: Ann Harding
© NATSEM, University of Canberra 2007
National Centre for Social and Economic Modelling
University of Canberra ACT 2601 Australia
170 Haydon Drive Bruce ACT 2617
Phone + 61 2 6201 2780 Fax + 61 2 6201 2751
Email [email protected]
Website www.natsem.canberra.edu.au
iii
Abstract
This paper first reviews the latest international practice when modelling fertility and
mortality in dynamic microsimulation models. The next section presents a proposed
order for the modelling of demographic processes in APPSIM, Australia’s new
dynamic microsimulation model. The final section describes the first set of fertility
estimates proposed for use in APPSIM, with these estimates having been calculated
from the HILDA panel dataset.
iv
Author notes
Dr Sophie Pennec is a demographer and researcher at the National Institute of
Demographic Studies (INED, Paris-France) and visiting researcher at The National
Centre for Social and Economic Modelling (NATSEM). Professor Bruce Bacon is an
Adjunct Professor of the University of Canberra at NATSEM.
Acknowledgments
The authors would like to acknowledge comment and input from Ann Harding,
Mandy Yap and Simon Kelly. They would also like to gratefully acknowledge the
funding and support provided by the Australian Research Council (under grant
LP0562493), and by the 13 research partners to the grant: Treasury; Communications,
Information Technology and the Arts; Employment and Workplace Relations; Health
and Ageing; Education, Science and Training; Finance and Administration; Families,
Community Services and Indigenous Affairs; Industry, Tourism and Resources;
Immigration and Citizenship; Prime Minister and Cabinet; the Productivity
Commission; Centrelink; and the Australian Bureau of Statistics.
This paper uses unconfidentialised unit record file from the Household, Income and
Labour Dynamics in Australia (HILDA) survey. The HILDA project was initiated
and is funded by the Commonwealth Department of Families, Community Services
and Indigenous Affairs (FaCSIA) and is managed by the Melbourne Institute of
Applied Economic and Social Research (MIAESR). The findings and views reported
in this paper, however, are those of the authors and should not be attributed to either
FaCSIA or the MIAESR.
v
Contents
Abstract
iii
Author notes
iv
Acknowledgments
iv
1
Introduction
1
2
Demographics in dynamic microsimulation models
2
2.1 Destinie (France)
9
3
4
5
2.2 DYNASIM III (US)
11
2.3 SAGE (UK)
12
2.4 LifePaths (Canada)
14
2.5 DYNAMOD II (Australia)
15
2.6 MOSART (Norway)
17
2.7 SVERIGE (Sweden)
18
2.8 SESIM (Sweden)
18
Structure of Demographics in APPSIM
19
3.1 Starting population
20
3.2 Order of the modules
20
Mortality in APPSIM
23
4.1
23
Mortality
4.2 Calibration/Alignment
24
Fertility in APPSIM
25
5.1
Data on fertility
25
5.2
Fertility in HILDA
28
5.3
Fertility estimates
28
5.4 Calibration/Alignment
39
5.5 Implementation in APPSIM
39
References
42
1
1
APPSIM. Modelling fertility and mortality.
Introduction
This paper has been prepared as one of a series associated with the development of
the Australian Population and Policy Simulation Model (APPSIM). The APPSIM
dynamic population microsimulation model is being developed as part of an
Australian Research Council (ARC) Linkage grant (LP0562493), and will be used by
Commonwealth Government policy makers and other analysts to assess the social
and fiscal policy implications of Australia’s ageing population. The particular focus
of this paper is how to most effectively model fertility and mortality within APPSIM.
Fertility has been a major topic of interest for demographers for some decades. One
reason is that the baby boom that occurred after WWII increased the population from
the end of the 40s to the beginning of the 60s. This bulge in the population was not
regarded as a problem until recently. However, in the near future, this large
population cohort will reach retirement age, resulting in slower economic growth
and increasing the burden of caring for them (in terms of income through retirement
pensions or in terms of health and the need for formal care and carers). As a result of
these pressures, many have questioned the sustainability of current welfare schemes
in some developed countries (e.g. see the studies in Harding and Gupta, 2007). The
baby boomers are also responsible for initiating the ‘baby bust’ — that is, the low
fertility period experienced during the past few decades.
Today, with the legalisation of, and the wide access to, contraception, the number of
children a person has mainly reflects their choice. Of course for some persons, this
choice is constrained by biological factors (sterility), by social factors (no partners
during reproductive life) or even by economic ones. This choice results from the
interaction of a range of different variables — such as family context, number of
siblings, socio-economic status of parents that affect the education level of children,
partnership status, and the socio-economic status of the couple.
When modelling fertility, one must pay attention not only to the number of children
(quantum of fertility) but also to the time schedule of fertility (tempo). The effect of
fertility upon such factors as careers and loss of wages (between a woman who stops
working to raise her children and a woman who remains in the labour force) can
differ greatly. Some parents will move to or choose a part time job. To have a good
representation of the spacing between births is also essential — particularly as some
policy entitlements depend on the number of children aged under a certain age or
when studying kinship and the likely number of children available for caring for
elderly parents. A 65 year old retiree who is not disabled can more easily care for his
APPSIM Working paper no. 5
2
or her disabled parents than a 55 year old person still in labour force, or an 80 year
old person who is also disabled.
2
Demographics in dynamic microsimulation
models
Demographics are very often the backbone of any dynamic microsimulation models.
They are also usually the first step of the simulation. Microsimulation models rely on
individuals, so if the number of persons and their characteristics are not well
estimated, not only the results for population aggregates will be inaccurate but the
results for income, social transfers, and the effects of policy will not be reliable either.
Even in models that are not population-focused, this module is important.
Microsimulation modelling has been used for a long time by demographers to
investigate fertility. Since its definition by Guy Orcutt (Orcutt, 1961)), whose first
model included an extensive and detailed demographic module, a number of
biological and demographic models have been built in the USA (Sheps et al., 1973), in
Sweden (Hyrenius and Adolfsson, 1964; Hyrenius et al., 1966), and in France
(Léridon, 1977). The biological fertility models are women-based and reconstitute the
monthly fecundity and fertility processes during the reproductive life. Recently,
Léridon (Léridon 2004) used such a model to investigate the effect of the
postponement of childbearing age on the total number of children — and, in
particular, on the number of children that people will not have due to age-acquired
sterility that prevents them from having the number of children they would like to
have and its effect on the total fertility rate. The demographic models (Hyrenius et
al., 1966), Schofield (Smith, 1987); and more recently the Family and Fertility
Surveys (Spielauer and Vencatasawmy, 2003) also investigate fertility but using
different data, age and parity-specific fertility rates, instead of using conception rates,
and then following the outcome of each conception. They investigate the fertility
process both from the quantum and tempo point of view.
Historical demography has also heavily used microsimulation methods to
reconstitute the population of the past (Hammel et al., 1990), (Smith, 1987), Ruggles
(Ruggles, 1987), to investigate demographic changes over time, to try to understand
how to reconcile macro and microdata and to understand the dynamics underlying
the results of two datasets of the same population.
A related use, and often one of the main interests of demographers using
microsimulation, is that it allows the study of kinship. Whether they live in the same
household or not, ties between members of a family can be kept and the size and
3
APPSIM. Modelling fertility and mortality.
characteristics of families can be investigated (Hammel et al., 1981), (Imhoff and Post,
1997), (Le Bras, 1973), (Pennec, 1997).
With the ageing population and the potential threat to the care of the disabled
elderly, microsimulation is the perfect tool to estimate the future population of
potential carers. As the ties between the different members of the family are kept,
the number of carers can be computed — not only their number, but also their
characteristics (age, sex, being in the labour force or not, being retired or not, being
disabled themselves or not) (Keefe et al., 2004), (Duée et al., 2005; Tomassini and
Wolf, 2000).
This is why many models use a set of covariates to try to create a better distribution
and include some time and birth spacing variables. This is, of course, critical in the
case of the demographic models, as it is their aim to mimic fertility both in quantum
and in tempo. But this is often not the only goal: more policy oriented models often
also try to have a good representation of fertility and of family structures.
The following section of this paper outlines how demographics have been modelled
in different dynamic microsimulation models. For some, demography is only a
means to achieve a plausible, total number — and, maybe, the age and sex structure
of the population. In that case, the modelling is therefore quite simple. But most
models include quite detailed demographic modelling, because they need to also
track the characteristics of the population as a whole and, very often, household and
the family characteristics. For example, the family is often a decision-making unit in
terms of economic behaviour and such expenses as housing. Welfare benefits,
taxation schemes and some other variables are also often family-based.
Tables 1 and 2 give some examples of how births and deaths have been modelled in
models like APPSIM all around the world. It shows the different modelling
approaches and the covariates that came through the imagination of the modellers.
This is a good starting point when investigating approaches for new modelling, as
some “regularities in the covariates can be seen”.
In summary, Table 1 suggests the modelling of fertility is mainly based on
demographic data, such as age of the mother, marital status, duration since
beginning of the union/marriage, number of previous children, and duration since
previous birth. But, whenever possible, data on socio-economic status are also
included — either by education level or by income level. A third set of variables are
more dedicated to the modelling of the individual’s current situation, such as being
in full time education; being in the labour force, or having a very young child.
Where fertility is concerned, different benchmarks must be used — some period ones
(like total number of births and total fertility rates); and some cohort ones (like the
distribution of women by number of children and completed fertility rates). Not only
APPSIM Working paper no. 5
4
the total number of children and their distribution, but the age at childbearing and
the duration between two births are of importance and must be included in the
modelling (as they can impact on the labour force participation of the mother, or
change the receipt of benefits that vary according to the number or age of children, or
are family related). Childlessness is another important phenomenon that has to be
taken into account, as it has a strong impact of some of the topics related to ageing.
Mortality is a one time event and therefore is easier to model than fertility. Age and
sex mortality rates are the basic indicators used in the models and, whenever
possible, these rates also include health, family and socio-economic variables. Health
variables are mainly related to disability, while socio-economic status can include
higher education level or being in the labour force or occupation (and, for those
retired, previous occupation).
Some models use survival analysis, but most of them use transition rates modelled
by regression techniques (hazard functions or logistic regression equations). We will
not go further in the comparison of models here as our aim when gathering the
covariates used in the different models is not to compare the models. Obviously, the
modelling of each event and the level of detail used depends on one side on the aim
of the model; some objectives and policies may be more or less related to
demographic behaviour. On the other hand, the availability of data severely
constrains modellers, to different degrees in different countries. As Harding points
out: ‘dynamic models are only as good as the data upon which they are based’
(Harding 1993, p. 29).
Table 1: Fertility covariates in selected dynamic microsimulation models
Births
Camsim
set of parity progression ratios, time between marriage and first birth. Only married women can bear a first child
but the birth interval distribution from marriage to first birth permits pre-marital conceptions (Smith, 1987)
CORSIM
Age, birth(t-1), birth(t-2), duration of current marriage, earnings, family income, homeowner status, marital
status, parity, schooling status, work status (F/T, P/T) , have child, marital status, race, work status
Demogen
age, marital status, parity
Destinie
parity 1: duration since end of education, age , education
parity 2 (if same union as parity 1): duration since previous birth, education, age
parity 3 and over (if same union as parity 2): duration since previous birth, education, age, parity
parity 2+ (if not same union as previous birth): duration since new union, age, parity (Duée, 2005)
DYNACAN
Age, marital status, education level, employment status, parity
Dynamite
age, marital status, parity
DYNAMOD
models pregnancy and the outcome of the pregnancy (still born, live birth).
3 distinct kinds of births: premarital, 1st marital and subsequent marital (marital=de jure and de facto)
equations based on women's education participation, educational qualifications, employment status, age.
Marital births are also based on marital status, employment status of the husband, duration of the relationship,
parity, duration since last birth (Abello et al., 2002)
DYNASIM III
Seven equation parity progression model; varies based on marital status; predictors include age, marriage
duration, time since last birth; uses vital rates after age 39 (age-race-parity-specific probabilities); sex of
newborn is assigned by race; (Favreault and Smith, 2004b)
Famsim
Age, Marital status, duration in marital status, in education, duration of schooling, in work, duration work, trend,
duration since last birth (birth parity 2+), parity (birth parity 4+) (Spielauer and Vencatasawmy, 2003)
Harding
age, sex and parity (only for married women) (Harding, 1993)
Irish dynamic microsimulation model
Age, parity, education level, father's education level, marital status
APPSIM. Modelling fertility and mortality
Model
5
APPSIM Working paper no. 5
6
Italian cohort model
age of mother, parity
Japanese dynamic model
age, marital status, parity
kinsim
age, marital status, parity (Imhoff and Post, 1997)
Biological models
age, marital status, time since last pregnancy according to outcome, sterility (Hyrenius and Adolfsson, 1964;
Léridon, 1977)
LIFEMOD
marital status, age, parity progression ratio
LifePaths
age, marital status, parity, time [education, birth cohort, age at marriage, province of birth; the decision to have
a child is modelled and then the different steps towards the birth, 1) each birth occurs nine months after
conception, 2) a spell of infertility lasting three months follows each birth and 3) a new fertile spell begins
immediately after that. As a side benefit, the careful attention given to the timing of these events makes
possible straightforward models of, for example, maternity leaves from employment and marriage following
pregnancy.](Statistics Canada, 2006)
Melbourne
age of the mother, marital status, number of dependants
Microhus
hourly wage rate, disposable income, education, years of work experience, duration since previous child, other
socio-economic variables
MOSART
age, parity, age of youngest child (Fredriksen, 2003)
Nedymas
age, year of birth, marital status, parity
PRISM
marital status, age, parity, previous employment status
SAGE
women living with a partner: age, marital status, duration in marital status, parity and age of youngest child
women not living with a partner: age, participation to full education, parity and age of youngest child
multiple births: age (Scott, 2003)
SESIM
Only women who move from their parents can give birth, age 18-49
First birth: age, marital status, pensionable income (quartiles), indicator for market work, highest education.
Following birth: parity (up to 3. 4 is the maximum number of children modelled), age, marital status,
pensionable income (quartiles), indicator for market work, highest education, age of youngest child(Flood et al.,
2005)
Sfb3
Age, marital status, duration of marriage and parity
SOCSIM
monthly rates/ age, sex, marital status, waiting times between events (Hammel et al., 1990)
SVERIGE
marital status, family earnings, education level, working status (working or not), number of live children already
born, whether a birth occurred last year, whether a birth occurred the year before the last, the country of birth of
the mother, number of years that the mother has spent in Sweden (Holm et al.)
Swedish cohort model
age
Source: (O' Donoghue, 2001) and authors mentioned in the table
Table 2: Mortality covariates in selected dynamic microsimulation models
Mortality Covariates
Anac Model
age, sex
Belgian Dynamic model
age, sex
Camsim
age, sex (Smith, 1987)
CORSIM
Demogen
Age, birth place (US or other), education, employment status, family income, marital status, sex, race, marital
status
age, sex + health module
Destinie
age, age at end of education, sex, disability (with disability module)
DYNACAN
gender, age, education, marital status, employment status, disability status, time, region
Dynamite
age, sex
DYNAMOD
age, sex, disability
DYNASIM II
Married women 45-64: age, race, sex, marital status, education, number of children
others: age, race, sex, marital status, education
DYNASIM III
Three equations; time trend from vital statistics 1982-97; includes socio-economic differentials; separate process
for the disabled based on age, sex and disability duration derived from (Favreault and Smith, 2004a)
Famsim
age, sex, marital status (Spielauer and Vencatasawmy, 2003)
Harding
age, sex, education status (Harding, 1993)
Irish dynamic microsimulation model
Age, occupational group, gender, education
APPSIM. Modelling fertility and mortality
Model
7
APPSIM Working paper no. 5
8
Italian cohort model
age, sex
Japanese dynamic model
age, sex
kinsim
age, sex (Imhoff and Post, 1997)
Biological models
age, sex (Hyrenius and Adolfsson, 1964; Léridon, 1977)
LIFEMOD
age, sex
LifePaths
age, sex, [for one study and for elderly people, disability status, living in nursing home or not]
Microhus
age, sex
Midas
age, ethnicity, gender
MINT
age, time, ethnicity, education, marital status, permanent income, disabled
MOSART
age, sex, disability, marital status, educational attainment (Fredriksen, 2003)
Nedymas
age, sex, marital status
Pensim/2
age, gender
PRISM
disability status, age, sex, years of disability
SAGE
age, sex, social class (Scott, 2003)
SESIM
age 0 to 29 : sex and age
age 30 to 64: sex, age, indicator for early retirement, pensionable income (quintile), marital status
age 64+: sex, age, indicator for early retirement at 64 years of age, marital status, highest level of education.
(Flood et al., 2005)
Age, sex, family status
Sfb3
Age, sex, marital status, family earnings, education level, working status(Holm et al. )
SVERIGE
Swedish cohort model
age, gender
Source: (O' Donoghue, 2001) and authors mentioned in the table.
APPSIM. Modelling fertility and mortality
9
2.1 Destinie (France)
The dynamic microsimulation model DESTINIE has been built by the French bureau
of statistics (INSEE) to analyse the long-run situation of pensioners — accounting for
the heterogeneity of careers and preferences and for changes in the labour market, as
well as for the demographic structure and retirement rules. In this respect it has been
used to simulate the impact of the 2003 French Pension Reforms on, for example, the
age of retirement and the replacement rate, as well as the long-term number of
retirees and the financial equilibrium of the pension system. The scope of the model
is national: it is based on a representative sample of the whole French population.
Although DESTINIE’s primary focus is the simulation of pensions, the model can
cope with distributional aspects of other public policies — especially if one is
interested in a life-cycle approach. In this regard, one of the future developments of
the model is to implement individual health expenditures in order to analyse the
redistributive properties of the French health care system.
DESTINIE is based on individual data derived from the 1998 Financial Assets Survey
collected by INSEE. The original sample has been reweighted to account for the
demographic structure of the whole French population, as known from the 1999
French Census. Thus, the initial sample of DESTINIE contains about
20 000 households and 50 000 individuals. In order to study intergenerational
relationships, individuals from the sample are connected to each other through the
artificial imputation of kinship ties.
The demographic events simulated are union, breaking-up, birth of a child, death
and migration. The parameters of the equations are adjusted so that the probabilities
predicted by the model fit the long-term demographic projections made by INSEE
and the French Demographic Institute (INED).
The demographic events modelled are assumed to be independent from economic
events, in order to give the model its robustness and to be able to change economic
trends without having to change the behaviour included in the equations of
demographic events. Nevertheless, some socio-economic differentials are included
(through the use of the relative age at the end of education attendance).
The demographic events are simulated in the following order: entry of migrants;
death; then, for those living in couple, the possible union disruption is modelled; for
those not living in a couple, they can be candidates to form a new union; and, lastly,
for those living in a couple, they may have births.
Destinie mainly uses the distribution of age at the end of school attendance as a
proxy for socio-economic status. To take into account the changes over time in the
10
APPSIM Working Paper no 5
length of school attendance according to birth cohort, it is relative age at the end of
school attendance that is used to distribute the population into one of three
categories. These are first, people with short studies (studies shorter by more than
one year of the relative age of the birth cohort); second, medium studies (when
studies are lower or higher by less than one year to the relative age of the cohort);
and then, third, long studies (for those who attend school more than one year above
the average estimated for the cohort).
Fertility
The fertility equation estimates are based on the survey of family history run with
the 1999 census. It is a self-reported survey, with 235 000 women and 145 000 men
aged 18 and over. This survey is a retrospective survey, investigating mainly the
fertility and union history. The probabilities are estimated using events arising in
years 1995-1998 (as the model starting date is 1998).
Probabilities of giving birth are related to age, parity, relative age at end of school
attendance, duration since last birth for parity 2 and over, and duration since the end
of school attendance and since the beginning of the union.
For the 1st birth, two equations are estimated, according to the category of the age at
end of school attendance — one for those whose have studied less than the average
of the cohort and one those with medium and long studies. The covariates are age of
the women and the duration since age at end of school attendance (or beginning of
the union if the couple is formed before the end of school attendance).
For the following parity, different equations are estimated according to whether the
union in which the previous birth arises has been disrupted or not.
If there is no union disruption, the probability for the second birth is estimated
separately by length of studies (short vs. medium/long term) and the covariates are
duration since first birth and age. For the probabilities for parities 3 to 6, the length of
studies has been introduced as a covariate, but separate equations were not useful.
When there is a union disruption since the previous birth, the duration used is not
the ‘duration since previous birth’ but the ’duration since the beginning of the new
union’. The length of studies is not included.
APPSIM. Modelling fertility and mortality
11
Mortality
Each year, each individual in the sample has a probability of dying, according to
his/her age, sex, and age at end of school attendance. In the study of ageing and
disability, the probabilities were additionally altered according to disability status.
2.2 DYNASIM III (US)
DYNASIM 3 is a dynamic microsimulation model designed to analyse the long-run
distributional consequences of retirement and ageing issues. The model simulates
Social Security coverage and benefits, as well as pension coverage and participation,
benefit payments and pension assets. It also simulates home and financial assets,
health status, living arrangements and income from non-spouse family members (coresidents). In addition, it calculates SSI eligibility, participation and benefits.
The DYNASIM 3 input file is based on the 1990 to 1993 survey of income and
program participation (SIPP) panels, a self weighting sample of over 100 000 people
and 44 000 families. They then randomly output families based on the panel-adjusted
average person weight. DYNASIM 3 focuses on nuclear families; subfamilies and
unrelated individuals are treated separately.
The DYNASIM 3 model includes three sectors: demographics, economics and the
taxes and benefits. The computer implementation follows the structure of its
predecessor and includes two separates microsimulation models, the Family and
Earnings History (FEH) model and the Jobs and Benefits (JBH) model. The FEH
model processes the full sample once for each year of simulation simulating
demographic and annual labour force behaviour for each individual on the input file.
The output of the FEH model is a set of longitudinal demographic and labour force
histories that is the input of the JBH model.
Fertility.
The DYNASIM 3 fertility module is designed to produce the age-race-sex population
distribution in any future year so that the number of people at risk of paying Oldage, Survivors and Disability Insurance (OASDI) taxes and/or collecting OASI or DI
benefit is correct. It is also designed to produce the proper distribution of birth
timing and sequencing to use as an input for generating career trajectories, especially
for women. The model uses the maximum amount of information possible on the
nature of the current relationship between women’s characteristics and their
childbearing, to model births both historically (1993-2003) and in the future (from
2004 onward).
12
APPSIM Working Paper no 5
The data used to estimate fertility include the National Longitudinal Survey on
Youth (1979-94), Vital Statistics and the OACT (intermediate assumptions of the
OASDI trustees).
The probability of giving birth in a year is conditional on the marital status of the
woman and the number of children she has. Seven equations are estimated
according to sub-groups allowing the full sets of interaction terms between the
defining variables (marital status and parity) and the independent variables. The
equations are for the following groups: 1. unmarried teens at risk of first births; 2. all
other unmarried women at risk of first births; 3. unmarried women at risk of second
births; 4. unmarried women at risk of third or higher births; 5. married women at
risk of first births; 6. married women at risk of second births; and 7. married women
at risk of third or higher births. For women aged 40 and over, the estimates are
derived from vital statistics to assign an age-race-parity specific probability.
Multiple births are assigned by age and race, and the sex of the newborn by race.
Mortality
DYNASIM 3 predicts death using a four-stage process that relies both on micro-level
data and aggregate data. The micro-level data characterize within-cohort sex group
differentials, while aggregate data capture the age-race-sex-specific trends and level.
Different equations of the individual probabilities of dying are estimated according
to sex and age as functions of individual fixed characteristics and some varying
socio-economic attributes. The vital statistics are used to incorporate a time trend
(1982-97). For those receiving disability insurance, the probabilities are assigned
using the estimates derived from aggregate data.
The last stage calibrates the expected probability of death to achieve the targets
produced by the social security actuaries.
2.3 SAGE (UK)
The SAGE dynamic microsimulation model is designed to provide projections to
inform the development of social policy in Britain for the twenty-first century,
focussing on the implications of population ageing for pensions, and issues
regarding health and long-term care needs. It is a full population model, and covers
demographic processes, education, labour market participation, earnings, pension
accumulation, health and disability, and support networks. The model uses a base
data set derived from the 1991 Census of Great Britain, and uses time-based
processing with an annual cycle.
APPSIM. Modelling fertility and mortality
13
The ordering of events
The demographic changes are generated in a series of modules, starting with
mortality, then fertility, union dissolution, and then defacto union formation and
marriage.
Fertility
Both the fertility and partnership transition modules use the British Household Panel
Survey (BHPS) as the source data to generate the probabilities. The BHPS is a multipurpose longitudinal study that began in 1991 with a nationally representative
sample of around 5,500 households and 10,300 individuals.
From the BHPS, one partnership and fertility person-year file was created using
records from 1992 to 1999. Although the woman’s parity is known on the BHPS, only
the observations of the number and ages of her children aged under 16 and currently
in the household (including natural, step and adopted children) at each interview
were used as predictor variables, in order to be consistent with the data available
from the Sample of Anonymised Records of the UK 1991 Census, used as the base
data.
Women who had been widowed in the last year were excluded from the fertility
module. Two predictor equations for the likelihood of giving birth according to
partnership status are estimated. For women living with a partner, an interaction
effect between the woman’s marital status and the duration of her cohabitation with
her partner was accommodated by creating one composite variable. The same
approach was adopted to incorporate an interaction between the number of children
and the age of the youngest child. Participation in education was strongly negatively
associated with the likelihood of a woman not living with a partner giving birth in
the next year. When the other factors were controlled for, neither ethnic group nor
social class was significantly associated with fertility.
Once a woman has been selected to give birth in the coming year a look up table,
based on rates of multiple births in Britain in the early 1990s, is used to select women
for multiple births according to their age. The maximum number of multiple live
births that can be allocated is triplets.
The allocation of the sex of a child follows the sex ratio at birth in Britain of around
1.05 male to female babies. The probability of a new baby being a boy is therefore set
at 0.512
14
APPSIM Working Paper no 5
Mortality
The annual probability of death for individuals depends on their age, sex and social
class. These probabilities are based on abridged life-tables for the years 1987-1991
produced by the Office for National Statistics (ONS) using the ONS Longitudinal
Study (LS). The LS is a dataset produced by record linkage of Census and vital event
information for individuals living in Britain. Then the grouped probabilities are
converted to single year probabilities, using the vital registration data of death rates
by single age and sex in 1991.
2.4 LifePaths (Canada)
LifePaths is a microsimulation model designed to simulate life histories, taking
account of birth, death, immigration status, inter-provincial migration, marital
history (including common-law unions), educational history, employment history
and the birth and presence of children at home. It is used to analyse government
policies having an essentially longitudinal component and whose nature requires
evaluation at the individual or family level (such as post-secondary education costs
and benefits or public pension sustainability). It can also be used to explore a variety
of societal issues of a longitudinal nature, such as intergenerational equity or time
allocation over entire lifetimes.
LifePaths models individuals from their birth. When creating a new individual, place
of birth (in a specified province or territory, or outside Canada), sex and date of birth
are assigned. If they were born outside Canada, their province of entry to Canada
and age of immigration are also assigned. Given that the earliest birth occurs in 1872,
LifePaths can simulate a complete population aged 0 to 99 starting in 1971. By
default, LifePaths simulations closely approximate official estimates of the Canadian
population by age, sex and province in each year over the period 1971 to 1999, as
well as Statistics Canada’s medium population projection over the period 2000-2026.
The population represented does not include non-permanent residents, a large
number of which are foreign students.
Fertility
The historical series of births by sex and province that are used to initialize
simulations were derived from the 1911 and 1921 Censuses together with annual
birth registrations between 1921 and 1998, as well as from immigration records. For
the period prior to 1921, the 1921 Census population was ‘reverse survived’: the
number of births that would have had to occur to produce the observed number of
survivors in 1921 was determined using mortality probability estimates. For 1921 to
1971, historical population estimates by age, sex and province were used. Special
APPSIM. Modelling fertility and mortality
15
estimates for Newfoundland were derived using the 1911, 1921 and 1935 Censuses of
Newfoundland.
For the period 1972-1999, the official population estimates were used. These were
available by age, sex and province and consist of the following components: total
population, non-permanent residents, immigrants, emigrants, returning Canadians
and internal migrants. For the period 2000-2026 the Statistics Canada medium
population projection was used. Historical lifetime internal migration data was
based on decennial Census data from 1911 to 1971. Base probabilities of individuals
moving between each pair of provinces were derived using the Family AllowanceChild Tax Credit data (1972-1996). Estimates of family level migration probabilities
were obtained (using the base probabilities as benchmarks) from 1991 and 1996
Census data on place of residence one year prior to the Census.
Mortality
Mortality in the LifePaths model is determined by re-assessing the chance and timing
of death at each birthday. The chance of dying is based on the age-specific mortality
probability of Canadians sharing the sex and year of birth of the simulated
individual. The process of re-assessing mortality continues until death or until the
individual reaches the maximum allowed age of 115 years.
Individuals who are destined to immigrate to Canada are not exposed to a risk of
death until they arrive. This avoids simulating individuals who then die before
reaching Canada and who would, therefore, make no contribution to the simulation
reports. This practice gives recognition to a reality: the appropriate mortality risk for
this sub-population is unknown.
The historical cohort mortality data were derived from death registration statistics.
The future mortality of recent cohorts was projected using mortality assumptions
from the Statistics Canada medium population projection for the period 2000-2026.
2.5 DYNAMOD II (Australia)
DYNAMOD-2 is a dynamic microsimulation model of the Australian population that
is designed to project characteristics of the population over a period of up to 50
years. The model operates with a 1 per cent sample of the Australian population —
about 150 000 records in the starting population — and generates lifetime profiles of
people in terms of demographic events (fertility, mortality, couple formation and
dissolution, and overseas migration), education participation and attainment, labour
force activity and earnings.
16
APPSIM Working Paper no 5
DYNAMOD uses survival functions to model demographic events. These survival
functions operate with a period of one month (pseudo-continuous time, compared
with annual models).
Fertility
Fertility is modelled in DYNAMOD-2 through pregnancy and childbirth events for
women. Once the woman is aged 15 or a birth has occurred, the model predicts the
date of the next childbirth for a woman. Once the date of birth is known, a pregnancy
start date is determined. Once pregnancy begins, only the death of the mother can
stop a child being born. The model simulates only live births but it retains the
possibility of confinements resulting in multiple children (twins or triplets) being
born. Three distinct types of birth — premarital, first marital, and second and
subsequent marital births. The more general definition of marital status is used in the
model, where it refers to both legal (de jure) marriages and de facto relationships.
Thus, premarital births refer to births among women who have never been married or
entered into a de facto relationship. First marital birth refers to the first live birth after
entry into the first marital union, while second and subsequent marital births refer to all
births after the first. Survival functions are used to predict the time until occurrence
of pregnancy among women aged between 15 and 49. The estimation of the time
until childbirth for the three types of childbirth depends on educational
participation, educational qualifications, employment status and age of the women.
First and subsequent marital births are influenced by additional characteristics
including marital status, employment status of husband, duration of the marital
relationship, parity and duration since previous birth.
Survival functions were estimated from the “negotiating the life course” survey. It is
a panel conducted by the Australian National University which covers 2547 women
aged 20–59 years at the 1st wave and collects historical information on the women
and their spouses covering the period 1971–86.
Mortality
Mortality rates are influenced by four factors in DYNAMOD-2 — gender, age, year
of birth and disability status. Those for the disabled are derived from the residual
rate when the able-bodied rate is removed from the overall mortality rate. Mortality
rates for the population as a whole are targeted to conform to ABS projections, while
mortality rates for the able-bodied are derived from ABS mortality rates on death
due to external causes. Mortality rates for the disabled are calculated based on the
difference between these two rates. The model incorporates mortality improvements
during the forecast period. The assumption is that improvements in mortality for the
APPSIM. Modelling fertility and mortality
17
able-bodied are half those for the disabled. Improvements in overall mortality rates
in the model are based on ABS improvement projections.
2.6 MOSART (Norway)
The development of the first version of the MOSART model started in 1988. The first
version comprised demographic events, education and labour force participation.
The second version extended the model with public pension benefits and labour
market earnings and in the third version, the household formation and non-earnedincome, taxation, savings and wealth were added.
The starting population of MOSART is a 12% sample of the population in Norway in
1993. This initial population gathered information on marriage, birth histories,
educational level and activities, pension status and pension entitlements in the
National Insurance Scheme, rehabilitation schemes, special pension entitlements for
civil servants, wealth and household status. The latter includes to some degree
relations to spouses, parents and children. Most of the information is represented as
annual data back to 1985, and pension entitlements are represented as annual labour
market earnings back to 1967. The information is gathered from registers run by the
Directorate of Taxes, the National Insurance Administration and Statistics Norway.
Fertility
Births are simulated for women with fertility rates from 1989, depending on age,
number of children and age of the youngest child. Each time a birth occurs, a child is
added to the model population. In the simulation of historical data, fertility rates
from 1989 are adjusted proportionally each year to let the simulated number of births
be equal to the actual level each year. This adjustment is roughly the same as
adjusting the fertility rates against the periodic total fertility rate.
Mortality
Mortality depends on gender and age, and the baseline mortality rates from 1989 are
adjusted proportionally, corresponding with life expectancy at birth each year.
Furthermore, mortality is higher in the simulation for single, disabled and those with
a low education level. The covariates are roughly estimated from various sources of
mortality statistics, and included because these differences are of significance for
pension expenditure. Simultaneously the base line mortality is adjusted to ensure
that these covariates do not influence the average mortality by gender and age.
18
APPSIM Working Paper no 5
2.7 SVERIGE (Sweden)
The SVERIGE spatial microsimulation model is the work of the Spatial Modelling
Centre. It is based on the CORSIM model but has added time, geography and agentbased modelling.
Fertility
Fertility behaviour is influenced directly in the model by a number of individual and
family attributes generated in some other modules, such as employment and
earnings, education and marriage. The probability of having a baby is determined
every year for all females between age 15 and 44 inclusive. Two logistic regressions
are estimated separately for those who are married and those who are not.
The equations are developed on a sample of 100 723 women aged 15-44 from the
Swedish population in 1990. The covariates are age, education, income, labour force
participation and, for the unmarried women, whether living as part of a couple.
Mortality
The estimations of the equations are based on a sample containing 458 000 Swedes in
1990. These individuals were followed over a five-year period (1990-1995) for
occurrence of death.
Different equations have been estimated according to age. For persons under 25
years old, an exponential time trend equation of the period 1985-1995 was estimated
with three parameters α1, α2 and α3. α1+ α2 equals the 1985 mortality rate and α3 is
negative to take into account the decline over time of the mortality rates. The
estimation is done separately for males and females and according to 6 age-groups
(infant mortality, 1-4, 5-9, 10-14, 15-19 and 20-24).
For individuals over 25 years old, the probability of dying is estimated for a five-year
period and then converted to a one-year probability. Three equations corresponding
to three age-groups are determined — working age population (25-59 years old);
young old (those aged 60-74) and the older old (those aged 75 years and over). The
covariates are age, education level, and marital status for all age groups. For those of
working age, labour force participation and the level of income are also included.
2.8 SESIM (Sweden)
SESIM is a dynamic microsimulation model developed by the Swedish Ministry of
Finance in collaboration with researchers from different universities. The project
APPSIM. Modelling fertility and mortality
19
started in 1997, and SESIM’s first mission, an evaluation of the Swedish national
system of study allowances, took place in 1998. Since the year 2000 the focus has
shifted from education to pensions. To evaluate the financial sustainability of the
new Swedish pension system is a major purpose of SESIM. This new focus has also
implied that SESIM has been developed into a general microsimulation model that
can be used for a broad set of analyses.
A sample of some 100,000 Swedish citizens is used as a simulation base. The data is
sampled from the 1999 year wave of LINDA, a large longitudinal dataset. Additional
observations are sampled from the set of all individuals living abroad having
Swedish pension rights.
Fertility
Women aged between 18 and 49 years old and not living with their parent can have
a child. The probability of having a first child depends on age, marital status,
pensionable income (quartiles), indicator for market work, and highest education
level. For the parity up to 3 (no women has more than 4 children), different equations
were estimated using the same covariates, plus the duration since the previous birth.
Mortality
Three different set of equations have been estimated according to broad age groups
— those less than 30, the 30-64 year olds, and the 65 years and over. For the 0-29 year
olds, the probabilities are determined by age and sex. For the 30-64 year olds, some
variables related to labour force participation and income are added (indicator of
early retirement, quintile of pensionable income) as well as marital status; for those
aged 65+, the relevant variables are age, sex, marital status and highest level of
education.
3
Structure of Demographics in APPSIM
After reviewing the starting population chosen for APPSIM and the proposed order
of events in the model and in the demographic module, this section describes the
demographic module of APPSIM and how the different equations have been
estimated.
20
APPSIM Working Paper no 5
3.1 Starting population
After reviewing potential datasets, it has been decided that the starting population
for APPSIM will be based on the 1 per cent Census Sample, drawn from the 2001
Census (Kelly, 2007). It consists of a sample of 1 per cent of private dwellings, with
their associated family and person records, and a 1 per cent sample of persons from
all non-private dwellings together with a record from the non-private dwelling (ABS,
2001) — that is 75,451 dwellings, 79,320 families and 188,013 persons. The 1 per cent
census has the advantage of gathering information on both private and non-private
dwellings, and has a big sample size. The main drawback is that the number of
variables is limited and imputations must be performed to be able to run
simulations.
3.2 Order of the modules
During each period of time, each individual goes through all the different groups of
events. This includes the set of demographic events (including household formation),
education, labour force participation, earnings, housing, and financial events such
as unearned income , household expenditure, taxation, and finally welfare (with
an emphasis in APPSIM on health care and aged care).
Figure 1: The different modules in APPSIM
Health & Aged Care
New
Demographics
Year
Household Formation & Movement
Taxation
Social Security
Education & Training
Labour Force
Household Assets & Debt
Other Income & Expenditure
Earnings
Housing
APPSIM. Modelling fertility and mortality
21
As you can see, demographics are the first group of events. Because of the model
structure, that means that we assume that demographic events can affect the other
events that happen in the same year — but the later events, such as labour force
status and earnings, can affect demographic events only in the following year. For
example, this means that having a baby in year ‘t’ can affect the probability of labour
force participation in that year — but being in the labour force in year ‘t’ can only
affect the probability of having a baby in year ‘t+1’.
3.3 Order of the events in the demographic module
After an overview of the different events included in the demographic module, we
will focus on the data and estimates for fertility and mortality.
Figure 2: The events and their order in the demographic module
Start of
year
Population (1)
Population (5)
Immigration
Emigration
Population (2)
Population (4)
Mortality
Fertility
Population (3)
The proposed order of events in the fertility, mortality and migration modules is
summarised in Figure 2. In dynamic microsimulation models such as APPSIM, we
apply probabilities to determine whether each event will happen or not — for
example, every year we will test whether a person will stay alive or will die (or will
emigrate or have a child).
22
APPSIM Working Paper no 5
The probability that an event arises is estimated using the most detailed data
possible and whenever possible using longitudinal data — as this allows
determination of transitions between different states. For example, the probability of
divorce is more related to the number of years spent in marriage than to age.
The order of the variables is not neutral, as if we model first mortality, a person
selected to die cannot emigrate nor have a child later on. The order of the modules
must thus reflect a logical interaction pattern and the probabilities of each event
must be conditional probabilities taking into account the implication of the order.
In microsimulation models, each event is modelled in a more or less dependent way.
While projections which run using a cohort-component method use the average
population, and not the population at the beginning of the projection step for events
such as fertility, (thus being able to take into account the fact that more than one
event can arise during the year), in microsimulation, each event arises
independently. If death is the first event then, for example, any women selected to
die will not be able to give birth. We impose a causality in the model that has to be
taken into account in the estimate — at least when the probability that both events
arise during the same period of time is not negligible.
The order proposed here is first, immigration, then mortality, fertility and
emigration. With this order, we can take into account:
•
Immigrants are alive when they arrive but can die and have children as soon
as they arrive on Australian soil;
•
Mortality before fertility: this assumption reduces the number of orphans at
birth but, in a country like Australia, it is fortunately a quite rare event to
have the death of a woman in the year of the birth of her child. These two
events are estimated independently — that is, the risk of mortality is not
dependant upon whether a person will have a baby. Therefore, this is a less
risky assumption in many respects, as estimating births after mortality will
reduce orphanhood in the first year of life, but the error will not be great and
we assume this error to be better than the reverse. The other reason for using
this order is that the data used to determine fertility estimates are based on
surviving women.
•
Emigrants are alive at the moment of their departure but could have had a
child just prior to emigration.
This order presents a drawback related to the fertility and mortality of migrants. If
we assume like all projections that immigrants and emigrants stay on average half
the projection period in the country, death and fertility rates of migrants are half
those who are staying the whole year in the country. While it is possible to have a
variable distinguishing immigrants and applying half the rates to them, it is not
APPSIM. Modelling fertility and mortality
23
possible to do the same for emigrants because we don’t know where they are when
fertility and mortality occur. It can lead to a small increase in mortality and fertility.
It is possible to avoid this problem by using the following method1. Assume that half
the immigrants will arrive at the beginning of the period and half of them will arrive
at the end of the projection period. Those arrived at the beginning of the projection
period will face the whole year the events death and births while those arriving at
the end of the period will not face any event. The same way for emigrants, we
consider that half of them will leave the country at the beginning of the projection
period and therefore will not face any other events in the country. The other half of
the emigrants will leave at the end of the period and will face the other events during
the whole year.
This shortcut is possible with the assumption that they stay on average half the
period; instead of all the immigrants and emigrants facing the events with half the
probabilities because they stay half the year in the population; we will have half the
immigrants and emigrants with the full rate.
4
Mortality in APPSIM
4.1 Mortality
No data are available within Australia according to socio-economic level, disability
status, occupation foreign vs. Australian born etc at individual level - only at SLAs
(statistical local area).
The first version of mortality within the model is therefore very simple as it uses
mainly the probability of dying by age and sex, determined by ABS for their
population projections, medium scenario (ABS, 2004 and special communication).
We are still investigating how we can include some differentials in the model.
The ABS data are up to age 100+, but we wanted to increase the upper age-group,
given the likely increase in the number living past 100 years. Based on the work by
1 The authors would like to thank Laurent Toulemon for this very pertinent suggestion.
24
APPSIM Working Paper no 5
Thatcher-Kannisto2, we have mortality rates from 100 to 105+. We have used this
distribution by age to determine the distribution by age over 99 from 2005 to 2050
(Pennec and Bacon, 2007).
4.2 Calibration/Alignment
One of the issues that has received much attention during the past decade is the need
to align summed micro estimates to benchmark data sources that are regarded as
reliable within a country (such as official population projections) (Anderson, 2001).
This topic is more fully developed in another paper (Bacon and Pennec, 2007). We
note here the aggregates and benchmarks that we should be able to ‘hit’ with the
fertility estimates in the APPSIM model
Macro environment
Health &
Aged Care
New
Year
?
Demographics
Household
formation &
movement
Taxation
Social
Security
Education &
Training
Household
Assets &
Debt
Labour Force
Other income
&
expenditure
Earnings
Housing
V0.2 8 Feb 2006
The number of births ought to align with the aggregate number of births based on
fertility rates. In theory, it is desirable that the model outcomes also align with
external benchmarks by age of the parents, number of births by the mother (first
child, second child, etc), marital status of the parents, and aggregate number of
multiple births (sets of twins, etc).
2 We accessed to the Kannisto-Thatcher database on old age mortality (K-T database),
maintained by the Max Planck institute for demographic research.
APPSIM. Modelling fertility and mortality
25
The fertility rate should reflect historical rates where available and the projected
rates (general fertility rates and age-specific rates) should be a user-defined input
parameter.
Different sources of information will be used in order to check and calibrate our
estimates. Among these are the vital statistics, some surveys like the ABS survey on
family characteristics and the ABS population projections.
One proposal for this micro-macro linkage when demographics are concerned is to
link a cohort component model to the microsimulation model. Our idea is not only to
align the microsimulation results to the “official results” but to built a cohort
component model that take into account certain specificities of microsimulation like
(1) the events are determined according to the age at the beginning of the simulation
step that could not be equal to the age at the event; (2) rates are not applied to the
mean population but to the population as it is when this event is calculated.
This approach leads to a real consistency of the macro and micro levels; gives a better
consistency to the user when he/she wants to use some other scenario e.g even if the
user wants to increase only the fertility, it has effects on the number of deaths so this
method recalculates all events involved.
5
Fertility in APPSIM
5.1 Data on fertility
5.1.1 Vital statistics
Births and confinements are registered by different bodies — the different State
Registrars and the Australian Institute of Health and Welfare through the registers of
midwifes. These two series present discrepancies, as a time lag exists between
confinements and registration. The time difference differs from state to state and
according to year. In 2003, the Perinatal Data Collection reported the occurrence of
255,100 live births in Australia, 1.6% more than the 251,200 births registered in the
same year (ABS, 2004); (McDonald, 2005).
26
APPSIM Working Paper no 5
The contents of the birth registration forms differ by state but, nevertheless, most of
the variables are the same in all the states. One difference lies in the parity of the
child; in some states, the parity of the child within the overall number of children of
the woman and within the current relationship is available (Queensland, South
Australia, Western Australia and Tasmania) whereas, in the other states, only the
latter is available.
5.1.2 Specific surveys
•
The first possible source, one that was used for the DYNAMOD model, is the
ANU survey “Negotiating the life course”. It is a panel of 2 247 persons designed
to study the interaction between family life and labour force participation.
•
The second and more recent survey is the Household, Income, and Labour
Dynamics in Australia survey (HILDA). HILDA is a broad social and economic
survey. As it has a longitudinal design, most questions are repeated each year. In
addition, each year a special topic is covered — such as in wave 1 the family
background, in wave 2 the household wealth, and in wave 3 retirement and
plans for retirement. Private health insurance and youth are covered in wave 4.
The panel began in 2001 with a national sample of Australian households
occupying private dwellings of 6,872 households and 13,969 individuals.
Members of the original survey in 2001 have been traced and interviewed
annually, along with members of their new households.
5.1.3 Census
The quinquennial census is another useful source of data for fertility analysis. For
some years, some specific questions on the total number of children ever born were
added to the usual question of the dependent children living in the household. Some
interactions of demographic variables with housing, economic, and labour force
characteristics can be studied using the census data. A major advantage of the
census, like vital statistics, is that it is an exhaustive source — but the number of
variables is limited, as is the level of detail for each topic.
Vital statistics and census data will be used mainly to benchmark the APPSIM
estimates, as our estimates need to rely more on micro-level data able to report
interactions between variables. To determine the fertility estimates, we chose to use
the HILDA panel data. The reason for this choice is that it contains both longitudinal
historic data and cross-sectional data and its sample size is bigger than the ANU
survey. One set of benchmarks to check the summed micro fertility estimates against
is the ABS population projections.
APPSIM. Modelling fertility and mortality
27
5.1.4 HILDA
For each responding person, for each year, a set of questions are asked about the
children currently living in the household and those not living in the household —
and another set are asked to elicit the likelihood of more children arriving in the near
future. According to the place of residence of the child, questions on the involvement
of the parent not living with the child, if applicable, are registered; child care and
financial aspects are major topics investigated with this questionnaire. A copy of the
fertility section of the HILDA survey is attached in Appendix A.
Fertility can be studied according to different approaches: a retrospective
longitudinal approach (2) that looks at the past fertility of the different women and a
cross-sectional approach (1) that looks at fertility at a point of time.
Figure 3: Different approaches which could be used to analyse fertility with the
HILDA data
Age
o
n
n
n
n
w ave 1 w ave 2 w a ve3 w a ve4
Year
With the retrospective approach, for each woman, we can reconstitute her fertility
history and, by aggregating the individual records by birth cohort, we can see the
trends that affected the fertility level of earlier decades and can have a better insight
into the future (as we can compare the behaviour of different cohorts at the same age
and deduce from that what is likely to happen in the future).
On the other hand, with the cross-sectional approach, we have the current fertility
patterns — and the design of HILDA is such that the key questions about economic
characteristics and labour force participation are only available at the date of the
interview, not for many years earlier as with the retrospective perspective.
28
APPSIM Working Paper no 5
5.2 Fertility in HILDA
As already mentioned, it is important that the projections performed with APPSIM
give coherent results, at both cross-sectional and longitudinal levels. Our estimates
must be as close as possible each year to the right total fertility rate, so that we will
have the right number of births — but we also have to pay attention to the completed
total fertility rate (that is, the fertility at birth cohort level). These two figures are both
averages — but another important aspect of fertility modelling at an individual level
is to obtain a good distribution of the number of children per woman. One aim of
APPSIM is to study ageing, and in ageing we can also include disability and need of
care, formal or informal. We know that informal care is provided mainly by the
spouse and the children, so it is crucial to have a good representation of the
distribution of children and, in particular, childless persons. Childless persons and
those with no disability-free spouse or children rely almost entirely on formal care.
The timing of childbearing can also be of importance, for labour force participation
of women, for education cost, and for caring when old age of the parents requires it.
5.3 Fertility estimates
5.3.1 Fertility estimation processes
To estimate fertility, we follow an approach similar to the SAGE model as we have a
similar dataset (Scott, 2003).
From its fertility module, HILDA allows us to use variables from both retrospective
variables and current status, or status last year, as predictors of whether the woman
will have given birth by the time of the next annual interview. HILDA combines both
a retrospective approach at first interview and a panel approach.
From HILDA, a person-year file was created using records from 2001 to 2004. To
generate a sufficient sample size these pooled records were then treated as
independent. Table 3 gives the number of births registered in each wave. The eligible
sample was restricted to records for women aged between age 15 and 49. Women
who had a child that died have been dropped from the analysis because as the date
of birth of this child is unknown and therefore some of the intervals between births
APPSIM. Modelling fertility and mortality
29
are incorrect and these cannot be identified3. Cross sectional weights provided with
the HILDA data were applied to the pooled records to adjust for unequal
probabilities of response.
3 This limit will be able to be removed in the future, as in a future wave of HILDA, questions
on children who have died will be asked.
30
APPSIM Working Paper no 5
Table 3: Number of births by parity in the different waves of HILDA.
Number of 1st
Number of 2nd
Number of 3rd
Number of 4th
births
births
births
and
All births
subsequent
births
Wave1
88936 (92)
90262 (85)
38902(45)
19493(28)
237593(250)
Wave2
88194(68)
86100(76)
30499(34)
20199(25)
224992(203)
Wave3
134581(95)
110548(86)
34482(34)
24817(18)
304428(233)
Wave4
107136(98)
79453(61)
36477(31)
25716(21)
248782(211)
All 4 waves
418847
366363
140361
90225
Note: The figures within parentheses are the unweighted figures.
Due to the retrospective recollection of the data, births are registered after they occur,
whatever the approach chosen. This is obvious for retrospective longitudinal
approaches as we reconstitute the fertility biography of the women interviewed in
2001 for instance, but it is also the case even for the panel and cross-sectional
approaches. In wave x, the births we observed are those that have occurred since the
previous wave — e.g. the children aged 0 in wave 3 are the babies born between the
30th June 2002 (wave 2) and the 30th June 2003 (wave 3)4. Therefore, to determine the
probability of having a child, we need to use the situation of the woman at the 30th
June previous to her childbearing — i.e. for the mother of the baby born between
wave 2 and wave 3, we have to use the value of the variables at wave 2. This means
that, as illustrated in the Figure below, for the babies born in wave 1 (2001), we have
to reconstitute the variables as if there will be a wave “0”5.
The first version of the fertility estimates developed here are mainly demography
based — that is, mainly demographic variables are used along with quite simple
4 Some corrections to the file have been performed in order to create the age and number of
children at the 30th June previous to the interview, rather than at the date of the interview.
5 For the first cut of the estimates, we assume that the value of variables education, labour
force status and marital status remain the same in wave 0 as their values were in wave 1.
This is a strong assumption, in particular for young persons, but it will be corrected in the
future.
APPSIM. Modelling fertility and mortality
31
education and labour force status variables. A future model could include more
detailed demographic variables (such as duration since union) and some socioeconomic context variables (like education and labour force status of the spouse,
level of income, effect of being in a new relationship, duration of cohabitation with
partner, etc). This first version will also rely only on a cross-sectional approach, using
limited duration variables. A second version could try to capture more of the longterm trends that cannot be replicated with a 4 year long panel.
One constraint for the first version is to use in the estimation variables that are
available in the 1 per cent household census sample, or relatively easy to impute.
(The 1 per cent census sample is the core dataset chosen as the starting population of
the microsimulation model APPSIM.)
To produce the probabilities of having a child during the following year, a set a
logistic regressions have been performed. These regressions have been used to check
the significance of the variables and to determine the parameters of the different
covariates. A different equation is estimated by parity — and, for the probabilities of
having a first or a second baby, a second level of stratification of whether the women
is living with a partner or not has been introduced. Table 4 gives the different
variables and covariates chosen and tables 5 - 10 give the results of the logistic
regressions. The variables included in the estimation are demographic variables,
education variables and labour force variables.
32
APPSIM Working Paper no 5
Table 4: Variables and categories used in the regression models
Name of the variable
Categories
Age
Single year of birth
Migrant status
Born in Australia (0)
Born overseas (1)
Current marital status
Married (1)
De facto (2)
Separated, Divorced, Widow(ers) (3)
Never married (4)
Couple
Not living with a partner (0)
Living with a partner (1)
Education level
High (post graduate-master, doctorate;
grad. diploma, grad certificate; adv
diploma, diploma; cert III or IV) (1)
Low (cert I or II; cert not defined; year
12; year 11 or below; undetermined) (2)
Labour force participation status
Full-time (1)
Part-time (2)
Not working (3)
Full-time student (4)
Duration since last birth (is the
equivalent of the age of the youngest
child)
Years
Age of the mother at 1st birth
15-24 years old (1)
25-34 years old (2)
35 years old and over (3)
APPSIM. Modelling fertility and mortality
33
5.3.2 Results of the fertility estimation
Table 5: Coefficients of the logistic regression of giving birth to the first child
for childless women not living with a partner
Name of the
variable
Categories
Intercept
Age
Single year of birth
Age2
Parameter
estimate
Level of
significance
-6.7438
***
0.1384
***
-0.00293
***
Migrant status
Born in Australia
Born overseas
0.7998
***
Current marital
status
Sep., Div., Wid.
Never married
0.3547
***
Education level
High
Low
-0.1862
***
Labour force
participation
status
Full-time
Part-time
Not working
Full-time student
-0.4925
0.7315
1.9437
***
***
***
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
Table 6: Coefficients of the logistic regression of giving birth to a first child for
women living with a partner.
34
Name of the
variable
APPSIM Working Paper no 5
Categories
Intercept
Age
Single year of birth
Age2
Parameter
estimate
Level of
significance
-10.7588
***
0.6681
***
-0.0119
***
-0.1576
***
0.6062
***
Migrant status
Born in Australia
Born overseas
Current marital
status
Married
De facto
Education level
High
Low
-0.0925
***
Labour force
participation
status
Full-time
Part-time
Not working
Full-time student
-0.5542
-0.3247
1.6710
***
***
***
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
APPSIM. Modelling fertility and mortality
35
Table 7: Coefficients of the logistic regression of giving birth to a
second child for women having one child not living with a partner
Name of the
variable
Categories
Intercept
Age
Single year of birth
Age2
Migrant status
Born in Australia
Born overseas
Current marital
status
Sep., Div., Wid.
Never married
Education level
High
Low
Labour force
participation
status
Full-time
Part-time
Not working
Full-time student
Duration since
last birth)
Years
Parameter
estimate
Level of
significance
-3.4698
***
0.1325
***
-0.00393
***
0.2362
***
-0.0509.
***
0.0157
*
-1.008
-0.5016
0.8455
***
***
***
0.0569
***
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
36
APPSIM Working Paper no 5
Table 8: Coefficients of the logistic regression of giving birth to a second child
for women having one child and living with a partner
Name of the
variable
Categories
Intercept
Age
Single year of birth
Age2
Parameter
estimate
Level of
significance
-9.8651
***
0.5310
***
-0.00904
***
Migrant status
Born in Australia
Born overseas
0.1230
***
Current marital
status
Married
De facto
0.3015
***
Education level
High
Low
0.1142
***
Labour force
participation
status
Full-time
Part-time
Not working
Full-time student
-0.0247
-0.6572
0.6814
***
***
***
Duration since
last birth)
Years
-0.0180
***
Age of the
mother at 1st
birth
15-24 years old
25-34 years old
35 years old and over
0.4307
0.0959
***
***
APPSIM. Modelling fertility and mortality
37
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
Table 9: Coefficients of the logistic regression of giving birth to a third child
for women already having two children
Name of the
variable
Categories
Intercept
Age
Single year of birth
Age2
Parameter
estimate
Level of
significance
-12.0809
***
0.6435
***
-0.0119
***
Migrant status
Born in Australia
Born overseas
0.2748
***
Current marital
status
Married
De facto
S
Di
0.2415
0.3756
0 5683
***
***
***
0.0279
***
-0.2703
0.3276
0 4466
***
***
***
Education level
High
Low
Labour force
participation
Full-time
Part-time
N
ki
Duration since
last birth)
Years
Wid
0.1410
***
Age of the
15-24 years old
-0.2804
st
mother at 1
25-34 years old
0.0313
35
ld d
bi h
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
38
APPSIM Working Paper no 5
Table 10: Coefficients of the logistic regression of giving birth to an additional
child for women having three or more children
Name of the
variable
Categories
Intercept
Age
Single year of birth
Age2
Migrant status
Born in Australia
Born overseas
Current marital
status
Married
De facto
Sep., Div., Wid.
Never married
Education level
High
Low
Labour force
participation
status
Full-time
Part-time
Not working
Full-time student
Duration since
last birth)
Years
Parameter
estimate
Level of
significance
-4.2654
***
0.2043
***
-0.00581
***
0.7998
***
-0.5091
0.4869
0.0745
***
***
***
0.3186
***
-0.3506
-0.0344
1.1121
***
***
***
0.0571
Note: (1) *** for <0.0001; ** for <0.001; * for <0.01; the reference categories are in bold
APPSIM. Modelling fertility and mortality
39
5.4 Calibration/Alignment
This topic is more fully developed in another paper (Bacon, 2007). We note again the
aggregates and benchmarks that we should be able to ‘hit’ with the fertility estimates
in the APPSIM model
The number of births ought to align with the aggregate number of births based on
fertility rates. In theory, it is desirable that the model outcomes also align with
external benchmarks by age of the parents, number of births by the mother (first
child, second child, etc), marital status of the parents, and aggregate number of
multiple births (sets of twins, etc).
The fertility rate should reflect historical rates where available and the projected
rates (general fertility rates and age-specific rates) should be a user-defined input
parameter.
Different sources of information will be used in order to check and calibrate our
estimates. Among these are the vital statistics, from surveys such as the ABS survey
on family characteristics and the ABS population projections.
5.5 Implementation in APPSIM
Each women aged 15 to 49, will enter in the fertility loop. Her probability of having a
child each year is related to a set of covariates — the number of children she has ever
had (parity), her age, education level, labour force participation status, migration
status, and for parity 2 and 3, her age at the first birth.
This implementation can be drawn as follows:
Once a women is set to have a birth within the year, some characteristics of the
outcome of the confinement have to determined, such as whether the women gives
birth to one or more children. Next, some characteristics of the newborn have to be
determined; their birth cohort is deduced from the year the birth arises, but the sex
of the newborn needs to be determined and also whether the newborn will survive
to his or her first birthday.
Multiple births
40
APPSIM Working Paper no 5
Once a woman has been selected to give birth in the coming year, we will determine
if this confinement leads to multiple births or not. The probability of a woman giving
birth to more than one baby is based on the average rate of multiple births observed
in Australian in 2000-2005. The maximum number of multiple live births that can be
allocated is triplets. As in other countries, women aged 30 and over are more likely
to have multiple births than younger women. This is related to the ageing of
childbearing and the more frequent use of medically procreative fecundation at these
ages. On average 1.7 per cent of confinements lead to more than one child. Women
aged 30 and over are 1.5 times more likely to have twins or triplets than younger
women. In the model, for those having more than one child, 97.9 per cent will have
twins and 2.1 per cent triplets.
Figure 4: Model of child bearing
Women aged 15-49
how many children ever had ?
0
1
2
birth?
birth?
birth?
age, parity, marital status,
labour force participation
status, migration status,
education level
age, parity, marital status, labour
force participation status,
migration status, education level,
duration since previous birth, age
mother at 1st birth
age, parity, marital status, labour
force participation status,
migration status, education level,
duration since previous birth, age
mother at 1st birth
no
yes
no
3
birth?
age, parity, marital status,
labour force participation
status, migration status,
education level, duration
since previous birth
no
yes
yes
Single or multiple birth
in the confinement?
age
one
multiple
twin or triplets?
sex of the
newborn
Infant
mortality
age, sex
no
yes
APPSIM. Modelling fertility and mortality
41
Sex of the infant
The allocation of the sex of a child follows the sex ratio at birth in Australia of around
0.512 (105 boys for 100 girls) male to female babies. This ratio will be replicated in
APPSIM, with this choice being influenced by the fact that APPSIM is a national
model which doesn’t incorporate the State level. (If State and Territory were
included within APPSIM, then the sex ratio should be state-related, as there are
differences according to states in the sex ratio.)
Infant mortality
The probability of death between the birth and the end of the projection period (here
the financial year) is given in the projected ABS population projection by sex. The
death rate is 4.92 per cent for boys and 4.59 per cent for girls in 2004.
6. Conclusion
As with any module in APPSIM, this demographic module is under continuing
development. This document gives the first cut of the demographic module. As it is
the first to be built it will be fine tuned and re-estimated as new modules are
developed and as new data are available. In particular, we will use the new waves of
the HILDA panel survey to reassess fertility estimates.
42
APPSIM Working Paper no 5
References
Abello Annie, Kelly Simon, King Anthony. 2002. Demographic projections with
Dynamod-2. Canberra, NATSEM, University of Canberra, 21, 50 p.
ABS. 2004. Births Australia. Cat. 3301.0 www.abs.gov.au
ABS. 2004. Population projection Australia. Cat. 3222.0 www.abs.gov.au
Bacon Bruce, Pennec Sophie .2007. Calibration and alignment in APPSIM, mimeo,
NATSEM.
Duée Michel, Rebillard Cyril, Pennec Sophie. 2005. Les personnes dépendantes en
France: Evolution et Prise en charge. IUSSP website (www.iussp.org)
International Union for the scientific study of the population. XXV
International Population conference., Tours (France), 18-24 juillet 2005.
Duée Michel. 2005. La modélisation des comportements démographiques dans le modèle de
microsimulation DESTINIE. Paris, G 2005 / 15, 47 p.
Favreault Melissa, Smith Karen. 2004a. A Primer on the Dynamic Simulation of Income
Model (DYNASIM3). The Urban Institute.
Favreault Melissa, Smith Karen. 2004b. A Primer of the Dynamic Simulation of Income
Model (DYNASIM3). The Urban Institute, 22 p.
Flood Lennart, Jansson Fredrik, Pettersson Thomas, Sundberg Olle, Westerberg
Anna. 2005. "SESIM III - a Swedish dynamic microsimulation model." Sweden:
Ministry of Finance.
Fredriksen Dennis. 2003. The MOSART model - a short technical documentation. In:
Norway Statistics Paper presented at the International Microsimulation Conference
on Population, Ageing and Health: Modelling our Future, 7-12 December. Canberra.
Hammel Eugene A., Wachter Kenneth W., McDaniel C. K. 1981. The kin of the aged in
2000 A.D. In: Morgan James, Oppenheimer Valerie, Kiesler Sara New York,
Academic Press, vol. 2 -Social Change, p. 11-39.
Hammel Eugene A, Mason Carl, Wachter Kenneth W. 1990. SOCSIM II. A
sociodemographic Microsimulation Program Rev. 1.0. Operating manual. 29, p 76(5)
Harding Ann. 1993. Lifetime Income Distribution and Redistribution: Applications of
Microsimulation Model. Amsterdam, North-Holland.
Harding Ann , Gupta Anil (eds). 2007. Modelling our future: Population Ageing, Social
Security and Taxation. Amsterdam, North-Holland.
Holm Einar, Holme Kirsten, Makila kalle, Mattson-kaupi Mona, Mortvik gunnel. The
Sverige microsimulation model - content, validation and example applications.
Kulturgeografiska Institutionen/SMC - university of Umea, 54 p.
APPSIM. Modelling fertility and mortality
43
Hyrenius Hannes, Adolfsson Ingemar. 1964. A fertility simulation model. Göteborg,
Distributor: Almqvist & Wiksell, 31 p. p.
Hyrenius Hannes, Adolfsson Ingemar, Holmberg Ingvar. 1966. Demographic models.
Göteborg, v. p.
Imhoff Evert Van, Post Wendy. 1997. Methodes de micro-simulation pour des
projections de population. Population (French Edition), 52 (4, Nouvelles approaches
methodologiques en sciences sociales), p. 889-932.
Kannisto, Väinö Development of Oldest-Old Mortality, 1950-1990: Evidence from 28
Developed Countries. Odense University Press, Odense, 1994; ISBN: 87 7838 015 4.
Keefe Janice, Légaré Jacques, Carrière Yves. 2004. Projecting the future availability of
informal support and assessing its impact on home care services, Part I: Demographic
projections. Health Canada. Halifax: Mount Saint Vincent University.
Le Bras Hervé. 1973. Parents, grand-parents, bisaïeux. Population, 28 (1), p. 9-38.
Léridon Henri. 1977. Human fertility: the basic components. Chicago, University of
Chicago Press, 202 p.
Léridon Henri. 2004. Can ART compensate for the natural decline in fertility with
age? A model assessment. Human Reproduction 19(7): 1548-1553, July 2004.
McDonald Peter. 2005. Has the fertility rate stopped falling. People and Place, 13 (3),
p. 1-5.
O' Donoghue Cathal. 2001. Dynamic Microsimulation: A Methodological Survey.
Brazilian journal of economics, 4, p. c 77.
Orcutt Guy H. 1961. Microanalysis of socioeconomic systems; a simulation study. New
York, Harper, xviii, 425 p. p.
Pennec Sophie. 1997. Four-Generation Families in France. Population an English
Selection, 9, p. 75-100.
Scott Anne. 2003. Implementation of demographic transitions in the SAGE Model. London,
5, 54 p.
Sheps Mindel C., Menken Jane A., Radick Annette P. 1973. Mathematical models of
conception and birth. Chicago, University of Chicago Press, xxiii, 428 p. p.
Smith James D. 1987. The computer simulation of kin sets and kin counts. In:
Bongaarts John, Burch Thomas K., Wachter Kenneth W. Family Demography:
Methods and their applications. Oxford, Clarendon Press.
Spielauer Martin, Vencatasawmy Coomaren P. 2003. FAMSIM: dynamic
microsimulation of life course interactions between education, work,
partnership formation and birth in Austria, Belgium, Italy, Spain and Sweden.
In: Vienna yearbook of population research. p. 143-164.
Statistics Canada. Canada Statistics. 2006. The Lifepaths Microsimulation Model: An
Overview. Dernière modification.
44
APPSIM Working Paper no 5
Ruggles Steven. 1987. Prolonged connections: The rise of the extended family in nineteenthcentury England and America. Madison; London, The University of Wisconsin
Press, 283 p.
Tomassini Cecilia, Wolf Douglas A. 2000. Shrinking Kin Networks in Italy Due to
Sustained Low Fertility. European Journal of Population, 16 (4), p. 353-372.
APPSIM. Modelling fertility and mortality
Appendix A: Fertility Questions in HILDA
45
46
APPSIM Working Paper no 5
APPSIM. Modelling fertility and mortality
47
48
APPSIM Working Paper no 5
APPSIM. Modelling fertility and mortality
49
50
APPSIM Working Paper no 5
APPSIM. Modelling fertility and mortality
51