National Centre for Social and Economic Modelling
University of Canberra

APPSIM - Modelling fertility and mortality

Sophie Pennec and Bruce Bacon

APPSIM Working Paper No. 7
December 2007

About NATSEM

The National Centre for Social and Economic Modelling was established on 1 January 1993, and supports its activities through research grants, commissioned research and longer term contracts for model maintenance and development with the federal departments of Family and Community Services, Employment and Workplace Relations, Treasury, and Education, Science and Training.

NATSEM aims to be a key contributor to social and economic policy debate and analysis by developing models of the highest quality, undertaking independent and impartial research, and supplying valued consultancy services.

Policy changes often have to be made without sufficient information about either the current environment or the consequences of change. NATSEM specialises in analysing data and producing models so that decision makers have the best possible quantitative information on which to base their decisions.

NATSEM has an international reputation as a centre of excellence for analysing microdata and constructing microsimulation models. Such data and models commence with the records of real (but unidentifiable) Australians. Analysis typically begins by looking at either the characteristics or the impact of a policy change on an individual household, building up to the bigger picture by looking at many individual cases through the use of large datasets.

It must be emphasised that NATSEM does not have views on policy. All opinions are the authors’ own and are not necessarily shared by NATSEM.
Director: Ann Harding

© NATSEM, University of Canberra 2007

National Centre for Social and Economic Modelling
University of Canberra ACT 2601 Australia
170 Haydon Drive Bruce ACT 2617
Phone + 61 2 6201 2780
Fax + 61 2 6201 2751
Email [email protected]
Website www.natsem.canberra.edu.au

Abstract

This paper first reviews the latest international practice when modelling fertility and mortality in dynamic microsimulation models. The next section presents a proposed order for the modelling of demographic processes in APPSIM, Australia’s new dynamic microsimulation model. The final section describes the first set of fertility estimates proposed for use in APPSIM, with these estimates having been calculated from the HILDA panel dataset.

Author notes

Dr Sophie Pennec is a demographer and researcher at the National Institute of Demographic Studies (INED, Paris, France) and visiting researcher at the National Centre for Social and Economic Modelling (NATSEM). Professor Bruce Bacon is an Adjunct Professor of the University of Canberra at NATSEM.

Acknowledgments

The authors would like to acknowledge comment and input from Ann Harding, Mandy Yap and Simon Kelly. They would also like to gratefully acknowledge the funding and support provided by the Australian Research Council (under grant LP0562493), and by the 13 research partners to the grant: Treasury; Communications, Information Technology and the Arts; Employment and Workplace Relations; Health and Ageing; Education, Science and Training; Finance and Administration; Families, Community Services and Indigenous Affairs; Industry, Tourism and Resources; Immigration and Citizenship; Prime Minister and Cabinet; the Productivity Commission; Centrelink; and the Australian Bureau of Statistics.

This paper uses the unconfidentialised unit record file from the Household, Income and Labour Dynamics in Australia (HILDA) survey.
The HILDA project was initiated and is funded by the Commonwealth Department of Families, Community Services and Indigenous Affairs (FaCSIA) and is managed by the Melbourne Institute of Applied Economic and Social Research (MIAESR). The findings and views reported in this paper, however, are those of the authors and should not be attributed to either FaCSIA or the MIAESR.

Contents

Abstract
Author notes
Acknowledgments
1 Introduction
2 Demographics in dynamic microsimulation models
  2.1 Destinie (France)
  2.2 DYNASIM III (US)
  2.3 SAGE (UK)
  2.4 LifePaths (Canada)
  2.5 DYNAMOD II (Australia)
  2.6 MOSART (Norway)
  2.7 SVERIGE (Sweden)
  2.8 SESIM (Sweden)
3 Structure of Demographics in APPSIM
  3.1 Starting population
  3.2 Order of the modules
4 Mortality in APPSIM
  4.1 Mortality
  4.2 Calibration/Alignment
5 Fertility in APPSIM
  5.1 Data on fertility
  5.2 Fertility in HILDA
  5.3 Fertility estimates
  5.4 Calibration/Alignment
  5.5 Implementation in APPSIM
References

APPSIM. Modelling fertility and mortality.

1 Introduction

This paper has been prepared as one of a series associated with the development of the Australian Population and Policy Simulation Model (APPSIM). The APPSIM dynamic population microsimulation model is being developed as part of an Australian Research Council (ARC) Linkage grant (LP0562493), and will be used by Commonwealth Government policy makers and other analysts to assess the social and fiscal policy implications of Australia’s ageing population. The particular focus of this paper is how to most effectively model fertility and mortality within APPSIM.

Fertility has been a major topic of interest for demographers for some decades. One reason is that the baby boom that occurred after WWII increased the population from the end of the 1940s to the beginning of the 1960s. This bulge in the population was not regarded as a problem until recently.
However, in the near future, this large population cohort will reach retirement age, resulting in slower economic growth and increasing the burden of caring for its members (in terms of income through retirement pensions or in terms of health and the need for formal care and carers). As a result of these pressures, many have questioned the sustainability of current welfare schemes in some developed countries (e.g. see the studies in Harding and Gupta, 2007). The baby boomers are also responsible for initiating the ‘baby bust’, that is, the low fertility period experienced during the past few decades.

Today, with the legalisation of, and wide access to, contraception, the number of children a person has mainly reflects their choice. Of course, for some people this choice is constrained by biological factors (sterility), by social factors (no partner during reproductive life) or even by economic ones. This choice results from the interaction of a range of different variables, such as family context, number of siblings, socio-economic status of parents that affects the education level of children, partnership status, and the socio-economic status of the couple.

When modelling fertility, one must pay attention not only to the number of children (quantum of fertility) but also to the time schedule of fertility (tempo). The effect of fertility upon such factors as careers and loss of wages (between a woman who stops working to raise her children and a woman who remains in the labour force) can differ greatly. Some parents will move to or choose a part-time job. A good representation of the spacing between births is also essential, particularly as some policy entitlements depend on the number of children under a certain age, and when studying kinship and the likely number of children available to care for elderly parents. A 65 year old retiree who is not disabled can more easily care for his or her disabled parents than a 55 year old person still in the labour force, or an 80 year old person who is also disabled.

APPSIM Working paper no. 5

2 Demographics in dynamic microsimulation models

Demographics are very often the backbone of a dynamic microsimulation model. They are also usually the first step of the simulation. Microsimulation models rely on individuals, so if the number of persons and their characteristics are not well estimated, not only will the results for population aggregates be inaccurate, but the results for income, social transfers, and the effects of policy will not be reliable either. Even in models that are not population-focused, this module is important.

Microsimulation modelling has been used for a long time by demographers to investigate fertility. Since its definition by Guy Orcutt (Orcutt, 1961), whose first model included an extensive and detailed demographic module, a number of biological and demographic models have been built in the USA (Sheps et al., 1973), in Sweden (Hyrenius and Adolfsson, 1964; Hyrenius et al., 1966), and in France (Léridon, 1977). The biological fertility models are women-based and reconstitute the monthly fecundity and fertility processes during the reproductive life. Recently, Léridon (2004) used such a model to investigate the effect of the postponement of childbearing on the total number of children and, in particular, on the number of children that people will not have because age-acquired sterility prevents them from having the number of children they would like, together with its effect on the total fertility rate.

The demographic models (Hyrenius et al., 1966), Schofield (Smith, 1987) and, more recently, the Family and Fertility Surveys (Spielauer and Vencatasawmy, 2003) also investigate fertility, but with different data: they use age- and parity-specific fertility rates, instead of using conception rates and then following the outcome of each conception.
They investigate the fertility process from both the quantum and the tempo point of view.

Historical demography has also heavily used microsimulation methods to reconstitute the population of the past (Hammel et al., 1990; Smith, 1987; Ruggles, 1987), to investigate demographic changes over time, to try to understand how to reconcile macro and microdata, and to understand the dynamics underlying the results of two datasets of the same population.

A related use, and often one of the main interests of demographers using microsimulation, is that it allows the study of kinship. Whether they live in the same household or not, ties between members of a family can be kept and the size and characteristics of families can be investigated (Hammel et al., 1981; Imhoff and Post, 1997; Le Bras, 1973; Pennec, 1997).

With the ageing population and the potential threat to the care of the disabled elderly, microsimulation is well suited to estimating the future population of potential carers. As the ties between the different members of the family are kept, not only the number of carers can be computed but also their characteristics (age, sex, being in the labour force or not, being retired or not, being disabled themselves or not) (Keefe et al., 2004; Duée et al., 2005; Tomassini and Wolf, 2000).

This is why many models use a set of covariates to try to create a better distribution and include some time and birth spacing variables. This is, of course, critical in the case of the demographic models, as it is their aim to mimic fertility both in quantum and in tempo. But this is often not the only goal: more policy oriented models often also try to have a good representation of fertility and of family structures.

The following section of this paper outlines how demographics have been modelled in different dynamic microsimulation models.
For some, demography is only a means to achieve a plausible total number and, maybe, the age and sex structure of the population. In that case, the modelling is quite simple. But most models include quite detailed demographic modelling, because they also need to track the characteristics of the population as a whole and, very often, household and family characteristics. For example, the family is often a decision-making unit in terms of economic behaviour and such expenses as housing. Welfare benefits, taxation schemes and some other variables are also often family-based.

Tables 1 and 2 give some examples of how births and deaths have been modelled in models like APPSIM around the world. They show the different modelling approaches and the covariates that the modellers have chosen. This is a good starting point when investigating approaches for new modelling, as some regularities in the covariates can be seen.

In summary, Table 1 suggests the modelling of fertility is mainly based on demographic data, such as age of the mother, marital status, duration since beginning of the union/marriage, number of previous children, and duration since previous birth. But, whenever possible, data on socio-economic status are also included, either by education level or by income level. A third set of variables is more dedicated to the modelling of the individual’s current situation, such as being in full-time education, being in the labour force, or having a very young child.

Where fertility is concerned, different benchmarks must be used: some period ones (like total number of births and total fertility rates) and some cohort ones (like the distribution of women by number of children and completed fertility rates). Not only the total number of children and their distribution, but also the age at childbearing and the duration between two births are of importance and must be included in the modelling (as they can affect the labour force participation of the mother, or change the receipt of benefits that vary according to the number or age of children, or are family related). Childlessness is another important phenomenon that has to be taken into account, as it has a strong impact on some of the topics related to ageing.

Mortality is a one-time event and is therefore easier to model than fertility. Age and sex mortality rates are the basic indicators used in the models and, whenever possible, these rates also include health, family and socio-economic variables. Health variables are mainly related to disability, while socio-economic status can include higher education level, being in the labour force, or occupation (and, for those retired, previous occupation). Some models use survival analysis, but most of them use transition rates modelled by regression techniques (hazard functions or logistic regression equations).

We will not go further in the comparison of models here, as our aim in gathering the covariates used in the different models is not to compare the models. Obviously, the modelling of each event and the level of detail used depend, on the one hand, on the aim of the model; some objectives and policies may be more or less related to demographic behaviour. On the other hand, the availability of data severely constrains modellers, to different degrees in different countries. As Harding points out: ‘dynamic models are only as good as the data upon which they are based’ (Harding 1993, p. 29).

Table 1: Fertility covariates in selected dynamic microsimulation models

Model | Births
Camsim | set of parity progression ratios, time between marriage and first birth. Only married women can bear a first child but the birth interval distribution from marriage to first birth permits pre-marital conceptions (Smith, 1987)
CORSIM | Age, birth(t-1), birth(t-2), duration of current marriage, earnings, family income, homeowner status, marital status, parity, schooling status, work status (F/T, P/T), have child, marital status, race, work status
Demogen | age, marital status, parity
Destinie | parity 1: duration since end of education, age, education; parity 2 (if same union as parity 1): duration since previous birth, education, age; parity 3 and over (if same union as parity 2): duration since previous birth, education, age, parity; parity 2+ (if not same union as previous birth): duration since new union, age, parity (Duée, 2005)
DYNACAN | Age, marital status, education level, employment status, parity
Dynamite | age, marital status, parity
DYNAMOD | models pregnancy and the outcome of the pregnancy (still born, live birth). 3 distinct kinds of births: premarital, 1st marital and subsequent marital (marital = de jure and de facto); equations based on women's education participation, educational qualifications, employment status, age. Marital births are also based on marital status, employment status of the husband, duration of the relationship, parity, duration since last birth (Abello et al., 2002)
DYNASIM III | Seven equation parity progression model; varies based on marital status; predictors include age, marriage duration, time since last birth; uses vital rates after age 39 (age-race-parity-specific probabilities); sex of newborn is assigned by race (Favreault and Smith, 2004b)
Famsim | Age, marital status, duration in marital status, in education, duration of schooling, in work, duration work, trend, duration since last birth (birth parity 2+), parity (birth parity 4+) (Spielauer and Vencatasawmy, 2003)
Harding | age, sex and parity (only for married women) (Harding, 1993)
Irish dynamic microsimulation model | Age, parity, education level, father's education level, marital status
Italian cohort model | age of mother, parity
Japanese dynamic model | age, marital status, parity
kinsim | age, marital status, parity (Imhoff and Post, 1997)
Biological models | age, marital status, time since last pregnancy according to outcome, sterility (Hyrenius and Adolfsson, 1964; Léridon, 1977)
LIFEMOD | marital status, age, parity progression ratio
LifePaths | age, marital status, parity, time [education, birth cohort, age at marriage, province of birth; the decision to have a child is modelled and then the different steps towards the birth: 1) each birth occurs nine months after conception, 2) a spell of infertility lasting three months follows each birth and 3) a new fertile spell begins immediately after that. As a side benefit, the careful attention given to the timing of these events makes possible straightforward models of, for example, maternity leaves from employment and marriage following pregnancy.] (Statistics Canada, 2006)
Melbourne | age of the mother, marital status, number of dependants
Microhus | hourly wage rate, disposable income, education, years of work experience, duration since previous child, other socio-economic variables
MOSART | age, parity, age of youngest child (Fredriksen, 2003)
Nedymas | age, year of birth, marital status, parity
PRISM | marital status, age, parity, previous employment status
SAGE | women living with a partner: age, marital status, duration in marital status, parity and age of youngest child; women not living with a partner: age, participation in full-time education, parity and age of youngest child; multiple births: age (Scott, 2003)
SESIM | Only women who move from their parents can give birth, age 18-49. First birth: age, marital status, pensionable income (quartiles), indicator for market work, highest education. Following birth: parity (up to 3; 4 is the maximum number of children modelled), age, marital status, pensionable income (quartiles), indicator for market work, highest education, age of youngest child (Flood et al., 2005)
Sfb3 | Age, marital status, duration of marriage and parity
SOCSIM | monthly rates/ age, sex, marital status, waiting times between events (Hammel et al., 1990)
SVERIGE | marital status, family earnings, education level, working status (working or not), number of live children already born, whether a birth occurred last year, whether a birth occurred the year before the last, the country of birth of the mother, number of years that the mother has spent in Sweden (Holm et al.)
Swedish cohort model | age

Source: (O' Donoghue, 2001) and authors mentioned in the table

Table 2: Mortality covariates in selected dynamic microsimulation models

Model | Mortality covariates
Anac Model | age, sex
Belgian Dynamic model | age, sex
Camsim | age, sex (Smith, 1987)
CORSIM | Age, birth place (US or other), education, employment status, family income, marital status, sex, race
Demogen | age, sex + health module
Destinie | age, age at end of education, sex, disability (with disability module)
DYNACAN | gender, age, education, marital status, employment status, disability status, time, region
Dynamite | age, sex
DYNAMOD | age, sex, disability
DYNASIM II | Married women 45-64: age, race, sex, marital status, education, number of children; others: age, race, sex, marital status, education
DYNASIM III | Three equations; time trend from vital statistics 1982-97; includes socio-economic differentials; separate process for the disabled based on age, sex and disability duration derived from (Favreault and Smith, 2004a)
Famsim | age, sex, marital status (Spielauer and Vencatasawmy, 2003)
Harding | age, sex, education status (Harding, 1993)
Irish dynamic microsimulation model | Age, occupational group, gender, education
Italian cohort model | age, sex
Japanese dynamic model | age, sex
kinsim | age, sex (Imhoff and Post, 1997)
Biological models | age, sex (Hyrenius and Adolfsson, 1964; Léridon, 1977)
LIFEMOD | age, sex
LifePaths | age, sex, [for one study and for elderly people, disability status, living in nursing home or not]
Microhus | age, sex
Midas | age, ethnicity, gender
MINT | age, time, ethnicity, education, marital status, permanent income, disabled
MOSART | age, sex, disability, marital status, educational attainment (Fredriksen, 2003)
Nedymas | age, sex, marital status
Pensim/2 | age, gender
PRISM | disability status, age, sex, years of disability
SAGE | age, sex, social class (Scott, 2003)
SESIM | age 0 to 29: sex and age; age 30 to 64: sex, age, indicator for early retirement, pensionable income (quintile), marital status; age 64+: sex, age, indicator for early retirement at 64 years of age, marital status, highest level of education (Flood et al., 2005)
Sfb3 | Age, sex, family status
SVERIGE | Age, sex, marital status, family earnings, education level, working status (Holm et al.)
Swedish cohort model | age, gender

Source: (O' Donoghue, 2001) and authors mentioned in the table.

2.1 Destinie (France)

The dynamic microsimulation model DESTINIE has been built by the French bureau of statistics (INSEE) to analyse the long-run situation of pensioners, accounting for the heterogeneity of careers and preferences and for changes in the labour market, as well as for the demographic structure and retirement rules. In this respect it has been used to simulate the impact of the 2003 French Pension Reforms on, for example, the age of retirement and the replacement rate, as well as the long-term number of retirees and the financial equilibrium of the pension system. The scope of the model is national: it is based on a representative sample of the whole French population.
Although DESTINIE’s primary focus is the simulation of pensions, the model can cope with distributional aspects of other public policies, especially if one is interested in a life-cycle approach. In this regard, one of the future developments of the model is to implement individual health expenditures in order to analyse the redistributive properties of the French health care system.

DESTINIE is based on individual data derived from the 1998 Financial Assets Survey collected by INSEE. The original sample has been reweighted to account for the demographic structure of the whole French population, as known from the 1999 French Census. Thus, the initial sample of DESTINIE contains about 20 000 households and 50 000 individuals. In order to study intergenerational relationships, individuals from the sample are connected to each other through the artificial imputation of kinship ties.

The demographic events simulated are union, breaking-up, birth of a child, death and migration. The parameters of the equations are adjusted so that the probabilities predicted by the model fit the long-term demographic projections made by INSEE and the French Demographic Institute (INED). The demographic events modelled are assumed to be independent from economic events, in order to give the model its robustness and to be able to change economic trends without having to change the behaviour included in the equations of demographic events. Nevertheless, some socio-economic differentials are included (through the use of the relative age at the end of education attendance).

The demographic events are simulated in the following order: entry of migrants; death; then, for those living in a couple, possible union disruption; for those not living in a couple, possible formation of a new union; and, lastly, for those living in a couple, possible births.

Destinie mainly uses the distribution of age at the end of school attendance as a proxy for socio-economic status.
To take into account the changes over time in the length of school attendance according to birth cohort, it is the relative age at the end of school attendance that is used to distribute the population into one of three categories. These are, first, people with short studies (studies ending more than one year below the average age estimated for the birth cohort); second, medium studies (studies ending within one year, below or above, of the cohort average); and, third, long studies (for those who attend school more than one year beyond the average estimated for the cohort).

Fertility

The fertility equation estimates are based on the survey of family history run with the 1999 census. It is a self-reported survey, with 235 000 women and 145 000 men aged 18 and over. It is a retrospective survey, investigating mainly fertility and union history. The probabilities are estimated using events arising in years 1995-1998 (as the model starting date is 1998). Probabilities of giving birth are related to age, parity, relative age at end of school attendance, duration since last birth for parity 2 and over, and duration since the end of school attendance and since the beginning of the union.

For the first birth, two equations are estimated, according to the category of the age at end of school attendance: one for those who have studied less than the average of the cohort and one for those with medium and long studies. The covariates are the age of the woman and the duration since the end of school attendance (or since the beginning of the union if the couple is formed before the end of school attendance).

For the following parities, different equations are estimated according to whether or not the union in which the previous birth occurred has been disrupted. If there is no union disruption, the probability for the second birth is estimated separately by length of studies (short vs.
medium/long) and the covariates are duration since first birth and age. For the probabilities for parities 3 to 6, the length of studies has been introduced as a covariate, but separate equations were not useful. When there has been a union disruption since the previous birth, the duration used is not the ‘duration since previous birth’ but the ‘duration since the beginning of the new union’. The length of studies is not included.

Mortality

Each year, each individual in the sample has a probability of dying, according to his/her age, sex, and age at end of school attendance. In the study of ageing and disability, the probabilities were additionally altered according to disability status.

2.2 DYNASIM III (US)

DYNASIM 3 is a dynamic microsimulation model designed to analyse the long-run distributional consequences of retirement and ageing issues. The model simulates Social Security coverage and benefits, as well as pension coverage and participation, benefit payments and pension assets. It also simulates home and financial assets, health status, living arrangements and income from non-spouse family members (coresidents). In addition, it calculates SSI eligibility, participation and benefits.

The DYNASIM 3 input file is based on the 1990 to 1993 Survey of Income and Program Participation (SIPP) panels, a self-weighting sample of over 100 000 people and 44 000 families. They then randomly output families based on the panel-adjusted average person weight. DYNASIM 3 focuses on nuclear families; subfamilies and unrelated individuals are treated separately.

The DYNASIM 3 model includes three sectors: demographics, economics, and taxes and benefits. The computer implementation follows the structure of its predecessor and includes two separate microsimulation models, the Family and Earnings History (FEH) model and the Jobs and Benefits History (JBH) model.
The FEH model processes the full sample once for each year of simulation, simulating demographic and annual labour force behaviour for each individual on the input file. The output of the FEH model is a set of longitudinal demographic and labour force histories that is the input of the JBH model.

Fertility

The DYNASIM 3 fertility module is designed to produce the age-race-sex population distribution in any future year so that the number of people at risk of paying Old-Age, Survivors and Disability Insurance (OASDI) taxes and/or collecting OASI or DI benefits is correct. It is also designed to produce the proper distribution of birth timing and sequencing to use as an input for generating career trajectories, especially for women. The model uses the maximum amount of information possible on the nature of the current relationship between women’s characteristics and their childbearing, to model births both historically (1993-2003) and in the future (from 2004 onward).

The data used to estimate fertility include the National Longitudinal Survey of Youth (1979-94), Vital Statistics and the OACT (intermediate assumptions of the OASDI trustees). The probability of giving birth in a year is conditional on the marital status of the woman and the number of children she has. Seven equations are estimated according to sub-groups, allowing the full set of interaction terms between the defining variables (marital status and parity) and the independent variables. The equations are for the following groups:

1. unmarried teens at risk of first births;
2. all other unmarried women at risk of first births;
3. unmarried women at risk of second births;
4. unmarried women at risk of third or higher births;
5. married women at risk of first births;
6. married women at risk of second births; and
7. married women at risk of third or higher births.
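The seven groups listed above amount to a dispatch on marital status, parity and (for unmarried women at risk of a first birth) age. The following is a minimal Python sketch of that dispatch, not DYNASIM code: the function name `dynasim_fertility_group`, the integer labels and, in particular, the cut-off used for "teens" (under 20) are assumptions made for illustration, since the text only names the groups.

```python
def dynasim_fertility_group(age: int, married: bool, parity: int) -> int:
    """Return the number (1-7) of the equation group in the list above.

    parity is the number of children already born, so parity 0 means the
    woman is at risk of a first birth. The teen cut-off (age < 20) is an
    assumption; the source text does not define it.
    """
    if parity < 0:
        raise ValueError("parity must be non-negative")
    if married:
        # Groups 5-7: married women at risk of first, second, third+ births.
        return 5 + min(parity, 2)
    if parity == 0:
        # Groups 1-2: unmarried women at risk of a first birth.
        return 1 if age < 20 else 2
    # Groups 3-4: unmarried women at risk of second, third+ births.
    return 3 if parity == 1 else 4
```

For example, `dynasim_fertility_group(30, True, 1)` selects group 6 (married women at risk of second births). The sketch covers only the choice among the seven estimated equations; it does not model any age limits on their use.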
For women aged 40 and over, the estimates are derived from vital statistics to assign an age-race-parity-specific probability. Multiple births are assigned by age and race, and the sex of the newborn by race.

Mortality

DYNASIM 3 predicts death using a four-stage process that relies both on micro-level data and aggregate data. The micro-level data characterize within-cohort sex group differentials, while aggregate data capture the age-race-sex-specific trends and level. Different equations of the individual probabilities of dying are estimated according to sex and age as functions of individual fixed characteristics and some varying socio-economic attributes. The vital statistics are used to incorporate a time trend (1982-97). For those receiving disability insurance, the probabilities are assigned using the estimates derived from aggregate data. The last stage calibrates the expected probability of death to achieve the targets produced by the social security actuaries.

2.3 SAGE (UK)

The SAGE dynamic microsimulation model is designed to provide projections to inform the development of social policy in Britain for the twenty-first century, focussing on the implications of population ageing for pensions, and issues regarding health and long-term care needs. It is a full population model, and covers demographic processes, education, labour market participation, earnings, pension accumulation, health and disability, and support networks. The model uses a base data set derived from the 1991 Census of Great Britain, and uses time-based processing with an annual cycle.

The ordering of events

The demographic changes are generated in a series of modules, starting with mortality, then fertility, union dissolution, and then de facto union formation and marriage.

Fertility

Both the fertility and partnership transition modules use the British Household Panel Survey (BHPS) as the source data to generate the probabilities.
The BHPS is a multipurpose longitudinal study that began in 1991 with a nationally representative sample of around 5,500 households and 10,300 individuals. From the BHPS, one partnership and fertility person-year file was created using records from 1992 to 1999. Although the woman’s parity is known on the BHPS, only the observations of the number and ages of her children aged under 16 and currently in the household (including natural, step and adopted children) at each interview were used as predictor variables, in order to be consistent with the data available from the Sample of Anonymised Records of the UK 1991 Census, used as the base data. Women who had been widowed in the last year were excluded from the fertility module.

Two predictor equations for the likelihood of giving birth are estimated, according to partnership status. For women living with a partner, an interaction effect between the woman’s marital status and the duration of her cohabitation with her partner was accommodated by creating one composite variable. The same approach was adopted to incorporate an interaction between the number of children and the age of the youngest child. For a woman not living with a partner, participation in education was strongly negatively associated with the likelihood of giving birth in the next year. When the other factors were controlled for, neither ethnic group nor social class was significantly associated with fertility.

Once a woman has been selected to give birth in the coming year, a look-up table, based on rates of multiple births in Britain in the early 1990s, is used to select women for multiple births according to their age. The maximum number of multiple live births that can be allocated is triplets. The allocation of the sex of a child follows the sex ratio at birth in Britain of around 1.05 male to female babies.
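A sex ratio at birth is a count of boys per girl, so it has to be converted to a per-birth probability before it can be used in a random draw. A minimal sketch of that conversion follows; the function names are hypothetical and this is not SAGE code.

```python
import random


def prob_boy(sex_ratio_at_birth):
    """Convert a male-to-female sex ratio at birth into P(boy)."""
    return sex_ratio_at_birth / (1.0 + sex_ratio_at_birth)


def draw_sex(rng, sex_ratio_at_birth=1.05):
    """Draw the sex of one newborn using a random.Random instance."""
    if rng.random() < prob_boy(sex_ratio_at_birth):
        return "male"
    return "female"


# 1.05 / 2.05 = 0.5122, i.e. 0.512 to three decimal places
print(round(prob_boy(1.05), 3))
```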
The probability of a new baby being a boy is therefore set at 0.512.

Mortality

The annual probability of death for individuals depends on their age, sex and social class. These probabilities are based on abridged life-tables for the years 1987-1991 produced by the Office for National Statistics (ONS) using the ONS Longitudinal Study (LS). The LS is a dataset produced by record linkage of Census and vital event information for individuals living in Britain. Then the grouped probabilities are converted to single year probabilities, using the vital registration data of death rates by single age and sex in 1991.

2.4 LifePaths (Canada)

LifePaths is a microsimulation model designed to simulate life histories, taking account of birth, death, immigration status, inter-provincial migration, marital history (including common-law unions), educational history, employment history and the birth and presence of children at home. It is used to analyse government policies having an essentially longitudinal component and whose nature requires evaluation at the individual or family level (such as post-secondary education costs and benefits or public pension sustainability). It can also be used to explore a variety of societal issues of a longitudinal nature, such as intergenerational equity or time allocation over entire lifetimes.

LifePaths models individuals from their birth. When creating a new individual, place of birth (in a specified province or territory, or outside Canada), sex and date of birth are assigned. If they were born outside Canada, their province of entry to Canada and age of immigration are also assigned. Given that the earliest birth occurs in 1872, LifePaths can simulate a complete population aged 0 to 99 starting in 1971.
By default, LifePaths simulations closely approximate official estimates of the Canadian population by age, sex and province in each year over the period 1971 to 1999, as well as Statistics Canada’s medium population projection over the period 2000-2026. The population represented does not include non-permanent residents, a large number of whom are foreign students.

Fertility

The historical series of births by sex and province that are used to initialize simulations were derived from the 1911 and 1921 Censuses together with annual birth registrations between 1921 and 1998, as well as from immigration records. For the period prior to 1921, the 1921 Census population was ‘reverse survived’: the number of births that would have had to occur to produce the observed number of survivors in 1921 was determined using mortality probability estimates. For 1921 to 1971, historical population estimates by age, sex and province were used. Special estimates for Newfoundland were derived using the 1911, 1921 and 1935 Censuses of Newfoundland. For the period 1972-1999, the official population estimates were used. These were available by age, sex and province and consist of the following components: total population, non-permanent residents, immigrants, emigrants, returning Canadians and internal migrants. For the period 2000-2026 the Statistics Canada medium population projection was used.

Historical lifetime internal migration data was based on decennial Census data from 1911 to 1971. Base probabilities of individuals moving between each pair of provinces were derived using the Family Allowance/Child Tax Credit data (1972-1996). Estimates of family level migration probabilities were obtained (using the base probabilities as benchmarks) from 1991 and 1996 Census data on place of residence one year prior to the Census.
Mortality

Mortality in the LifePaths model is determined by re-assessing the chance and timing of death at each birthday. The chance of dying is based on the age-specific mortality probability of Canadians sharing the sex and year of birth of the simulated individual. The process of re-assessing mortality continues until death or until the individual reaches the maximum allowed age of 115 years.

Individuals who are destined to immigrate to Canada are not exposed to a risk of death until they arrive. This avoids simulating individuals who then die before reaching Canada and who would, therefore, make no contribution to the simulation reports. This practice gives recognition to a reality: the appropriate mortality risk for this sub-population is unknown.

The historical cohort mortality data were derived from death registration statistics. The future mortality of recent cohorts was projected using mortality assumptions from the Statistics Canada medium population projection for the period 2000-2026.

2.5 DYNAMOD II (Australia)

DYNAMOD-2 is a dynamic microsimulation model of the Australian population that is designed to project characteristics of the population over a period of up to 50 years. The model operates with a 1 per cent sample of the Australian population (about 150 000 records in the starting population) and generates lifetime profiles of people in terms of demographic events (fertility, mortality, couple formation and dissolution, and overseas migration), education participation and attainment, labour force activity and earnings.

DYNAMOD uses survival functions to model demographic events. These survival functions operate with a period of one month (pseudo-continuous time, compared with annual models).

Fertility

Fertility is modelled in DYNAMOD-2 through pregnancy and childbirth events for women. Once the woman is aged 15 or a birth has occurred, the model predicts the date of the next childbirth for a woman.
Once the date of birth is known, a pregnancy start date is determined. Once pregnancy begins, only the death of the mother can stop a child being born. The model simulates only live births but it retains the possibility of confinements resulting in multiple children (twins or triplets) being born.

Three distinct types of birth are modelled: premarital, first marital, and second and subsequent marital births. The more general definition of marital status is used in the model, where it refers to both legal (de jure) marriages and de facto relationships. Thus, premarital births refer to births among women who have never been married or entered into a de facto relationship. First marital birth refers to the first live birth after entry into the first marital union, while second and subsequent marital births refer to all births after the first.

Survival functions are used to predict the time until occurrence of pregnancy among women aged between 15 and 49. The estimation of the time until childbirth for the three types of childbirth depends on educational participation, educational qualifications, employment status and age of the woman. First and subsequent marital births are influenced by additional characteristics including marital status, employment status of husband, duration of the marital relationship, parity and duration since previous birth.

Survival functions were estimated from the “Negotiating the Life Course” survey. It is a panel conducted by the Australian National University which covers 2547 women aged 20–59 years at the first wave and collects historical information on the women and their spouses covering the period 1971–86.

Mortality

Mortality rates are influenced by four factors in DYNAMOD-2: gender, age, year of birth and disability status. Those for the disabled are derived from the residual rate when the able-bodied rate is removed from the overall mortality rate.
Mortality rates for the population as a whole are targeted to conform to ABS projections, while mortality rates for the able-bodied are derived from ABS mortality rates on death due to external causes. Mortality rates for the disabled are calculated based on the difference between these two rates. The model incorporates mortality improvements during the forecast period. The assumption is that improvements in mortality for the able-bodied are half those for the disabled. Improvements in overall mortality rates in the model are based on ABS improvement projections.

2.6 MOSART (Norway)

The development of the first version of the MOSART model started in 1988. The first version comprised demographic events, education and labour force participation. The second version extended the model with public pension benefits and labour market earnings, and in the third version household formation, non-earned income, taxation, savings and wealth were added.

The starting population of MOSART is a 12% sample of the population in Norway in 1993. This initial population includes information on marriage, birth histories, educational level and activities, pension status and pension entitlements in the National Insurance Scheme, rehabilitation schemes, special pension entitlements for civil servants, wealth and household status. The latter includes to some degree relations to spouses, parents and children. Most of the information is represented as annual data back to 1985, and pension entitlements are represented as annual labour market earnings back to 1967. The information is gathered from registers run by the Directorate of Taxes, the National Insurance Administration and Statistics Norway.

Fertility

Births are simulated for women with fertility rates from 1989, depending on age, number of children and age of the youngest child. Each time a birth occurs, a child is added to the model population.
In the simulation of historical data, fertility rates from 1989 are adjusted proportionally each year so that the simulated number of births equals the actual level each year. This adjustment is roughly the same as adjusting the fertility rates against the periodic total fertility rate.

Mortality

Mortality depends on gender and age, and the baseline mortality rates from 1989 are adjusted proportionally, corresponding with life expectancy at birth each year. Furthermore, mortality is higher in the simulation for single people, disabled people and those with a low education level. The covariates are roughly estimated from various sources of mortality statistics, and are included because these differences are of significance for pension expenditure. At the same time, the baseline mortality is adjusted to ensure that these covariates do not influence the average mortality by gender and age.

2.7 SVERIGE (Sweden)

The SVERIGE spatial microsimulation model is the work of the Spatial Modelling Centre. It is based on the CORSIM model but has added time, geography and agent-based modelling.

Fertility

Fertility behaviour is influenced directly in the model by a number of individual and family attributes generated in some other modules, such as employment and earnings, education and marriage. The probability of having a baby is determined every year for all females between age 15 and 44 inclusive. Two logistic regressions are estimated separately for those who are married and those who are not. The equations are developed on a sample of 100 723 women aged 15-44 from the Swedish population in 1990. The covariates are age, education, income, labour force participation and, for the unmarried women, whether living as part of a couple.

Mortality

The estimations of the equations are based on a sample containing 458 000 Swedes in 1990. These individuals were followed over a five-year period (1990-1995) for occurrence of death.
Different equations have been estimated according to age. For persons under 25 years old, an exponential time trend equation for the period 1985-1995 was estimated with three parameters α1, α2 and α3. α1 + α2 equals the 1985 mortality rate and α3 is negative to take into account the decline over time of the mortality rates. The estimation is done separately for males and females and according to 6 age-groups (infant mortality, 1-4, 5-9, 10-14, 15-19 and 20-24).

For individuals over 25 years old, the probability of dying is estimated for a five-year period and then converted to a one-year probability. Three equations corresponding to three age-groups are determined: the working age population (25-59 years old), the young old (those aged 60-74) and the older old (those aged 75 years and over). The covariates are age, education level, and marital status for all age groups. For those of working age, labour force participation and the level of income are also included.

2.8 SESIM (Sweden)

SESIM is a dynamic microsimulation model developed by the Swedish Ministry of Finance in collaboration with researchers from different universities. The project started in 1997, and SESIM’s first mission, an evaluation of the Swedish national system of study allowances, took place in 1998. Since the year 2000 the focus has shifted from education to pensions. Evaluating the financial sustainability of the new Swedish pension system is a major purpose of SESIM. This new focus has also implied that SESIM has been developed into a general microsimulation model that can be used for a broad set of analyses.

A sample of some 100,000 Swedish citizens is used as a simulation base. The data is sampled from the 1999 wave of LINDA, a large longitudinal dataset. Additional observations are sampled from the set of all individuals living abroad having Swedish pension rights.
Fertility

Women aged between 18 and 49 years old and not living with their parents can have a child. The probability of having a first child depends on age, marital status, pensionable income (quartiles), an indicator for market work, and highest education level. For parities up to 3 (no woman has more than 4 children), different equations were estimated using the same covariates, plus the duration since the previous birth.

Mortality

Three different sets of equations have been estimated according to broad age groups: those less than 30, the 30-64 year olds, and those aged 65 years and over. For the 0-29 year olds, the probabilities are determined by age and sex. For the 30-64 year olds, some variables related to labour force participation and income are added (indicator of early retirement, quintile of pensionable income) as well as marital status; for those aged 65+, the relevant variables are age, sex, marital status and highest level of education.

3 Structure of Demographics in APPSIM

After reviewing the starting population chosen for APPSIM and the proposed order of events in the model and in the demographic module, this section describes the demographic module of APPSIM and how the different equations have been estimated.

3.1 Starting population

After reviewing potential datasets, it has been decided that the starting population for APPSIM will be based on the 1 per cent Census Sample, drawn from the 2001 Census (Kelly, 2007). It consists of a sample of 1 per cent of private dwellings, with their associated family and person records, and a 1 per cent sample of persons from all non-private dwellings together with a record from the non-private dwelling (ABS, 2001); that is, 75,451 dwellings, 79,320 families and 188,013 persons. The 1 per cent census sample has the advantage of gathering information on both private and non-private dwellings, and has a large sample size.
The main drawback is that the number of variables is limited and imputations must be performed to be able to run simulations.

3.2 Order of the modules

During each period of time, each individual goes through all the different groups of events. This includes the set of demographic events (including household formation), education, labour force participation, earnings, housing, and financial events such as unearned income, household expenditure, taxation, and finally welfare (with an emphasis in APPSIM on health care and aged care).

Figure 1: The different modules in APPSIM
[Diagram. Boxes shown: New Year; Demographics; Household Formation & Movement; Education & Training; Labour Force; Earnings; Housing; Other Income & Expenditure; Household Assets & Debt; Taxation; Social Security; Health & Aged Care]

As Figure 1 shows, demographics are the first group of events. Because of the model structure, that means that we assume that demographic events can affect the other events that happen in the same year, but the later events, such as labour force status and earnings, can affect demographic events only in the following year. For example, this means that having a baby in year ‘t’ can affect the probability of labour force participation in that year, but being in the labour force in year ‘t’ can only affect the probability of having a baby in year ‘t+1’.

3.3 Order of the events in the demographic module

After an overview of the different events included in the demographic module, we will focus on the data and estimates for fertility and mortality.

Figure 2: The events and their order in the demographic module
[Flow diagram. Elements shown: Start of year; Population (1); Immigration; Population (2); Mortality; Population (3); Fertility; Population (4); Emigration; Population (5)]

The proposed order of events in the fertility, mortality and migration modules is summarised in Figure 2.
In dynamic microsimulation models such as APPSIM, we apply probabilities to determine whether each event will happen or not. For example, every year we will test whether a person will stay alive or will die (or will emigrate or have a child).

The probability that an event arises is estimated using the most detailed data possible and, whenever possible, using longitudinal data, as this allows determination of transitions between different states. For example, the probability of divorce is more related to the number of years spent in marriage than to age.

The order in which events are modelled is not neutral: if we model mortality first, a person selected to die cannot emigrate or have a child later on. The order of the modules must thus reflect a logical interaction pattern, and the probabilities of each event must be conditional probabilities taking into account the implication of the order.

In microsimulation models, each event is modelled in a more or less dependent way. Projections that use a cohort-component method apply rates for events such as fertility to the average population, and not to the population at the beginning of the projection step, and are thus able to take into account the fact that more than one event can arise during the year. In microsimulation, by contrast, each event arises independently. If death is the first event then, for example, any woman selected to die will not be able to give birth. We impose a causality in the model that has to be taken into account in the estimate, at least when the probability that both events arise during the same period of time is not negligible.

The order proposed here is first, immigration, then mortality, fertility and emigration.
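The effect of this ordering can be sketched as a single annual step. The code below is a hypothetical illustration, not APPSIM code: the function name, the dictionary keys and the three probability functions are assumptions, and newborns are returned separately without deciding whether they follow an emigrating mother.

```python
import random


def demographic_year(population, arrivals, p_death, p_birth, p_emigrate, rng):
    """Apply one year of events: immigration, mortality, fertility, emigration.

    population, arrivals: lists of dicts with at least a 'sex' key.
    p_death, p_birth, p_emigrate: hypothetical functions person -> probability.
    Returns (stayers, newborns, deaths, emigrants).
    """
    # 1. Immigration: arrivals join before any other event is tested.
    residents = list(population) + list(arrivals)

    # 2. Mortality: a person selected to die faces no later event.
    survivors, deaths = [], []
    for person in residents:
        if rng.random() < p_death(person):
            deaths.append(person)
        else:
            survivors.append(person)

    # 3. Fertility: only surviving women are tested for a birth.
    newborns = []
    for person in survivors:
        if person["sex"] == "F" and rng.random() < p_birth(person):
            newborns.append({"sex": None, "age": 0})  # sex assigned elsewhere

    # 4. Emigration: tested last, so an emigrant may already have had a child.
    stayers, emigrants = [], []
    for person in survivors:
        if rng.random() < p_emigrate(person):
            emigrants.append(person)
        else:
            stayers.append(person)
    return stayers, newborns, deaths, emigrants
```

Because the fertility loop runs over `survivors` only, the birth probability passed in has to be a probability conditional on the woman surviving the year, which is the conditioning described above.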
With this order, we can take into account:

• Immigrants are alive when they arrive but can die and have children as soon as they arrive on Australian soil;

• Mortality before fertility: this assumption reduces the number of orphans at birth but, in a country like Australia, it is fortunately a quite rare event to have the death of a woman in the year of the birth of her child. These two events are estimated independently; that is, the risk of mortality is not dependent upon whether a person will have a baby. Therefore, this is a less risky assumption in many respects, as estimating births after mortality will reduce orphanhood in the first year of life, but the error will not be great and we assume this error to be better than the reverse. The other reason for using this order is that the data used to determine fertility estimates are based on surviving women.

• Emigrants are alive at the moment of their departure but could have had a child just prior to emigration.

This order presents a drawback related to the fertility and mortality of migrants. If we assume, like all projections, that immigrants and emigrants stay on average half the projection period in the country, death and fertility rates of migrants are half those of people who stay the whole year in the country. While it is possible to have a variable distinguishing immigrants and to apply half the rates to them, it is not possible to do the same for emigrants because we don’t know where they are when fertility and mortality occur. It can lead to a small increase in mortality and fertility.

It is possible to avoid this problem by using the following method [1]. Assume that half the immigrants will arrive at the beginning of the period and half of them will arrive at the end of the projection period.
Those arriving at the beginning of the projection period will be exposed to death and birth events for the whole year, while those arriving at the end of the period will not face any event. In the same way for emigrants, we consider that half of them will leave the country at the beginning of the projection period and therefore will not face any other events in the country. The other half of the emigrants will leave at the end of the period and will face the other events during the whole year. This shortcut is possible with the assumption that they stay on average half the period: instead of all the immigrants and emigrants facing the events with half the probabilities because they stay half the year in the population, we will have half the immigrants and emigrants with the full rate.

[1] The authors would like to thank Laurent Toulemon for this very pertinent suggestion.

4 Mortality in APPSIM

4.1 Mortality

No data are available within Australia at the individual level according to socio-economic level, disability status, occupation, foreign versus Australian born, etc.; they are available only at the level of SLAs (statistical local areas). The first version of mortality within the model is therefore very simple as it uses mainly the probability of dying by age and sex, determined by ABS for their population projections, medium scenario (ABS, 2004 and special communication). We are still investigating how we can include some differentials in the model.

The ABS data are up to age 100+, but we wanted to increase the upper age-group, given the likely increase in the number living past 100 years. Based on the work by Thatcher-Kannisto [2], we have mortality rates from 100 to 105+. We have used this distribution by age to determine the distribution by age over 99 from 2005 to 2050 (Pennec and Bacon, 2007).
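A first-version mortality module of this kind amounts to a table look-up followed by a random draw. The sketch below assumes a nested dictionary of death probabilities keyed by sex and single year of age, with 105 standing for the open-ended 105+ group. The names and the table layout are hypothetical, and any values placed in the table are placeholders rather than ABS figures.

```python
import random

TOP_AGE = 105  # open-ended top group (105+), an assumed coding


def death_probability(q_table, sex, age):
    """Look up the annual probability of dying by sex and age.

    Ages above TOP_AGE share the probability of the open-ended group.
    """
    return q_table[sex][min(age, TOP_AGE)]


def dies_this_year(q_table, sex, age, rng):
    """Test one person for death using a random.Random instance."""
    return rng.random() < death_probability(q_table, sex, age)
```

Differentials by disability or socio-economic status, if they become available, could enter as extra keys or as multipliers on the looked-up probability.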
4.2 Calibration/Alignment

One of the issues that has received much attention during the past decade is the need to align summed micro estimates to benchmark data sources that are regarded as reliable within a country (such as official population projections) (Anderson, 2001). This topic is more fully developed in another paper (Bacon and Pennec, 2007). We note here the aggregates and benchmarks that we should be able to ‘hit’ with the fertility estimates in the APPSIM model.

[Figure: diagram of the APPSIM modules with the macro environment, V0.2, 8 Feb 2006. Boxes shown: Macro environment; New Year; Demographics; Household formation & movement; Education & Training; Labour Force; Earnings; Housing; Other income & expenditure; Household Assets & Debt; Taxation; Social Security; Health & Aged Care]

The number of births ought to align with the aggregate number of births based on fertility rates. In theory, it is desirable that the model outcomes also align with external benchmarks by age of the parents, number of births by the mother (first child, second child, etc), marital status of the parents, and aggregate number of multiple births (sets of twins, etc).

[2] We accessed the Kannisto-Thatcher Database on Old Age Mortality (K-T database), maintained by the Max Planck Institute for Demographic Research.

The fertility rate should reflect historical rates where available and the projected rates (general fertility rates and age-specific rates) should be a user-defined input parameter.

Different sources of information will be used in order to check and calibrate our estimates. Among these are the vital statistics, some surveys like the ABS survey on family characteristics and the ABS population projections.

One proposal for this micro-macro linkage when demographics are concerned is to link a cohort component model to the microsimulation model.
Our idea is not only to align the microsimulation results to the “official results” but to build a cohort component model that takes into account certain specificities of microsimulation, such as (1) the events are determined according to the age at the beginning of the simulation step, which may not be equal to the age at the event; and (2) rates are not applied to the mean population but to the population as it is when the event is calculated. This approach leads to real consistency between the macro and micro levels. It also gives the user more consistent results when he or she wants to run another scenario: for example, even if the user wants to increase only fertility, this affects the number of deaths, so the method recalculates all the events involved.

5 Fertility in APPSIM

5.1 Data on fertility

5.1.1 Vital statistics

Births and confinements are registered by different bodies: the different State Registrars and the Australian Institute of Health and Welfare through the registers of midwives. These two series present discrepancies, as a time lag exists between confinements and registration. The time difference differs from state to state and according to year. In 2003, the Perinatal Data Collection reported the occurrence of 255,100 live births in Australia, 1.6% more than the 251,200 births registered in the same year (ABS, 2004; McDonald, 2005).

The contents of the birth registration forms differ by state but, nevertheless, most of the variables are the same in all the states. One difference lies in the parity of the child; in some states, the parity of the child within the overall number of children of the woman and within the current relationship is available (Queensland, South Australia, Western Australia and Tasmania) whereas, in the other states, only the latter is available.

5.1.2 Specific surveys

• The first possible source, one that was used for the DYNAMOD model, is the ANU survey “Negotiating the life course”.
It is a panel of 2 247 persons designed to study the interaction between family life and labour force participation.

• The second and more recent survey is the Household, Income, and Labour Dynamics in Australia survey (HILDA). HILDA is a broad social and economic survey. As it has a longitudinal design, most questions are repeated each year. In addition, each year a special topic is covered, such as family background in wave 1, household wealth in wave 2, and retirement and plans for retirement in wave 3. Private health insurance and youth are covered in wave 4. The panel began in 2001 with a national sample of Australian households occupying private dwellings (6,872 households and 13,969 individuals). Members of the original survey in 2001 have been traced and interviewed annually, along with members of their new households.

5.1.3 Census

The quinquennial census is another useful source of data for fertility analysis. For some years, some specific questions on the total number of children ever born were added to the usual question on the dependent children living in the household. Some interactions of demographic variables with housing, economic, and labour force characteristics can be studied using the census data. A major advantage of the census, like vital statistics, is that it is an exhaustive source, but the number of variables is limited, as is the level of detail for each topic.

Vital statistics and census data will be used mainly to benchmark the APPSIM estimates, as our estimates need to rely more on micro-level data able to report interactions between variables. To determine the fertility estimates, we chose to use the HILDA panel data. The reason for this choice is that it contains both longitudinal historic data and cross-sectional data and its sample size is bigger than that of the ANU survey. One set of benchmarks to check the summed micro fertility estimates against is the ABS population projections.

5.1.4 HILDA

For each responding person, for each year, a set of questions is asked about the children currently living in the household and those not living in the household, and another set is asked to elicit the likelihood of more children arriving in the near future. According to the place of residence of the child, questions on the involvement of the parent not living with the child, if applicable, are registered; child care and financial aspects are major topics investigated with this questionnaire. A copy of the fertility section of the HILDA survey is attached in Appendix A.

Fertility can be studied according to different approaches: a retrospective longitudinal approach (2) that looks at the past fertility of the different women and a cross-sectional approach (1) that looks at fertility at a point of time.

Figure 3: Different approaches which could be used to analyse fertility with the HILDA data
[Diagram with Age on the vertical axis and Year on the horizontal axis; waves 1 to 4 are marked]

With the retrospective approach, for each woman, we can reconstitute her fertility history and, by aggregating the individual records by birth cohort, we can see the trends that affected the fertility level of earlier decades and can have a better insight into the future (as we can compare the behaviour of different cohorts at the same age and deduce from that what is likely to happen in the future). On the other hand, with the cross-sectional approach, we have the current fertility patterns, and the design of HILDA is such that the key questions about economic characteristics and labour force participation are only available at the date of the interview, not for many years earlier as with the retrospective perspective.

5.2 Fertility in HILDA

As already mentioned, it is important that the projections performed with APPSIM give coherent results, at both cross-sectional and longitudinal levels.
Our estimates must be as close as possible each year to the right total fertility rate, so that we obtain the right number of births, but we also have to pay attention to the completed total fertility rate (that is, fertility at birth cohort level). These two figures are both averages; another important aspect of fertility modelling at the individual level is to obtain a good distribution of the number of children per woman. One aim of APPSIM is to study ageing, and in ageing we can also include disability and the need for care, formal or informal. We know that informal care is provided mainly by the spouse and the children, so it is crucial to have a good representation of the distribution of children and, in particular, of childless persons. Childless persons and those with no disability-free spouse or children rely almost entirely on formal care. The timing of childbearing can also be important for the labour force participation of women, for education costs, and for the care of parents in old age.

5.3 Fertility estimates

5.3.1 Fertility estimation processes

To estimate fertility, we follow an approach similar to that of the SAGE model, as we have a similar dataset (Scott, 2003). HILDA combines a retrospective approach at first interview with a panel approach. Its fertility module allows us to use both retrospective variables and current status, or status last year, as predictors of whether the woman will have given birth by the time of the next annual interview. From HILDA, a person-year file was created using records from 2001 to 2004. To generate a sufficient sample size, these pooled records were then treated as independent. Table 3 gives the number of births registered in each wave. The eligible sample was restricted to records for women aged 15 to 49.
Women who had a child that died have been dropped from the analysis because the date of birth of this child is unknown; some of the intervals between births are therefore incorrect and cannot be identified [3]. Cross-sectional weights provided with the HILDA data were applied to the pooled records to adjust for unequal probabilities of response.

[3] This restriction can be removed in the future, as questions on children who have died will be asked in a future wave of HILDA.

Table 3: Number of births by parity in the different waves of HILDA
Wave | Number of 1st births | Number of 2nd births | Number of 3rd births | Number of 4th and subsequent births | All births
Wave 1 | 88936 (92) | 90262 (85) | 38902 (45) | 19493 (28) | 237593 (250)
Wave 2 | 88194 (68) | 86100 (76) | 30499 (34) | 20199 (25) | 224992 (203)
Wave 3 | 134581 (95) | 110548 (86) | 34482 (34) | 24817 (18) | 304428 (233)
Wave 4 | 107136 (98) | 79453 (61) | 36477 (31) | 25716 (21) | 248782 (211)
All 4 waves | 418847 | 366363 | 140361 | 90225 |
Note: The figures within parentheses are the unweighted figures.

Because the data are collected retrospectively, births are registered after they occur, whatever the approach chosen. This is obvious for retrospective longitudinal approaches, where we reconstitute the fertility biography of the women interviewed in 2001, for instance, but it is also the case for the panel and cross-sectional approaches. In wave x, the births we observe are those that have occurred since the previous wave: for example, the children aged 0 in wave 3 are the babies born between 30 June 2002 (wave 2) and 30 June 2003 (wave 3) [4]. Therefore, to determine the probability of having a child, we need to use the situation of the woman at the 30 June preceding her childbearing. For the mother of a baby born between wave 2 and wave 3, we have to use the value of the variables at wave 2.
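The pairing of each observed birth with the mother's situation at the preceding 30 June can be sketched as follows. This is a minimal illustration, not the authors' processing code: the record layout (`WaveRecord`), the field names and the function `build_person_years` are hypothetical, and applying the 15 to 49 age restriction to the age at the earlier wave is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WaveRecord:
    """One woman's status as at 30 June of a given wave (hypothetical layout)."""
    person_id: int
    wave: int          # 1 to 4, i.e. 2001 to 2004
    age: int           # age at 30 June of this wave
    n_children: int    # children ever born as at 30 June of this wave
    labour: str        # labour force status at this wave


def build_person_years(records, min_age=15, max_age=49):
    """Pair covariates observed at wave t-1 with the birth outcome seen at wave t."""
    by_key = {(r.person_id, r.wave): r for r in records}
    rows = []
    for current in records:
        previous = by_key.get((current.person_id, current.wave - 1))
        if previous is None:
            # No earlier wave is available for this record (e.g. wave 1);
            # such cases need separately reconstituted covariates.
            continue
        if not (min_age <= previous.age <= max_age):
            continue  # outside the eligible age range (assumed to apply at wave t-1)
        rows.append({
            "person_id": current.person_id,
            "covariate_wave": previous.wave,
            "age": previous.age,
            "parity": previous.n_children,
            "labour": previous.labour,
            # A birth is recorded when the number of children rose between waves.
            "birth": int(current.n_children > previous.n_children),
        })
    return rows


example = [
    WaveRecord(1, 2, 29, 0, "full-time"),
    WaveRecord(1, 3, 30, 1, "not working"),
    WaveRecord(2, 2, 50, 2, "part-time"),
    WaveRecord(2, 3, 51, 2, "part-time"),
]
person_years = build_person_years(example)
```

In this toy input, the single resulting person-year carries the covariates of wave 2 (age 29, childless, working full-time) and a birth indicator of 1, which is the alignment the text describes for a baby born between wave 2 and wave 3.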
This means that, as illustrated in the Figure below, for the babies recorded in wave 1 (2001), we have to reconstitute the variables as if there had been a wave "0" [5].

[4] Some corrections to the file have been performed in order to create the age and number of children at the 30 June previous to the interview, rather than at the date of the interview.

[5] For the first cut of the estimates, we assume that the values of the education, labour force status and marital status variables in wave 0 are the same as in wave 1. This is a strong assumption, in particular for young persons, but it will be corrected in the future.

The first version of the fertility estimates developed here is mainly demography based: mainly demographic variables are used, along with quite simple education and labour force status variables. A future model could include more detailed demographic variables (such as duration since union) and some socioeconomic context variables (such as education and labour force status of the spouse, level of income, effect of being in a new relationship, duration of cohabitation with partner, etc.). This first version also relies only on a cross-sectional approach, using limited duration variables. A second version could try to capture more of the long-term trends that cannot be replicated with a four-year panel.

One constraint for the first version is that the estimation should use variables that are available in the 1 per cent household census sample, or that are relatively easy to impute. (The 1 per cent census sample is the core dataset chosen as the starting population of the microsimulation model APPSIM.)

To produce the probabilities of having a child during the following year, a set of logistic regressions was performed. These regressions were used to check the significance of the variables and to determine the parameters of the different covariates.
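For reference, a logistic regression of this kind is conventionally written as below. The source does not write the equation out, so the standard logit form is assumed here, and the symbols $p_i$, $x_{ik}$ and $\beta_k$ are notation introduced only for illustration.

```latex
\[
\Pr(\text{birth}_i = 1) \;=\; p_i \;=\;
\frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{k} \beta_k x_{ik})\bigr)},
\qquad
\log\frac{p_i}{1 - p_i} \;=\; \beta_0 + \sum_{k} \beta_k x_{ik}.
\]
```

Under this form, $\beta_0$ is the intercept, a continuous covariate such as age contributes its coefficient multiplied by its value, and a categorical covariate contributes its coefficient only when woman $i$ falls in that category (the reference category contributes zero).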
A different equation is estimated for each parity and, for the probabilities of having a first or a second baby, a second level of stratification has been introduced according to whether the woman is living with a partner or not. Table 4 gives the different variables and categories chosen, and Tables 5 to 10 give the results of the logistic regressions. The variables included in the estimation are demographic variables, education variables and labour force variables.

Table 4: Variables and categories used in the regression models
Name of the variable | Categories
Age | Single year of birth
Migrant status | Born in Australia (0); Born overseas (1)
Current marital status | Married (1); De facto (2); Separated, Divorced, Widow(ers) (3); Never married (4)
Couple | Not living with a partner (0); Living with a partner (1)
Education level | High (post graduate: master, doctorate; grad. diploma, grad. certificate; adv. diploma, diploma; cert III or IV) (1); Low (cert I or II; cert not defined; year 12; year 11 or below; undetermined) (2)
Labour force participation status | Full-time (1); Part-time (2); Not working (3); Full-time student (4)
Duration since last birth (equivalent to the age of the youngest child) | Years
Age of the mother at 1st birth | 15-24 years old (1); 25-34 years old (2); 35 years old and over (3)

5.3.2 Results of the fertility estimation

Table 5: Coefficients of the logistic regression of giving birth to the first child for childless women not living with a partner
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -6.7438 | ***
Age | Single year of birth | 0.1384 | ***
Age2 | | -0.00293 | ***
Migrant status | Born in Australia; Born overseas | 0.7998 | ***
Current marital status | Sep., Div., Wid.; Never married | 0.3547 | ***
Education level | High; Low | -0.1862 | ***
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -0.4925; 0.7315; 1.9437 | ***; ***; ***
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

Table 6: Coefficients of the logistic regression of giving birth to a first child for women living with a partner
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -10.7588 | ***
Age | Single year of birth | 0.6681 | ***
Age2 | | -0.0119 | ***
Migrant status | Born in Australia; Born overseas | -0.1576 | ***
Current marital status | Married; De facto | 0.6062 | ***
Education level | High; Low | -0.0925 | ***
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -0.5542; -0.3247; 1.6710 | ***; ***; ***
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

Table 7: Coefficients of the logistic regression of giving birth to a second child for women having one child and not living with a partner
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -3.4698 | ***
Age | Single year of birth | 0.1325 | ***
Age2 | | -0.00393 | ***
Migrant status | Born in Australia; Born overseas | 0.2362 | ***
Current marital status | Sep., Div., Wid.; Never married | -0.0509 | ***
Education level | High; Low | 0.0157 | *
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -1.008; -0.5016; 0.8455 | ***; ***; ***
Duration since last birth | Years | 0.0569 | ***
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

Table 8: Coefficients of the logistic regression of giving birth to a second child for women having one child and living with a partner
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -9.8651 | ***
Age | Single year of birth | 0.5310 | ***
Age2 | | -0.00904 | ***
Migrant status | Born in Australia; Born overseas | 0.1230 | ***
Current marital status | Married; De facto | 0.3015 | ***
Education level | High; Low | 0.1142 | ***
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -0.0247; -0.6572; 0.6814 | ***; ***; ***
Duration since last birth | Years | -0.0180 | ***
Age of the mother at 1st birth | 15-24 years old; 25-34 years old; 35 years old and over | 0.4307; 0.0959 | ***; ***
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

Table 9: Coefficients of the logistic regression of giving birth to a third child for women already having two children
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -12.0809 | ***
Age | Single year of birth | 0.6435 | ***
Age2 | | -0.0119 | ***
Migrant status | Born in Australia; Born overseas | 0.2748 | ***
Current marital status | Married; De facto; Sep., Div., Wid.; Never married | 0.2415; 0.3756; 0.5683 | ***; ***; ***
Education level | High; Low | 0.0279 | ***
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -0.2703; 0.3276; 0.4466 | ***; ***; ***
Duration since last birth | Years | 0.1410 | ***
Age of the mother at 1st birth | 15-24 years old; 25-34 years old; 35 years old and over | -0.2804; 0.0313 |
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

Table 10: Coefficients of the logistic regression of giving birth to an additional child for women having three or more children
Name of the variable | Categories | Parameter estimate | Level of significance
Intercept | | -4.2654 | ***
Age | Single year of birth | 0.2043 | ***
Age2 | | -0.00581 | ***
Migrant status | Born in Australia; Born overseas | 0.7998 | ***
Current marital status | Married; De facto; Sep., Div., Wid.; Never married | -0.5091; 0.4869; 0.0745 | ***; ***; ***
Education level | High; Low | 0.3186 | ***
Labour force participation status | Full-time; Part-time; Not working; Full-time student | -0.3506; -0.0344; 1.1121 | ***; ***; ***
Duration since last birth | Years | 0.0571 |
Note: (1) *** for < 0.0001; ** for < 0.001; * for < 0.01. Each categorical variable has one reference category (set in bold in the printed table), which has no estimate.

5.4 Calibration/Alignment

This topic is more fully developed in another paper (Bacon, 2007). We note again the aggregates and benchmarks that we should be able to 'hit' with the fertility estimates in the APPSIM model. The number of births ought to align with the aggregate number of births based on fertility rates. In theory, it is desirable that the model outcomes also align with external benchmarks by age of the parents, number of births by the mother (first child, second child, etc.), marital status of the parents, and aggregate number of multiple births (sets of twins, etc.). The fertility rate should reflect historical rates where available, and the projected rates (general fertility rates and age-specific rates) should be a user-defined input parameter.

Different sources of information will be used in order to check and calibrate our estimates. Among these are vital statistics, surveys such as the ABS survey on family characteristics, and the ABS population projections.

5.5 Implementation in APPSIM

Each woman aged 15 to 49 enters the fertility loop. Her probability of having a child each year is related to a set of covariates: the number of children she has ever had (parity), her age, education level, labour force participation status, migration status and, for second and third births, her age at the first birth.
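One pass of the fertility loop for a single woman can be sketched as below. This is an illustrative sketch rather than the APPSIM implementation: the function names are hypothetical, the logit form is assumed, Age2 is assumed to be age squared, and only the intercept and age terms of Table 6 are used, so the example describes a woman who is in the reference category of every categorical variable.

```python
import math
import random

# Intercept and age terms copied from Table 6
# (first birth, woman living with a partner).
TABLE6_AGE_TERMS = {"intercept": -10.7588, "age": 0.6681, "age2": -0.0119}


def select_equation(parity, has_partner):
    """Return a label for the regression that applies (Tables 5 to 10)."""
    if parity == 0:
        return "table6" if has_partner else "table5"
    if parity == 1:
        return "table8" if has_partner else "table7"
    if parity == 2:
        return "table9"
    return "table10"  # three or more children


def birth_probability(coefs, age, other_terms=0.0):
    """Annual birth probability under an assumed logit link.

    other_terms is the summed contribution of the categorical covariates;
    it is zero for a woman in every reference category.
    """
    linear_predictor = (
        coefs["intercept"]
        + coefs["age"] * age
        + coefs["age2"] * age ** 2   # assumption: Age2 means age squared
        + other_terms
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def draw_birth(probability, rng):
    """Monte Carlo draw: a birth occurs when a uniform draw falls below p."""
    return rng.random() < probability


rng = random.Random(2007)  # seeded only so the sketch is repeatable
equation = select_equation(parity=0, has_partner=True)
p = birth_probability(TABLE6_AGE_TERMS, age=28)
gives_birth = draw_birth(p, rng)
```

With these three coefficients the linear predictor at age 28 is -10.7588 + 0.6681 × 28 - 0.0119 × 784 = -1.3816, which corresponds to an annual probability of about 0.20 before any categorical covariate is taken into account.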
Figure 4 summarises this implementation.

Figure 4: Model of child bearing
(Flow chart. Women aged 15-49 are split by the number of children ever had: 0, 1, 2 or 3. Each branch asks whether a birth occurs, using age, parity, marital status, labour force participation status, migration status and education level; the branches for 1, 2 and 3 children add the duration since the previous birth, and the branches for 1 and 2 children add the age of the mother at first birth. When a birth occurs, the chart determines whether the confinement is single or multiple (by age), whether a multiple birth is twins or triplets, the sex of the newborn, and infant mortality (by age and sex).)

Once a woman is set to have a birth within the year, some characteristics of the outcome of the confinement have to be determined, such as whether the woman gives birth to one or more children. Next, some characteristics of the newborn have to be determined. The birth cohort is deduced from the year in which the birth occurs, but the sex of the newborn needs to be determined, and also whether the newborn will survive to his or her first birthday.

Multiple births

Once a woman has been selected to give birth in the coming year, we determine whether this confinement leads to a multiple birth or not. The probability of a woman giving birth to more than one baby is based on the average rate of multiple births observed in Australia in 2000-2005. The maximum number of live births that can be allocated to one confinement is three (triplets). As in other countries, women aged 30 and over are more likely to have multiple births than younger women. This is related to the postponement of childbearing and the more frequent use of medically assisted reproduction at these ages. On average, 1.7 per cent of confinements lead to more than one child. Women aged 30 and over are 1.5 times more likely to have twins or triplets than younger women. In the model, of those having more than one child, 97.9 per cent will have twins and 2.1 per cent triplets.
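The multiple-birth allocation can be sketched as follows. The three constants are the figures quoted in the text; everything else is an assumption made for illustration. In particular, the source gives only the overall rate and the relative risk, so splitting the rate into two age-group rates needs the share of confinements to women aged 30 and over, which is a hypothetical input here (`share_30_plus`), and "1.5 times more likely" is read as a rate ratio of 1.5.

```python
import random

OVERALL_MULTIPLE_RATE = 0.017   # confinements with more than one child (text)
RELATIVE_RISK_30_PLUS = 1.5     # women aged 30 and over versus younger (text)
TRIPLET_SHARE = 0.021           # triplets among multiple confinements (text)


def age_specific_multiple_rates(share_30_plus):
    """Split the overall rate into rates for women under 30 and 30 and over.

    Solves  young * (1 - s) + RR * young * s = overall  for `young`,
    where s = share_30_plus is a hypothetical input, not a source figure.
    """
    young = OVERALL_MULTIPLE_RATE / (
        1.0 + (RELATIVE_RISK_30_PLUS - 1.0) * share_30_plus
    )
    return young, RELATIVE_RISK_30_PLUS * young


def babies_in_confinement(age, share_30_plus, rng):
    """Return 1, 2 or 3 live births for one confinement."""
    young, older = age_specific_multiple_rates(share_30_plus)
    rate = older if age >= 30 else young
    if rng.random() >= rate:
        return 1                      # single birth
    # Multiple birth: triplets with probability 2.1 per cent, otherwise twins.
    return 3 if rng.random() < TRIPLET_SHARE else 2


rng = random.Random(1)  # seeded only so the sketch is repeatable
young_rate, older_rate = age_specific_multiple_rates(share_30_plus=0.5)
outcomes = [babies_in_confinement(32, 0.5, rng) for _ in range(1000)]
```

With an illustrative share of 0.5, the two rates are 1.36 per cent and 2.04 per cent, which average back to the 1.7 per cent quoted in the text. A calibrated model would take the share from the simulated age distribution of mothers rather than fixing it.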
Sex of the infant

The sex of a child is allocated according to the sex ratio at birth in Australia: around 105 boys for 100 girls, that is, a proportion of male births of around 0.512. This ratio will be replicated in APPSIM. The choice of a single national ratio reflects the fact that APPSIM is a national model which does not incorporate the State level. (If States and Territories were included within APPSIM, the sex ratio should be state-specific, as it differs between states.)

Infant mortality

The probability of death between birth and the end of the projection period (here the financial year) is taken from the ABS population projections, by sex. The death rate in 2004 is 4.92 per 1,000 for boys and 4.59 per 1,000 for girls.

6. Conclusion

As with any module in APPSIM, this demographic module is under continuing development. This document gives the first cut of the demographic module. As it is the first module to be built, it will be fine-tuned and re-estimated as new modules are developed and as new data become available. In particular, we will use the new waves of the HILDA panel survey to reassess the fertility estimates.

References

Abello Annie, Kelly Simon, King Anthony. 2002. Demographic projections with Dynamod-2. Canberra, NATSEM, University of Canberra, 21, 50 p.
ABS. 2004. Births Australia. Cat. 3301.0. www.abs.gov.au
ABS. 2004. Population projection Australia. Cat. 3222.0. www.abs.gov.au
Bacon Bruce, Pennec Sophie. 2007. Calibration and alignment in APPSIM, mimeo, NATSEM.
Duée Michel, Rebillard Cyril, Pennec Sophie. 2005. Les personnes dépendantes en France: Evolution et Prise en charge. XXV International Population Conference, International Union for the Scientific Study of Population, Tours (France), 18-24 July 2005. IUSSP website (www.iussp.org).
Duée Michel. 2005. La modélisation des comportements démographiques dans le modèle de microsimulation DESTINIE. Paris, G 2005 / 15, 47 p.
Favreault Melissa, Smith Karen. 2004a. A Primer on the Dynamic Simulation of Income Model (DYNASIM3). The Urban Institute.
Favreault Melissa, Smith Karen. 2004b. A Primer on the Dynamic Simulation of Income Model (DYNASIM3). The Urban Institute, 22 p.
Flood Lennart, Jansson Fredrik, Pettersson Thomas, Sundberg Olle, Westerberg Anna. 2005. SESIM III - a Swedish dynamic microsimulation model. Sweden, Ministry of Finance.
Fredriksen Dennis. 2003. The MOSART model - a short technical documentation. Statistics Norway. Paper presented at the International Microsimulation Conference on Population, Ageing and Health: Modelling our Future, 7-12 December, Canberra.
Hammel Eugene A., Wachter Kenneth W., McDaniel C. K. 1981. The kin of the aged in 2000 A.D. In: Morgan James, Oppenheimer Valerie, Kiesler Sara. New York, Academic Press, vol. 2 - Social Change, p. 11-39.
Hammel Eugene A., Mason Carl, Wachter Kenneth W. 1990. SOCSIM II. A sociodemographic Microsimulation Program Rev. 1.0. Operating manual. 29, p 76(5).
Harding Ann. 1993. Lifetime Income Distribution and Redistribution: Applications of Microsimulation Model. Amsterdam, North-Holland.
Harding Ann, Gupta Anil (eds). 2007. Modelling our future: Population Ageing, Social Security and Taxation. Amsterdam, North-Holland.
Holm Einar, Holme Kirsten, Makila Kalle, Mattson-Kaupi Mona, Mortvik Gunnel. The Sverige microsimulation model - content, validation and example applications. Kulturgeografiska Institutionen/SMC - University of Umea, 54 p.
Hyrenius Hannes, Adolfsson Ingemar. 1964. A fertility simulation model. Göteborg, Distributor: Almqvist & Wiksell, 31 p.
Hyrenius Hannes, Adolfsson Ingemar, Holmberg Ingvar. 1966. Demographic models. Göteborg, v. p.
Imhoff Evert Van, Post Wendy. 1997. Methodes de micro-simulation pour des projections de population. Population (French Edition), 52 (4, Nouvelles approaches methodologiques en sciences sociales), p. 889-932.
Kannisto Väinö. 1994. Development of Oldest-Old Mortality, 1950-1990: Evidence from 28 Developed Countries. Odense, Odense University Press. ISBN: 87 7838 015 4.
Keefe Janice, Légaré Jacques, Carrière Yves. 2004. Projecting the future availability of informal support and assessing its impact on home care services, Part I: Demographic projections. Health Canada. Halifax, Mount Saint Vincent University.
Le Bras Hervé. 1973. Parents, grand-parents, bisaïeux. Population, 28 (1), p. 9-38.
Léridon Henri. 1977. Human fertility: the basic components. Chicago, University of Chicago Press, 202 p.
Léridon Henri. 2004. Can ART compensate for the natural decline in fertility with age? A model assessment. Human Reproduction, 19 (7), p. 1548-1553.
McDonald Peter. 2005. Has the fertility rate stopped falling. People and Place, 13 (3), p. 1-5.
O'Donoghue Cathal. 2001. Dynamic Microsimulation: A Methodological Survey. Brazilian journal of economics, 4, p. c 77.
Orcutt Guy H. 1961. Microanalysis of socioeconomic systems; a simulation study. New York, Harper, xviii, 425 p.
Pennec Sophie. 1997. Four-Generation Families in France. Population: An English Selection, 9, p. 75-100.
Ruggles Steven. 1987. Prolonged connections: The rise of the extended family in nineteenth-century England and America. Madison; London, The University of Wisconsin Press, 283 p.
Scott Anne. 2003. Implementation of demographic transitions in the SAGE Model. London, 5, 54 p.
Sheps Mindel C., Menken Jane A., Radick Annette P. 1973. Mathematical models of conception and birth. Chicago, University of Chicago Press, xxiii, 428 p.
Smith James D. 1987. The computer simulation of kin sets and kin counts. In: Bongaarts John, Burch Thomas K., Wachter Kenneth W. Family Demography: Methods and their applications. Oxford, Clarendon Press.
Spielauer Martin, Vencatasawmy Coomaren P. 2003. FAMSIM: dynamic microsimulation of life course interactions between education, work, partnership formation and birth in Austria, Belgium, Italy, Spain and Sweden. In: Vienna yearbook of population research, p. 143-164.
Statistics Canada. 2006. The Lifepaths Microsimulation Model: An Overview.
Tomassini Cecilia, Wolf Douglas A. 2000. Shrinking Kin Networks in Italy Due to Sustained Low Fertility. European Journal of Population, 16 (4), p. 353-372.

Appendix A: Fertility Questions in HILDA