Abstract
We present a novel dynamic microsimulation model that undertakes stochastic transition modelling of a rich set of developmental, economic, social and health outcomes from birth to death for each child in the Millennium Cohort Study (MCS) in England. The model is implemented in R and draws initial conditions from the MCS by re-sampling a population of 100,000 children born in the year 2000, and simulates long-term outcomes using life-stage specific stochastic equations. Our equations are parameterised using effect estimates from existing studies combined with target outcome levels from up-to-date administrative and survey data. We present our baseline projections and a simple validation check against external data from the British Cohort Study 1970 and the Understanding Society survey.
1 Introduction
In this paper we introduce a forward-looking dynamic childhood policy microsimulation model “LifeSim” which models the co-evolution of many economic, social and health outcomes from birth to death for each child in a general population birth cohort of 100,000 English children born in 2000-1. In addition to modelling the individual outcomes, LifeSim also models the costs and savings to the public budget associated with these outcomes. The version of LifeSim presented in this paper focuses on conduct problems, conduct disorder and cognitive skills as the core childhood outcomes, and hence is most useful for analysing childhood policies that drive these outcomes. However, the model can be easily extended to analyse policies with direct effects on a wide range of other childhood outcomes.
Outcomes up to age 14 which include child-specific variables (e.g. the child’s sex, cognitive skills, conduct problems) and family-specific variables (e.g. parental income, parental education, parental mental health) are primarily based on data from the Millennium Cohort Study (MCS). The dynamic year-by-year co-evolution of later life outcomes is modelled using life-stage-specific equations representing stochastic transition pathways. These equations were parameterised using transition pathway estimates from previous peer-reviewed studies based on data representative of the British cohort that we simulate. The structure of our model – a set of causal pathway networks in early childhood, school age, working age and retirement – was designed to align with the large body of theory and knowledge about human capital formation in childhood and later life economic and health outcomes.
Policymakers have indicated a strong need for better childhood policy simulation models (Feinstein, Chowdry and Asmussen, 2017; Allen, 2011; Dalziel, Halliday and Segal, 2015). As a response, LifeSim has several new features that can contribute to more informative childhood policy analysis. More specifically, LifeSim:
jointly models the co-evolution of many economic, social and health outcomes, capturing how outcomes in multiple domains interact, compound and cluster over time, emphasising how early-life disadvantages can compound over life creating a spiral of multiple disadvantage;
simulates long-run outcomes for a whole general population cohort of children, not just one specific subpopulation of trial participants, allowing more informative policy analysis, including optimal policy targeting analysis, population-wide distributional impact analysis and assessment of the opportunity costs falling on the individuals not directly affected by the intervention;
simulates individual-level outcomes for each heterogeneous child in the cohort, instead of only producing average-level outcomes, allowing us to produce multidimensional individual wellbeing measures, which have been discussed in the literature and have well-known advantages over unweighted cost-benefit analysis (Adler and Fleurbaey, 2016);
simulates outcomes over the whole lifecourse from birth to death, enabling policy analysis to adopt a broad lifetime perspective;
is forward-looking and therefore relevant to drawing conclusions about the long-term consequences for cohorts living in the present, rather than historical cohorts born many decades ago that are not as relevant anymore to the current childhood policy context.
The price of all these advantages lies in making numerous strong assumptions in order to combine multiple sources of data. We believe this is a price worth paying to provide decision makers with useful policy insights, and that it is better to make such assumptions explicit and subject to scrutiny rather than to leave them implicit. We use longitudinal data on children born in 2000 as our primary data source but supplement this with many other sources of data including more up-to-date cross-sectional administrative and survey data as well as older sources of longitudinal data on children born in earlier decades. In choosing how many assumptions to make and how many sources of data to use, there are trade-offs between internal and external validity. Using a single source of experimental data with long-term follow-up over many decades would maximise internal validity, but is only possible for backward-looking evaluation of policy experiments many decades ago. Using assumptions and multiple sources of data is necessary to achieve external validity for forward-looking economic appraisal of current policy options in the current policy environment.
Since our model is designed for the purpose of policy analysis rather than forecasting, the most important criteria for model credibility arguably relate to the quality of the underlying conceptual framework and data sources rather than the ability to predict external data sources or future trends (Kopec et al., 2010). Nevertheless, we provide a simple comparison of our simulation with external data. First, we provide a comparison with data from the 1970 Birth Cohort Study up to age 46. We find that our simulation is broadly consistent with the external data where similarity is expected, and substantially divergent where appropriate. For example, our simulation for people born in 2000 has a much lower proportion of people smoking than the 1970 cohort, reflecting the reduction in smoking rates in the UK since the 1970s. Also, our simulation for people born in 2000 has a much larger proportion of young people with a university degree at age 26 than the 1970 cohort at that age, reflecting the massive expansion in university provision in the UK since the 1970s.
We also provide a comparison with a recent external cross-sectional dataset, Understanding Society (in year 2016). Our simulated earnings outcome replicates reasonably well the sex- and age-specific distributions observed in the Understanding Society data. Also, for our key simulated discrete outcomes, including health-related outcomes and unemployment, the sex-specific prevalence trends against age do not deviate much from the trends observed in the Understanding Society data. Any minor discrepancies can be explained by differences in data collection methods between Understanding Society and our target datasets.
Finally, we provide an additional check of LifeSim output against the various target datasets that we directly use to calibrate our equations, such as Health Survey for England, and Office for National Statistics datasets. As expected, our simulated outcomes match very well the trends and patterns observed in the target data. Because our model is flexible and can be used together with many data sources, if needed, one can easily substitute our target datasets with alternative datasets, to match the trends and patterns observed in these alternative sources.
The rest of the paper proceeds as follows. Section 2 outlines the methods. Section 3 summarises our baseline simulation results. Section 4 provides a simple comparison of our simulation with existing datasets. Section 5 discusses and concludes. Additional details can be found in the supplementary appendices.
2 Methods
2.1 Model Structure
To keep things simple, we focus on a single general population birth cohort of 100,000 children, rather than seeking to model the entire all-age population. We draw heavily on a recent longitudinal survey – the Millennium Cohort Study (MCS) of English children born in 2000-2001 – both to describe the initial characteristics of the cohort and to model most of the childhood outcomes up to age 14. After age 14, we model outcomes using equations consistent with our model structure and a set of key principles to combine quasi-experimental evidence and external sources of target data.
The model links together a diverse set of individual-level life outcomes of interest to policymakers (see Figure 1). In choosing the model outcomes and formulating the model structure we consulted with experts in childhood development and childhood policy, demography, epidemiology, human capital economics and labour economics (see list of advisory group members in the acknowledgements) and were also guided by inter-disciplinary theory on human capital formation in childhood and how this influences educational attainment, earnings, physical illness, mental illness, mortality and other outcomes with important impacts on individual wellbeing and public cost (Almond, Currie and Duque, 2018; Goodman et al., 2015; Nelson et al., 2020; Cunha and Heckman, 2010; Adler and Stewart, 2010; O’Donnell, Van Doorslaer and Van Ourti, 2015; Layard et al., 2014; Shonkoff, 2010; Black et al., 2017).
The model structure changes as individuals progress through key life stages. In each life stage, the dependencies between the initial conditions and the life-course outcomes are represented by a model structure diagram (e.g. Figure 2 and Figure 3). Each solid arrow in these diagrams is modelled using equations, as we will explain in Section 2.3.
LifeSim also models variables relevant to the public budget (Figure 4). This includes modelling the public costs over time associated with certain life outcomes, such as conduct disorder, being in prison, mental illness and coronary heart disease, as well as cash benefits paid to people who are in poverty and/or unemployed. It also includes modelling the taxes paid over time on individual earnings and financial gains. These can be aggregated to assess the overall impact on the public budget, as well as cost savings under different policy scenarios and over various time spans. Details of the evidence and assumptions about the unit costs of public services, and of our simple approach to modelling long-run taxes and benefits, are found in Appendix A.
2.2 Childhood Survey Data
We use random draws of individuals from a childhood survey dataset, the MCS, to create distributions of the initial conditions, as well as of the child’s cognitive skills and conduct problems measures, for the simulated cohort of 100,000 individuals (see Table 1 for definitions and summary statistics). We re-sample the MCS dataset by drawing random observations with replacement, using the survey sampling weights.
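The re-sampling step can be illustrated with a minimal sketch. LifeSim itself is implemented in R; the Python below is only an illustration, and the toy records, weights and the function name `resample_cohort` are hypothetical rather than taken from the model code.

```python
import random

def resample_cohort(records, weights, n, seed=0):
    """Draw n records with replacement, with probability proportional to weight."""
    rng = random.Random(seed)
    return rng.choices(records, weights=weights, k=n)

# Hypothetical toy survey: three children with made-up sampling weights.
survey = [
    {"id": "a", "sex": "F", "sdq_conduct": 1},
    {"id": "b", "sex": "M", "sdq_conduct": 4},
    {"id": "c", "sex": "F", "sdq_conduct": 7},
]
survey_weights = [0.5, 1.5, 1.0]

cohort = resample_cohort(survey, survey_weights, n=100_000)

# Child "b" carries half of the total weight, so it should make up
# roughly half of the simulated cohort.
share_b = sum(child["id"] == "b" for child in cohort) / len(cohort)
```

Because draws are made with replacement, the same survey respondent can appear many times in the simulated cohort, which is what allows a cohort of 100,000 to be built from a smaller survey sample.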
We measure conduct problem severity during childhood using the parent-reported Strengths and Difficulties Questionnaire (SDQ) conduct problem subscale score, reported in the MCS. This score ranges from 0-10, with a higher score representing more conduct problems.
We then model each child’s probability of developing a conduct disorder using a more sophisticated predictive algorithm based on a combination of the SDQ conduct problem score and a further parent-reported “behavioural impact” score, which provides a specific probability of conduct disorder based on a classification as either “possible” or “probable” (Goodman et al., 2003; Goodman, Renfrew and Mullick, 2000). This modelled probability is then combined with a random draw from a uniform distribution over 0-1, which allows us to simulate the discrete outcome of whether a child develops a conduct disorder or not.
Our cognitive skills measure is an age-specific common factor extracted from the cognitive skills measures available in MCS, including the British Ability Scales II (for ages 3, 5, 7, 11), Bracken School Readiness Assessment (for age 3), National Foundation for Educational Research (NFER) Progress in Maths (for age 7), Cambridge Neuropsychological Test Automated Battery tests (for ages 11 and 14) and Applied Psychology Unit (for age 14). We extract a common factor for each age where test results are available using principal component analysis, and standardise it to have a mean of 1.00 and a standard deviation of 0.15 (following Jones and Schoon (2008)).
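Turning a modelled probability into a discrete outcome with a uniform draw can be sketched as follows. This is an illustrative Python sketch, not the LifeSim R code; the probability value and the function name `simulate_binary_outcome` are hypothetical, and the comparison rule (outcome occurs when the draw falls below the probability) is the standard approach, assumed here.

```python
import random

def simulate_binary_outcome(probability, rng):
    """Return True with the given probability, using one uniform draw on [0, 1)."""
    return rng.random() < probability

rng = random.Random(42)
p_outcome = 0.3  # hypothetical modelled probability, for illustration only

draws = [simulate_binary_outcome(p_outcome, rng) for _ in range(50_000)]

# Across many simulated individuals the share with the outcome
# approaches the modelled probability.
simulated_prevalence = sum(draws) / len(draws)
```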
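The final standardisation step can be sketched as a simple linear rescaling. The sketch below is illustrative Python, not the LifeSim R code, and it does not show the factor extraction itself. The input scores are made up, `rescale` is a hypothetical name, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import statistics

def rescale(scores, target_mean=1.00, target_sd=0.15):
    """Linearly rescale scores to the target mean and standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD (an assumption)
    return [target_mean + target_sd * (s - mean) / sd for s in scores]

# Hypothetical raw factor scores for five children.
raw_scores = [-1.2, -0.3, 0.0, 0.4, 1.1]
skills = rescale(raw_scores)

rescaled_mean = statistics.mean(skills)
rescaled_sd = statistics.pstdev(skills)
```

The rescaling preserves the ranking of children and the relative distances between them; only the units change.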
More details on the conduct problems and skills measures can be found in Appendix A.
The other simulated characteristics of children and their parents are summarised in Table 1. It should be noted that any other variable of interest that is reported in MCS or can be linked to MCS, can be easily added to LifeSim.
2.3 Modelling Later Life Outcomes
2.3.1 Parameters
To model later life outcomes, we use equations which we: (i) calibrate using target data from observational studies, which describe expected levels of and associations between variables at a point in time; (ii) parameterise using effect estimates, which attempt to draw inferences about the effect of one variable on another variable, either at the same time or at a future point in time. Table 2 summarises the target datasets that we use. Table 3 summarises the determinants of the modelled outcomes, as well as the parameter sources used, if applicable. More details are found in Appendix A.
Our target data comes from the most up-to-date and nationally representative available surveys and administrative records in England. Our effect estimates come from studies based on longitudinal data in a UK context, unless robust estimates are only available from other high-income countries. Where possible, we try to use causal inference studies, e.g. based on quasi-experimental design. Using estimates based on past cohorts of individuals relies on the assumption that historical cohorts are a reliable proxy for the modelled cohort.
2.3.2 Modelling Equations
Most of our equations can be described as one of the following: (i) simple level equations based on target data only; (ii) complex level equations based on target data supplemented with effect estimates; (iii) simple difference equations based on target data only; (iv) complex difference equations based on target data supplemented with effect estimates. We illustrate each below in turn with a simple example.
Level Equations
To model the individual probability of dying, the simplest approach is to use historical mortality rates:
$$P(\text{dead}_{i,age}) = \overline{\text{dead}}_{age,sex,IMD} \tag{1}$$
where $\overline{\text{dead}}_{age,sex,IMD}$ is the mean probability of dying conditional on age, sex and English index of multiple deprivation (IMD) quintile group, calculated using a target dataset such as the Office for National Statistics mortality data (see Table 2). We denote means from a target dataset using an overline.
We can also supplement equation (1) with effect estimates. For example, we may wish to model that coronary heart disease (CHD) increases one’s probability of dying by a certain proportion (denoted by $\beta$). In this case, we use:
$$P(\text{dead}_{i,age}) = \overline{\text{dead}}_{age,sex,IMD} + \beta\,(\text{chd}_{i,age} - \overline{\text{chd}}_{age,sex,IMD})$$
where $\text{chd}_{i,age}$ is the simulated binary outcome of individual $i$ having CHD at a certain age, and $\overline{\text{chd}}_{age,sex,IMD}$ is the mean CHD prevalence given age, sex and IMD quintile group from a target dataset. Notice that we subtract the mean CHD prevalence from the simulated CHD outcome to avoid double counting: the term $\overline{\text{dead}}_{age,sex,IMD}$ is not independent of CHD, but CHD status is not observable in the ONS mortality target dataset, so we cannot directly condition the target mortality mean on CHD status. After multiplying each term in the brackets by the coefficient $\beta$, it can be seen that our approach is equivalent to subtracting the ‘population attributable risk’ from the risk of the simulated individual (Webb, Bain and Page, 2016).
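A small numerical sketch may help show why subtracting the mean prevalence avoids double counting. It assumes an additive adjustment of the target mean by an effect coefficient times the deviation of the simulated CHD status from mean CHD prevalence; this functional form, all the numbers, the clipping to [0, 1] and the function name `death_probability` are illustrative assumptions, not LifeSim parameters or code (which is written in R).

```python
def death_probability(mean_death, beta, chd, mean_chd):
    """Target mean mortality adjusted by beta * (chd - mean CHD prevalence).

    Clipping to [0, 1] is an added safeguard, not described in the text.
    """
    p = mean_death + beta * (chd - mean_chd)
    return min(1.0, max(0.0, p))

# Hypothetical values for one age/sex/IMD cell.
mean_death = 0.02   # target mean probability of dying
mean_chd = 0.10     # target mean CHD prevalence
beta = 0.05         # assumed effect of CHD on the probability of dying

p_with_chd = death_probability(mean_death, beta, chd=1, mean_chd=mean_chd)
p_without_chd = death_probability(mean_death, beta, chd=0, mean_chd=mean_chd)

# Averaging over the cell, weighting by CHD prevalence, recovers the target mean,
# so the CHD effect is not counted twice.
cell_average = mean_chd * p_with_chd + (1 - mean_chd) * p_without_chd
```

Under these assumptions, individuals with CHD face a higher risk and those without a slightly lower one, while the simulated cell average still matches the target mortality rate.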
Difference Equations
If the level of a variable is already known, we can proceed by modelling the evolution of the variable as a difference from the previous time period. For example, when the level of earnings has been established at age 19 (the start of the ‘working years’ life stage), we can model the change in individual earnings during the subsequent periods as:
$$\Delta \text{earnings}_{i,age} = \overline{\Delta \text{earnings}}_{age,sex} \tag{4}$$
where $\Delta \text{earnings}_{i,age} = \text{earnings}_{i,age} - \text{earnings}_{i,age-1}$ is the change in earnings from the previous year, and $\overline{\Delta \text{earnings}}_{age,sex}$ is a trend that governs the changes in earnings over time, calculated from a target dataset on earnings by age and sex.
As with level equations, we can supplement equation (4) with an effect estimate. For example, to model that developing depression reduces earnings by a certain level, represented by $\beta$, we use:
$$\Delta \text{earnings}_{i,age} = \overline{\Delta \text{earnings}}_{age,sex} + \beta\,\Delta \text{depressed}_{i,age}$$
where $\text{depressed}_{i,age}$ is an indicator of an individual having depression at a given age and $\Delta \text{depressed}_{i,age} = \text{depressed}_{i,age} - \text{depressed}_{i,age-1}$.
More details on the modelling equations are found in Appendix A.
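The mechanics of a difference equation with an effect term can be illustrated with a short sketch. It assumes the yearly change in earnings equals a target trend plus a coefficient times the change in depression status; the starting level, trend, coefficient and the function name `earnings_path` are hypothetical, and the sketch is in Python rather than the R used by LifeSim.

```python
def earnings_path(start, trend, depressed, beta):
    """Accumulate earnings year by year from a starting level.

    trend[t] is the target change in earnings in year t (trend[0] is unused);
    depressed[t] is a 0/1 indicator; beta is the assumed effect of depression.
    """
    path = [start]
    for t in range(1, len(depressed)):
        delta_depressed = depressed[t] - depressed[t - 1]
        path.append(path[-1] + trend[t] + beta * delta_depressed)
    return path

# Hypothetical individual followed from age 19 to 23.
depressed = [0, 0, 1, 1, 0]          # onset at age 21, recovery at age 23
trend = [0, 1000, 1000, 1000, 1000]  # made-up yearly earnings trend
path = earnings_path(15000, trend, depressed, beta=-2000)
```

Because the effect enters through the change in depression status, earnings drop once at onset, follow the trend while the episode lasts, and rebound at recovery, so under this assumed form the effect is a level shift rather than a permanent change in growth.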
2.3.3 Wellbeing Summary Measure
Conventional methods of unweighted benefit-cost analysis can be criticised on two important grounds. First, by focusing on unweighted consumption they ignore the well-established concept in economics of diminishing marginal value of consumption; second, they provide no information about the social distribution of costs and benefits and their impact on inequalities (see discussion in Cookson et al. (2020)). There is a large literature on the theoretical and practical shortcomings of unweighted cost-benefit analysis and the advantages of alternative utilitarian and prioritarian approaches to economic evaluation based on explicit individual wellbeing and social welfare functions (Adler and Fleurbaey, 2016).
Our framework generates individual-level outcomes that could be used in many different ways to create summary indices of wellbeing for use in economic evaluation. In our illustrative evaluation we follow Cookson et al. (2020), who propose a simple approach based on the quality-adjusted life year (QALY) concept in health economics but adjusting for consumption as well as health-related quality of life. They refer to this as an “equivalent life” approach (Canning, 2013), and the resulting wellbeing metric as “years of good life” or “wellbeing QALYs”. We represent individual wellbeing in year $t$ by a function $w_t(\cdot)$ increasing in both consumption and health. More specifically,
$$w(\cdot) = \text{health}_{i,age} + u(\text{consumption}_{i,age})$$
where $u(\cdot)$ is a standard isoelastic utility of income function defined as
$$u(c) = A - B\,c^{1-\eta}.$$
The parameter $\eta > 1$ captures diminishing marginal value of income, and $A$ and $B$ are constants which depend on normative parameters: $\eta$ (already mentioned), minimal consumption for a life worth living and standard consumption for a good life. In the current application we set minimal consumption at £1,000 (estimated amount required to buy basic food supplies in the UK for a year) and standard consumption at £24,000 (the mean consumption in the LifeSim simulated cohort), and $\eta = 1.26$ (see Cookson et al. (2016)).
The interpretation is that a good year is a year lived enjoying full health and consuming the equivalent of the average consumption in a rich country. The good-years measure is more informative than conventional monetary measures because it takes into account the notion that one pound of additional consumption is worth substantially more to a poor individual than to a rich individual. Our approach could be used to construct many other multidimensional measures of wellbeing that have been proposed in the literature, including equivalent income measures and measures based on life satisfaction (Adler and Fleurbaey, 2016).
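The wellbeing measure can be sketched numerically under stated assumptions. The sketch takes the utility function to be an affine transformation of consumption raised to the power 1 − η, and calibrates the two constants so that a year in full health at standard consumption scores 1 and a year in full health at minimal consumption scores 0. This calibration rule is our reading of the interpretation above, not a formula quoted from the source, and the Python names are hypothetical (LifeSim itself is written in R).

```python
ETA = 1.26        # diminishing marginal value parameter (from the text)
C_MIN = 1_000.0   # minimal consumption, GBP per year (from the text)
C_STD = 24_000.0  # standard consumption, GBP per year (from the text)

# Assumed calibration: u(C_STD) = 0 and u(C_MIN) = -1.
B = 1.0 / (C_MIN ** (1 - ETA) - C_STD ** (1 - ETA))
A = B * C_STD ** (1 - ETA)

def utility(consumption):
    """Isoelastic utility of consumption, rescaled by the constants A and B."""
    return A - B * consumption ** (1 - ETA)

def wellbeing(health, consumption):
    """Wellbeing in one year: health-related quality of life plus utility."""
    return health + utility(consumption)

good_year = wellbeing(1.0, C_STD)   # full health, standard consumption
bare_year = wellbeing(1.0, C_MIN)   # full health, minimal consumption

# Diminishing marginal value: an extra GBP 1,000 matters more at low consumption.
gain_poor = utility(2_000.0) - utility(1_000.0)
gain_rich = utility(25_000.0) - utility(24_000.0)
```

Under this calibration the same absolute increase in consumption adds far more wellbeing for someone near minimal consumption than for someone at standard consumption, which is the property that distinguishes the measure from unweighted monetary valuation.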
2.4 Computing Methods
LifeSim is implemented in R (tested on R version 3.6.2) using object-oriented programming (requires the R6 and tidyverse packages). The code and related data files, compressed in the zip file ‘LifeSim.zip’, can be extracted and run on a high performance computing (HPC) cluster (Slurm Workload Manager).
When we split the simulation into 500 parts, it takes 28 minutes to run it on the HPC cluster. The simulation can also be run on a standard PC, for any chosen number of individuals.
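Splitting the cohort into independent parts for a cluster run can be sketched as below. This is a hypothetical illustration in Python of how 100,000 individuals might be divided into 500 batches; it is not the LifeSim R code or its Slurm job configuration, and `split_into_parts` is an invented name.

```python
def split_into_parts(n_individuals, n_parts):
    """Split individual indices 0..n-1 into n_parts contiguous, near-equal ranges."""
    base, extra = divmod(n_individuals, n_parts)
    parts, start = [], 0
    for k in range(n_parts):
        size = base + (1 if k < extra else 0)  # spread any remainder over the first parts
        parts.append(range(start, start + size))
        start += size
    return parts

parts = split_into_parts(100_000, 500)
first_part_size = len(parts[0])  # number of individuals simulated by one job
```

Since each simulated individual evolves independently of the others in the model described here, each part can be run as a separate job and the outputs concatenated afterwards.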
The current code is written in a ‘user-friendly’ object-oriented way, making it easy to add additional variables of interest. However, it would be possible to speed up the code by vectorising the simulation, at the cost of making it less user-friendly.
3 Baseline Results
In this section we show our baseline simulation results, and demonstrate some formats in which they can be analysed.
Table 4 provides key summary statistics for the simulated outcomes, including child outcomes, adult outcomes and final wellbeing outcomes. We show means, standard deviations, and the minimum and maximum value of each outcome in the total distribution of the simulated individuals. Table 4 does not present summary statistics for the initial conditions, or for the child’s cognitive skills and conduct problems measures that we obtain from the childhood survey dataset (MCS), as these variables have already been summarised in Table 1.
Approximately 9% of 18-year-old adults develop conduct disorder in the LifeSim simulation. This estimate falls within the range of 1-10% commonly reported in the epidemiological literature on conduct disorder (see the review in Hinshaw and Lee (2003), and also Patel et al. (2018)). Our estimate, however, slightly exceeds the 8% of young men and 5% of young women with conduct disorder estimated by the Mental Health of Children and Young People in England survey in 2017. This small difference may arise because the algorithm by Goodman et al. (2003) and Goodman, Renfrew and Mullick (2000) that we use to simulate conduct disorder incidence was developed and validated on samples of children attending child mental health clinics, and it may overestimate the actual conduct disorder prevalence in the general population. On the other hand, conduct disorder diagnosis in the clinic sample can be argued to be more precise and sensitive than in the survey sample, because in the clinic sample diagnosis was made by mental health specialists using detailed information on symptoms and resultant impairments gathered from multiple informants, whereas in the survey sample diagnosis was based on a single tool, the Development and Well-Being Assessment. Overall, we consider this difference in conduct disorder prevalence rates to be small, and our estimate is consistent with the more general findings in the literature and with the concern that conduct disorder prevalence is often under-estimated in survey data.
Figure 5 shows the simulated distributions of some core outcomes, including the distribution of lifetime wellbeing (measured using the approach by Cookson et al. (2016) described in Section 2.3.3).
Table 5 shows the average costs to the public budget associated with certain outcomes, cash benefits paid to people who are in poverty or unemployed, as well as taxes on earnings and financial gains. These are calculated over various time intervals over the life-course, and separately for the general population, and then for people born in the lowest and top income quintile groups at birth.
Table 6 provides two summary measures of inequality, based on differences in lifetime expected wellbeing between best off and worst off groups defined by the following early childhood circumstances: sex, parental income quintile group (poorest vs. richest 20%), parental mental health, parental education, and high baseline conduct problems (SDQ conduct problem score at age 5 equal to 7 or above). Our “extreme best off group” focuses on individuals in the most advantaged category of all four main markers of social disadvantage in early life (top 20% parental income, high parental education, no parental mental illness, no high baseline conduct problems). Our “best off 20% group” focuses on the best off 20% of individuals in terms of predicted lifetime wellbeing based on all four main markers of social disadvantage in early life.
4 Comparison With Other Datasets
4.1 Comparison With 1970 Birth Cohort Study
Table 7 compares the LifeSim predictions with data from the 1970 Birth Cohort Study (BCS70) at ages 26, 29, 42 and 46, as a simple validation check. We list the number of observations, means and standard deviations of the LifeSim variables for children born in the year 2000 and of the BCS70 variables representing the same outcomes for children born in the year 1970. For each outcome, we quantify the difference between the LifeSim distribution and the BCS70 distribution in terms of the absolute difference in their means and standard deviations.
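The comparison metric described above can be sketched in a few lines. The values below are made up purely to show the calculation and are not LifeSim or BCS70 data; the function name `compare_distributions` is hypothetical, the use of the sample standard deviation is an assumption, and the sketch is in Python rather than R.

```python
import statistics

def compare_distributions(simulated, external):
    """Absolute differences in means and in (sample) standard deviations."""
    return {
        "mean_diff": abs(statistics.mean(simulated) - statistics.mean(external)),
        "sd_diff": abs(statistics.stdev(simulated) - statistics.stdev(external)),
    }

# Made-up values of one outcome in the simulated and external samples.
simulated_outcome = [2.0, 4.0, 6.0]
external_outcome = [1.0, 5.0, 9.0]

diff = compare_distributions(simulated_outcome, external_outcome)
```

These two numbers summarise location and spread only; differences in the shape of the distributions, such as a longer upper tail, would not show up in them.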
We would expect some adult outcomes to be similar (e.g. health) but others to be substantially different (e.g. earnings, rates of smoking and university education), and so this can be seen as a simple validation check to ensure that our model provides broadly similar findings in the same ballpark where appropriate, and substantially different findings where we know different generations had very different experiences e.g. smoking. Nevertheless, most variables do not deviate substantially from the same quantities characterising the cohort born in 1970.
One exception already mentioned is smoking, which is expected and can be explained by the change in smoking rates over time. Another exception is education: the proportion of people under 30 years old with a degree is much higher in the LifeSim cohort. This can be explained by the change in higher education participation rates over time, and increased equality between the genders in the cohort born in 2000. Over time the 1970 cohort partially catches up with the LifeSim cohort by obtaining qualifications at a later age: at age 46 the proportion of people with a university degree is more similar in the two samples than at age 26. Finally, LifeSim earnings at all ages exceed, on average, the earnings of the 1970 cohort. This can be explained by cohort effects, such as general differences in economy, society, culture and politics experienced by the two cohorts.
4.2 Comparison With Recent Cross-Sectional Data
To avoid such general cohort effects which arise when comparing two generations born 30 years apart, we also carry out a simple validity check using more recent cross-sectional datasets. More specifically, we compare our age-specific LifeSim outcomes with age-specific outcomes in cross-sectional data.
Figure 6 compares the age-earnings profile for males and females in the LifeSim simulation with that in our target dataset, the ONS Annual Survey of Hours and Earnings in year 2015, and in the Understanding Society survey in year 2015. The concave trend with age, with earnings initially increasing and then decreasing, is very similar in the three datasets.
Figure 7 compares the earnings distributions by sex and different age groups in the LifeSim cohort and the Understanding Society data. Both distributions have similar medians for the different sex-age groups, and also become more uniform with increasing age. One issue left to be addressed as part of future work is modelling of the relatively longer right hand side tail which can be observed for the Understanding Society data and not for the LifeSim data. This tail represents the highest-earning people in the distribution, and the LifeSim earnings output does not have this tail, as we do not model the outcome of being employed in extremely-high earning jobs. Addressing this feature in LifeSim would require modelling the link with variables in early life that would lead to such extremely-high earning states.
In Figure 8 we compare the prevalence of the different discrete outcomes in LifeSim cohort, and in our corresponding target datasets, which include Health Survey for England for the health-related outcomes, ONS Labour Force Survey for unemployment and Department for Education estimates for participation in higher education.
The simulated outcomes match the target data well, but there is some small discrepancy with the Understanding Society data, which can be explained by differences in how data on similar outcomes are collected across different surveys.
5 Discussion
We present LifeSim – a novel microsimulation model for analysing the long-term consequences of childhood policies. Unlike previous models, LifeSim is capable of modelling a rich set of developmental, social, economic and health outcomes from birth to death for each child in a general population birth cohort of 100,000 English children born in the year 2000-1.
The main strength of our model is that it captures the dynamic individual-level interaction between many outcomes across the social, economic and health domains over the entire lifecourse. Previous models have modelled either only a few individual-level outcomes over part of the life-course, or looked at aggregate-level outcomes only. Simultaneously analysing many outcomes is more informative as it allows capturing how many early life disadvantages can compound over life creating a spiral of multiple disadvantage.
Another strength of LifeSim is that it simulates the long-run outcomes for a whole general population cohort of children, not just a narrow group of trial participants, which allows carrying out more complex and policy-relevant analysis.
Our model is forward-looking, which allows analysis of the long-term consequences of childhood policy for cohorts born now, rather than analysis of past policies affecting historical cohorts, which are less relevant to the current childhood policy context.
LifeSim generates long-term individual-level data, which makes it compatible with applying new multidimensional summary indices of wellbeing recently proposed in the theoretical literature (Cookson et al., 2020; O’Donnell et al., 2014; Fleurbaey et al., 2013; Fleurbaey and Schokkaert, 2013). These indices are more informative than conventional monetary valuation based on aggregate outcomes, as they allow us to account for the diminishing marginal value of consumption and other sources of heterogeneity in the marginal value of different life outcomes to different individuals. However, application of these indices in practice requires individual-level long-term time series data on many outcomes across the health, social and economic outcome domains. Such rich long-term data are difficult to obtain from existing datasets, especially if we are interested in analysing cohorts living in the present rather than historical cohorts of people born decades ago. Models such as LifeSim can combine the many data sources to extrapolate the required individual-level long-term outcomes.
LifeSim can easily be extended to incorporate additional features. One extension would be to incorporate more outcomes. Our model includes many different categories of human capital (e.g. cognitive skills, social skills, educational attainment, health, employment) but within each category, more nuanced distinctions could be made. Health outcomes are modelled using just three binary variables: mental illness (depression), physical illness (CHD) and mortality; educational outcomes focus only on gaining a university degree; employment outcomes focus only on unemployment, not precarious employment; and our modelling of the tax and benefit system and retirement savings is extremely stylised. Similarly, more individual-level factors could be included (e.g. ethnicity), more family-level factors (e.g. child abuse) and more neighbourhood-level factors (e.g. air quality).
Also, our tax benefit modelling is highly stylised and could be improved by incorporating a standard static tax benefit calculator, such as Euromod (Sutherland and Figari, 2013).
Another extension would be to produce a more joined-up set of transition probability estimates by conducting a comprehensive re-analysis of longitudinal data, rather than piecing together estimates from existing peer-reviewed studies, as set out in detail in Appendix A. Specific transition pathway estimates could also be modified in specific cases to strengthen external validity for specific populations. For example, estimates based on long-term outcomes for mostly white children born in the 1970s may not be applicable to Asian British populations. Using external data sources to estimate long-run health effects for Asian British populations would produce more applicable estimates for those populations. Another extension would be to re-calibrate our model to other populations – e.g. the UK in 2025, or England or Scotland, or a sub-national area of England – by updating the initial conditions of the birth population and the external macro target data on average population level outcomes and associations within that birth population in subsequent years.
Finally, our model structure could also be extended in more fundamental ways: for example, to model the all-age population rather than just a birth cohort, to model the dynamics of family formation and dissolution and spillover effects on other family members, and to model parental investment choices and other behavioural responses.
Considering such extensions involves trade-offs between model complexity and tractability, and in some cases it may be preferable to use other, more specialist models and combine the findings from different models rather than expand an existing model. For example, as already mentioned, our model could be combined with Euromod (Sutherland and Figari, 2013), the tax and benefit microsimulation model, to generate more comprehensive output on taxes and benefits for assessing the consequences for the public budget.
Overall, LifeSim is a flexible and policy-relevant model that can readily be used for long-term childhood policy analysis. New variables of interest, in particular childhood variables, can be easily incorporated within LifeSim, and the input datasets can be updated as required.
Data Availability
All the data referred to in the manuscript are publicly available.
Acknowledgements
We would first like to thank the members of our advisory group: Annalisa Belloni, Sarah Cattan, Leon Feinstein, Paul Frijters, Peter Goldblatt, Heather Joshi, Catherine Law, Lara McClure and Christine Power.
For useful comments we are also grateful to Shehzad Ali, Mark Ashworth, Karen Bloor, Laura Bojke, Eva Maria Bonin, Jonathan Bradshaw, Penny Breeze, Alan Brennan, Eric Brunner, Tracey Bywater, Simon Capewell, Maria Guzman Castillo, Bette Chambers, Brendan Collins, Gabriella Conti, Peter Diggle, Tim Doran, Susan Griffin, Nils Gutacker, James Heckman, Bruce Hollingsworth, Andrew Jones, Noemi Kreif, Christodoulos Kypridemos, Richard Mattock, Cheti Nicoletti, Martin O’Flaherty, Kate Pickett, George Ploubidis, Gerry Richardson, Jemimah Ride, Matthew Robson, Tracey Sach, Filipa Sampaio, Trevor Sheldon, Tushar Srivastava, Mark Strong, David Taylor-Robinson, Valentina Tonei, Aki Tsuchiya, Simon Walker, Margaret Whitehead and Mark Mon Williams.
The errors and opinions expressed in this paper are our own.
Footnotes
This is independent research supported by the National Institute for Health Research (SRF-2013-06-015) and the Wellcome Trust (Grant No. 205427/Z/16/Z). The authors have no other conflicts of interest to report. The views expressed in this publication are those of the authors and not necessarily those of the National Institute for Health Research, the Wellcome Trust, the NHS or the Department of Health and Social Care.
1 Internal validity relates to claims about cause and effect within the study population, whereas external validity relates to how applicable the findings are to real-world policy settings.
2 This equation and other equations in this section are simplified examples of the actual equations that we use; see Appendix A for the full mortality equation and the other equations that we use.
4 Standard estimates of gaps in healthy life expectancy by current socioeconomic status are substantially larger than our estimate of gaps by childhood circumstance, due to dynamic interdependence between health and social status over the lifecourse.