TY  - JOUR
T1  - Comparing methods to predict baseline mortality for excess mortality calculations – unravelling ‘the German puzzle’ and its implications for spline-regression
JF  - medRxiv
DO  - 10.1101/2022.07.18.22277746
SP  - 2022.07.18.22277746
AU  - Tamás Ferenci
Y1  - 2022/01/01
UR  - http://medrxiv.org/content/early/2022/07/19/2022.07.18.22277746.abstract
N2  - Introduction: The World Health Organization presented global excess mortality estimates for 2020 and 2021 on May 5, 2022, almost immediately stirring controversy, one point of which was the suspiciously high estimate for Germany. Later analysis revealed that the reason for this, in addition to a data preparation issue, was the nature of the spline model underlying the WHO’s method for excess mortality estimation. This paper aims to reproduce the problem using synthetic datasets, thus allowing the investigation of its sensitivity to parameters, both of the mortality curve and of the method used, thereby shedding light on the conditions that gave rise to this error and on its possible remedies. Material and Methods: A negative binomial model was used with constant overdispersion and a mean composed of three terms: a long-term change (modelled with a quadratic trend), deterministic seasonality (modelled with a single harmonic term), and random additional peaks during the winter (flu season) and during the summer (heat waves). Simulated mortality curves from this model were then analyzed with naive methods (simple mean of the latest few years, simple linear trend projection from the latest few years), with the WHO’s method, and with the method of Acosta and Irizarry. Four years of forecasts were compared with actual data, and mean squared error (MSE), mean absolute percentage error (MAPE) and bias were calculated.
Parameters of the simulation and parameters of the methods were varied. Results: Capturing only these three characteristics of the mortality time series allowed the robust reproduction of the phenomenon underlying the WHO’s results. Using 2015 as the starting year, as in the WHO’s study, results in very poor performance for the WHO’s method, clearly revealing the problem, as even simple linear extrapolation was better. However, the Acosta-Irizarry method substantially outperformed the WHO’s method despite also being based on splines. In certain (but not all) scenarios, errors were substantially affected by the parameters of the mortality curve, but the ordering of the methods was very stable. Results were highly dependent on the parameters of the estimation procedure: even the WHO’s method produced much better results if the starting year was earlier or if the basis dimension was lower. Conversely, the Acosta-Irizarry method can generate poor forecasts if the number of knots is increased. Linear extrapolation could produce very good results, but it is highly dependent on the choice of the starting year, while the simple average was the worst in almost all cases. Discussion and conclusion: The WHO’s method is highly dependent on the choice of parameters and is almost always dominated by the Acosta-Irizarry method. Linear extrapolation could be better, but it is highly dependent on the choice of the starting year; in contrast, the Acosta-Irizarry method exhibits relatively stable performance irrespective of the starting year. The average method is almost always the worst, except in very special circumstances. This shows that splines are not inherently unsuitable for predicting baseline mortality, but care should be taken; in particular, these results suggest that the key issue is that the structure of the splines should be rigid.
No matter what approach or parametrization is used, model diagnostics must be performed before accepting the results, and the methods used should preferably be validated with extensive simulations on synthetic datasets. Further research is warranted to understand how these results can be generalized to other scenarios. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This study did not receive any funding. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data produced are available online at https://github.com/tamas-ferenci/MortalityPrediction.
ER  - 