TY  - JOUR
T1  - The second wave of SARS-CoV-2 infections and COVID-19 deaths in Germany – driven by values, social status and migration background? A county-scale explainable machine learning approach
JF  - medRxiv
DO  - 10.1101/2021.04.14.21255474
SP  - 2021.04.14.21255474
AU  - Gabriele Doblhammer
AU  - Constantin Reinke
AU  - Daniel Kreft
Y1  - 2021/01/01
UR  - http://medrxiv.org/content/early/2021/04/19/2021.04.14.21255474.abstract
N2  - There is a general consensus that SARS-CoV-2 infections and COVID-19 deaths have hit lower social groups the hardest; however, for Germany, individual-level information on the socioeconomic characteristics of infections and deaths does not exist. The aim of this study was to identify the key features explaining SARS-CoV-2 infections and COVID-19 deaths during the upswing of the second wave in Germany. We considered information on COVID-19 diagnoses and deaths from 1 October to 15 December 2020 at the county level, differentiating five two-week periods. We used 155 indicators to characterize counties in nine geographic, social, demographic, and health domains. For each period, we calculated directly age-standardized COVID-19 incidence and death rates at the county level. We trained gradient boosting models to predict the incidence and death rates from the 155 county characteristics for each period. To explore the importance and the direction of the correlation of the regional indicators, we used the SHAP procedure. We categorized the top 20 associations identified by the Shapley values into twelve categories depicting the correlation between the feature and the outcome. We found that counties with low socioeconomic status (SES) were important drivers in the second wave, as were those with high international migration, a high proportion of foreigners, and a large nursing home population. 
During the period of intense exponential increase in infections, the proportion of the population that voted for the Alternative for Germany (AfD) party in the last federal election was among the top characteristics correlated with high incidence and death rates. We concluded that risky working conditions with reduced opportunities for social distancing and a high chronic disease burden put populations in low-SES counties at higher risk of SARS-CoV-2 infections and COVID-19 deaths. In addition, noncompliance with coronavirus containment measures and spill-over effects from neighbouring counties increased the spread of the virus. To further substantiate this finding, we urgently need more data at the individual level. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: No clinical trial. Funding Statement: No external funding was received. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained: Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: No ethics approval was required because of the use of aggregate data. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. 
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. The following datasets were derived from sources in the public domain: Robert Koch Institute, ESRI. RKI COVID19. dl-de/by-2-0 (https://npgeo-corona-npgeo-de.hub.arcgis.com/datasets). Statistical Offices of the Federation and the Laender. Regional database (https://www.regionalstatistik.de/genesis). DESTATIS Census 2011: Census database (https://ergebnisse.zensus2011.de). INKAR Database: Federal Institute for Research on Building, Urban Affairs, and Spatial Development. INKAR - Indikatoren und Karten zur Raum- und Stadtentwicklung 2020 (https://www.inkar.de/). European Centre for Disease Prevention and Control: Download historical data (to 14 December 2020) on the daily number of new reported COVID-19 cases and deaths worldwide (https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide). The following data are available on request from the data holder: Emission data: German Environment Agency Database (UBA) (https://www.umweltbundesamt.de/en).
ER  -