RT Journal Article SR Electronic T1 COVID-19 Asymptomatic Infection Estimation JF medRxiv FD Cold Spring Harbor Laboratory Press SP 2020.04.19.20068072 DO 10.1101/2020.04.19.20068072 A1 Yang Yu A1 Yu-Ren Liu A1 Fan-Ming Luo A1 Wei-Wei Tu A1 De-Chuan Zhan A1 Guo Yu A1 Zhi-Hua Zhou YR 2020 UL http://medrxiv.org/content/early/2020/04/23/2020.04.19.20068072.abstract AB Background: Mounting evidence suggests that there is an undetected pool of asymptomatic but infectious COVID-19 cases. Estimating the number of asymptomatic infections is crucial to understanding the virus and containing its spread, yet these infections are hard to count accurately. Methods: We propose a machine learning based fine-grained simulator (MLSim), which integrates multiple practical factors including disease progress in the incubation period, cross-region population movement, undetected asymptomatic patients, and prevention and containment strength. The interactions among these factors are modeled by virtual transmission dynamics with several undetermined parameters, which are determined from epidemic data by machine learning techniques. When MLSim learns to match the real data closely, it also models the number of asymptomatic patients. MLSim is learned from open Chinese and global epidemic data. Findings: MLSim showed better forecast accuracy than the SEIR and LSTM-based prediction models. The MLSim learned from the data of China’s mainland reveals that there could have been 150,408 (142,178–157,417) asymptomatic, self-healed patients, which is 65% (64%–65%) of the inferred total infections including undetected ones. The numbers of asymptomatic but infectious patients on April 15, 2020, were inferred as follows: Italy 41,387 (29,037–57,151), Germany 21,118 (11,484–41,646), USA 354,657 (277,641–495,128), France 40,379 (10,807–186,878), and UK 144,424 (127,215–171,930). To control the virus transmission, the containment measures taken by the government were crucial.
The learned MLSim also reveals that if the containment measures in China’s mainland had been postponed by 1, 3, 5, or 7 days beyond Jan. 23, there would have been 109,039 (129%), 183,930 (218%), 313,342 (371%), or 537,555 (637%) confirmed cases, respectively, on June 12. Conclusions: Machine learning based fine-grained simulators can better model the complex real-world disease transmission process, and thus can support decision-making on balanced containment measures. The simulator also revealed a potentially large number of undetected asymptomatic infections, which poses a great risk to virus containment. Funding: National Natural Science Foundation of China. Competing Interest Statement: The authors have declared no competing interest. Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript: Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. All data used in this paper are publicly available.