TY - JOUR
T1 - Health Communication Through News Media During the Early Stage of the COVID-19 Outbreak in China: A Digital Topic Modeling Approach
JF - medRxiv
DO - 10.1101/2020.03.29.20043547
SP - 2020.03.29.20043547
AU - Qian Liu
AU - Zequan Zheng
AU - Jiabin Zheng
AU - Qiuyi Chen
AU - Guan Liu
AU - Sihan Chen
AU - Bojia Chu
AU - Hongyu Zhu
AU - Babatunde Akinwunmi
AU - Jian Huang
AU - Casper J. P. Zhang
AU - Wai-kit Ming
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/03/31/2020.03.29.20043547.abstract
N2 - Background: In December 2019, the first COVID-19 cases were reported, and the disease soon broke out. As this dreadful disease spreads rapidly, the mass media has been active in community education on COVID-19 by delivering health information about this novel coronavirus. Methods: We used the Huike database to extract news articles about coronavirus published by major press media from January 1, 2020, to February 20, 2020. The data were sorted and analyzed using Python and the Python package Jieba. We sought a suitable number of topics using the coherence value. We ran Latent Dirichlet Allocation (LDA) topic modeling with that number of topics and generated the corresponding keywords and topic names. We divided these topics into themes by plotting them onto a two-dimensional plane via multidimensional scaling. Findings: After removing duplicates, 7791 relevant news reports were identified. We tabulated the number of articles published per day. Based on the coherence value, we chose 20 as the number of topics and obtained their names and keywords. These topics were categorized into nine primary themes based on the topic visualization figure.
The three most prominent themes were prevention and control procedures, medical treatment and research, and global/local social/economic influences, accounting for 32·6%, 16·6%, and 11·8% of the collected reports, respectively. Interpretation: Chinese mass media news reports lagged behind the development of the COVID-19 outbreak. The major themes accounted for around half the content and tended to focus on the larger society rather than on individuals. The COVID-19 crisis has become a global issue, and society has also become concerned about donation and support as well as mental health. We recommend that future work address the mass media's actual impact on readers during the COVID-19 crisis through sentiment analysis of news data. Funding: National Social Science Foundation of China (18CXW021). Evidence before this study: News reports related to the novel coronavirus have engaged public attention in China during the COVID-19 crisis. Topic modeling of these news articles can produce useful information about the significance of mass media for early health communication. We searched the Huike database, the most professional Chinese media content database, using the search term “coronavirus” for related news articles published from January 1, 2020, to February 20, 2020. We found that these articles can be classified into different themes according to their emphasis; however, we found no other studies that applied topic modeling to them. Added value of this study: To our knowledge, this study is the first to use topic modeling to investigate the patterns of health communication through the media, and the role the media have played and are still playing, in the current COVID-19 crisis in China. We compared the number of articles published each day with the development of the outbreak and identified a delay in Chinese mass media reporting of the outbreak's progression.
We identified nine main themes across the 7791 collected news reports and detailed the emphasis of each. Implications of all the available evidence: Our results show that mass media news reports play a significant role in health communication during the COVID-19 crisis; governments can strengthen reporting and broaden news coverage the next time a disease outbreak strikes. Sentiment analysis of news data is needed to assess the actual effect of the news reports. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: This study was supported by the National Social Science Foundation of China (18CXW021). Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript: Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. Data are available from existing online repositories that are listed in the manuscript.
ER -