Health Communication Through News Media During the Early Stage of the COVID-19 Outbreak in China: Digital Topic Modeling Approach

J Med Internet Res. 2020 Apr 28;22(4):e19118. doi: 10.2196/19118.

Abstract

Background: In December 2019, the first few coronavirus disease (COVID-19) cases were reported in Wuhan, Hubei, China. Soon after, increasing numbers of cases were detected in other parts of China, eventually leading to a disease outbreak in China. As the disease has spread rapidly, the mass media have been active in community education on COVID-19, delivering health information about this novel coronavirus, such as its pathogenesis, spread, prevention, and containment.

Objective: The aim of this study was to collect media reports on COVID-19 and investigate the patterns of media-directed health communications as well as the role of the media in this ongoing COVID-19 crisis in China.

Methods: We used the WiseSearch database to retrieve news articles about the coronavirus published by major press media between January 1, 2020, and February 20, 2020. We then sorted and analyzed the data using Python and the Python package Jieba. We selected a suitable number of topics based on the coherence value. We ran latent Dirichlet allocation topic modeling with that number of topics and generated the corresponding keywords and topic names. We then grouped these topics into themes by plotting them on a 2D plane via multidimensional scaling.
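
The abstract does not state which coherence measure was used or how it was computed. As an illustration only, the sketch below uses UMass coherence, which scores a topic by how often its top keywords co-occur in the same documents, and then picks the candidate topic number whose topics have the highest mean coherence. The toy documents (English tokens standing in for Jieba output), the candidate keyword lists, and the function names `umass_coherence` and `pick_topic_number` are all hypothetical; in a real pipeline the keyword lists would come from a latent Dirichlet allocation model fitted once per candidate topic number.

```python
import math


def umass_coherence(top_words, docs):
    """Mean UMass coherence of one topic's top words.

    For each ordered pair (w_i, w_j) with j < i, the score is
    log((D(w_i, w_j) + 1) / D(w_j)), where D counts documents
    containing all the given words. Higher means more coherent.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = doc_freq(top_words[j])
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            d_ij = doc_freq(top_words[i], top_words[j])
            scores.append(math.log((d_ij + 1) / d_j))
    return sum(scores) / len(scores) if scores else 0.0


def pick_topic_number(candidates, docs):
    """Return the topic number whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, docs) for t in topics) / len(topics)

    return max(candidates, key=lambda k: mean_coherence(candidates[k]))


# Hypothetical tokenized documents (assumption: not data from the study).
docs = [
    ["mask", "prevention", "control"],
    ["mask", "prevention", "hospital"],
    ["vaccine", "research", "hospital"],
    ["vaccine", "research", "treatment"],
    ["prevention", "control", "mask"],
]

# Hypothetical top keywords per topic for two candidate topic numbers.
candidates = {
    2: [["mask", "prevention", "control"],
        ["vaccine", "research", "hospital"]],
    3: [["mask", "vaccine", "treatment"],
        ["prevention", "research", "control"],
        ["hospital", "mask", "research"]],
}

best_k = pick_topic_number(candidates, docs)
print("selected number of topics:", best_k)
```

On this toy corpus the two-topic candidate wins because its keyword lists group words that actually appear together, while the three-topic candidate mixes words from unrelated documents. The same selection logic would apply to any other coherence measure.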

Results: After removing duplicates and irrelevant reports, our search identified 7791 relevant news reports. We listed the number of articles published per day. According to the coherence value, we chose 20 as the number of topics and generated the topics' themes and keywords. These topics were categorized into nine primary themes based on the topic visualization figure. The three most popular themes were prevention and control procedures, medical treatment and research, and global or local social and economic influences, accounting for 32.57% (n=2538), 16.08% (n=1258), and 11.79% (n=919) of the collected reports, respectively.

Conclusions: Topic modeling of news articles can produce useful information about the significance of mass media for early health communication. Comparing the number of articles published each day with the development of the outbreak, we noted that mass media news reports in China lagged behind the development of COVID-19. The major themes accounted for around half the content and tended to focus on the larger society rather than on individuals. The COVID-19 crisis has become a worldwide issue, and society has become concerned about donations and support as well as mental health, among other issues. We recommend that future work address the mass media's actual impact on readers during the COVID-19 crisis through sentiment analysis of news data.

Keywords: COVID-19; coronavirus; health communication; mass media; outbreak; public crisis; topic modeling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Betacoronavirus* / pathogenicity
  • COVID-19
  • China / epidemiology
  • Coronavirus Infections / epidemiology*
  • Coronavirus Infections / prevention & control
  • Coronavirus Infections / transmission
  • Coronavirus Infections / virology*
  • Disease Outbreaks* / prevention & control
  • Disease Outbreaks* / statistics & numerical data
  • Health Communication*
  • Humans
  • Mass Media / statistics & numerical data*
  • Mental Health
  • Pandemics / prevention & control
  • Pneumonia, Viral / epidemiology*
  • Pneumonia, Viral / prevention & control
  • Pneumonia, Viral / transmission
  • Pneumonia, Viral / virology*
  • Public Opinion
  • SARS-CoV-2