medRxiv

Evaluating Progress in Automatic Chest X-Ray Radiology Report Generation

Feiyang Yu, Mark Endo, Rayan Krishnan, Ian Pan, Andy Tsai, Eduardo Pontes Reis, Eduardo Kaiser Ururahy Nunes Fonseca, Henrique Min Ho Lee, Zahra Shakeri Hossein Abad, Andrew Y. Ng, Curtis P. Langlotz, Vasantha Kumar Venugopal, Pranav Rajpurkar
doi: https://doi.org/10.1101/2022.08.30.22279318
Feiyang Yu
1 Department of Computer Science, Stanford University, Stanford, United States
Mark Endo
1 Department of Computer Science, Stanford University, Stanford, United States
Rayan Krishnan
1 Department of Computer Science, Stanford University, Stanford, United States
Ian Pan
2 Department of Radiology, Brigham and Women’s Hospital, Boston, Massachusetts
Andy Tsai
3 Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, United States
Eduardo Pontes Reis
4 Hospital Israelita Albert Einstein, São Paulo, Brazil
Eduardo Kaiser Ururahy Nunes Fonseca
4 Hospital Israelita Albert Einstein, São Paulo, Brazil
Henrique Min Ho Lee
4 Hospital Israelita Albert Einstein, São Paulo, Brazil
Zahra Shakeri Hossein Abad
5 Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
Andrew Y. Ng
1 Department of Computer Science, Stanford University, Stanford, United States
Curtis P. Langlotz
6 AIMI Center, Stanford University, Stanford, United States
Vasantha Kumar Venugopal
7 CARPL.ai, New Delhi, India
Pranav Rajpurkar
8 Department of Biomedical Informatics, Harvard Medical School, Boston, United States
For correspondence: pranav_rajpurkar@hms.harvard.edu

Abstract

The application of artificial intelligence (AI) to medical image interpretation tasks has largely been limited to the identification of a handful of individual pathologies. In contrast, the generation of complete narrative radiology reports more closely matches how radiologists communicate diagnostic information in clinical workflows. Recent progress in AI on vision-language tasks has made it possible to generate high-quality radiology reports from medical images. Automated metrics for evaluating the quality of generated reports attempt to capture overlap in the language or clinical entities between a machine-generated report and a radiologist-generated report. In this study, we quantitatively examine the correlation between automated metrics and the scoring of reports by radiologists. We analyze failure modes of the metrics, namely the types of information the metrics do not capture, to understand when to choose particular metrics and how to interpret metric scores. We propose a composite metric, called RadCliQ, that we find ranks the quality of reports similarly to radiologists and better than existing metrics. Lastly, we measure the performance of state-of-the-art report generation approaches using the investigated metrics. We expect that our work can guide both the evaluation and the development of report generation systems that generate reports from medical images at a level approaching that of radiologists.
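
To make the idea of correlating an automated metric with radiologist scoring concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes a rank correlation (Kendall's tau-b) as the measure of agreement, which the abstract does not specify, and it assumes radiologist scoring takes the form of an error count per report. The function name `kendall_tau_b` and all numbers are hypothetical and purely illustrative.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to no term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        raise ValueError("tau is undefined when a sequence is constant")
    return (concordant - discordant) / denom


# Hypothetical data: one entry per generated report.
# Assumption: radiologists assign an error count (higher = worse report).
radiologist_errors = [1, 4, 2, 0, 5]
metric_a_scores = [0.82, 0.40, 0.65, 0.91, 0.30]  # higher = better (made up)
metric_b_scores = [0.50, 0.55, 0.45, 0.60, 0.52]  # higher = better (made up)

tau_a = kendall_tau_b(metric_a_scores, radiologist_errors)
tau_b = kendall_tau_b(metric_b_scores, radiologist_errors)
print(f"metric A tau: {tau_a:.2f}")
print(f"metric B tau: {tau_b:.2f}")
```

Because a higher metric score is meant to indicate a better report while a higher error count indicates a worse one, a metric that agrees with radiologists yields a tau close to -1 under these assumptions. In this toy data, metric A orders the reports exactly as the radiologists do, whereas metric B agrees only weakly.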

Competing Interest Statement

The Authors declare no Competing Non-Financial Interests but the following Competing Financial Interests: I.P. is a consultant for MD.ai and Diagnosticos da America (Dasa). C.P.L. serves on the board of directors and is a shareholder of Bunkerhill Health. He is an advisor and option holder for GalileoCDS, Sirona Medical, Adra, and Kheiron. He is an advisor to Sixth Street and an option holder in whiterabbit.ai. His research program has received grant or gift support from Carestream, Clairity, GE Healthcare, Google Cloud, IBM, IDEXX, Hospital Israelita Albert Einstein, Kheiron, Lambda, Lunit, Microsoft, Nightingale Open Science, Nines, Philips, Subtle Medical, VinBrain, Whiterabbit.ai, the Paustenbach Fund, the Lowenstein Foundation, and the Gordon and Betty Moore Foundation.

Funding Statement

Support for this work was provided in part by the Medical Imaging Data Resource Center (MIDRC) under contracts 75N92020C00008 and 75N92020C00021 from the National Institute of Biomedical Imaging and Bioengineering (NIBIB) of the National Institutes of Health.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

PhysioNet approved access to the MIMIC-CXR dataset. Please see the access section at the following URL for more details: https://physionet.org/content/mimic-cxr-jpg/2.0.0/.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The data used in the study is available with credentialed access at: https://physionet.org/content/mimic-cxr-jpg/2.0.0/. Credentialed access can be obtained via an application to PhysioNet.


Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 31, 2022.
Evaluating Progress in Automatic Chest X-Ray Radiology Report Generation
Feiyang Yu, Mark Endo, Rayan Krishnan, Ian Pan, Andy Tsai, Eduardo Pontes Reis, Eduardo Kaiser Ururahy Nunes Fonseca, Henrique Min Ho Lee, Zahra Shakeri Hossein Abad, Andrew Y. Ng, Curtis P. Langlotz, Vasantha Kumar Venugopal, Pranav Rajpurkar
medRxiv 2022.08.30.22279318; doi: https://doi.org/10.1101/2022.08.30.22279318
Subject Area

  • Radiology and Imaging