medRxiv

Original research: How accurate are digital symptom assessment apps for suggesting conditions and urgency advice?: a clinical vignettes comparison to GPs

Stephen Gilbert, Alicia Mehl, Adel Baluch, Caoimhe Cawley, Jean Challiner, Hamish Fraser, Elizabeth Millen, Jan Multmeier, Fiona Pick, Claudia Richter, Ewelina Türk, Shubhanan Upadhyay, Vishaal Virani, Nicola Vona, Paul Wicks, Claire Novorol
doi: https://doi.org/10.1101/2020.05.07.20093872
1 Ada Health GmbH, Karl-Liebknecht-Str. 1, 10178 Berlin, Germany (Stephen Gilbert, Alicia Mehl, Adel Baluch, Caoimhe Cawley, Jean Challiner, Elizabeth Millen, Jan Multmeier, Fiona Pick, Claudia Richter, Ewelina Türk, Shubhanan Upadhyay, Vishaal Virani, Nicola Vona, Paul Wicks, Claire Novorol)
2 Brown Center for Biomedical Informatics, Brown University, Box G-R, 233 Richmond Street, Providence, RI 02912, USA (Hamish Fraser)
For correspondence: Stephen Gilbert, science@ada.com

Abstract

Objectives To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of 8 popular symptom assessment apps with each other and with 7 General Practitioners.

Design Clinical vignettes study.

Setting 200 clinical vignettes representing real-world scenarios in primary care.

Intervention/comparator Condition coverage, suggested condition accuracy, and urgency advice performance were measured against the vignettes’ gold-standard diagnoses and triage levels.

Primary outcome measures Outcomes included (i) proportion of conditions “covered” by an app, i.e. not excluded because the patient was too young/old, pregnant, or comorbid; (ii) proportion of vignettes in which the correct primary diagnosis was amongst the top 3 conditions suggested; and (iii) proportion of “safe” urgency level advice (i.e. at gold standard level, more conservative, or no more than one level less conservative).
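The three outcome measures can be illustrated with a minimal Python sketch. It is not the study's analysis code: the `VignetteResult` record, its field names, and the integer urgency scale (higher means more urgent) are assumptions made for illustration. The sketch also assumes that top-3 accuracy is taken over all vignettes and that the safety proportion is taken only over vignettes for which advice was given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VignetteResult:
    """Hypothetical record of one app's output for one vignette."""
    gold_condition: str                # gold-standard primary diagnosis
    gold_urgency: int                  # assumed ordinal scale, higher = more urgent
    suggestions: Optional[List[str]]   # ranked conditions, None if vignette not covered
    urgency: Optional[int]             # advised urgency level, None if no advice given


def coverage(results: List[VignetteResult]) -> float:
    """Proportion of vignettes for which the app offered any suggestion."""
    return sum(r.suggestions is not None for r in results) / len(results)


def top3_accuracy(results: List[VignetteResult]) -> float:
    """Proportion of all vignettes with the gold diagnosis in the top 3 (M3)."""
    hits = sum(
        r.suggestions is not None and r.gold_condition in r.suggestions[:3]
        for r in results
    )
    return hits / len(results)


def safe_advice_rate(results: List[VignetteResult]) -> float:
    """Proportion of advised vignettes whose advice is 'safe': at the gold
    level, more conservative, or at most one level less conservative."""
    advised = [r for r in results if r.urgency is not None]
    if not advised:
        return float("nan")
    safe = sum(r.urgency >= r.gold_urgency - 1 for r in advised)
    return safe / len(advised)
```

Under these assumptions, an app that declines a vignette lowers its coverage and its top-3 accuracy but does not affect its safety proportion.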

Results Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; Your.MD: 64.5%. The top-3 suggestion accuracy (M3) of GPs was on average 82.1±5.2%. For the apps it was: Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps exclude certain user groups (e.g. younger users) or certain conditions; for these apps condition-suggestion performance is generally greater with exclusion of these vignettes. For safe urgency advice, tested GPs had an average of 97.0±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 S.D. of the GP mean: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had a safety performance within 2 S.D. of GPs: Your.MD: 92.6%. Three apps had a safety performance outside 2 S.D. of GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³).
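The grouping of apps by distance from the GP safety mean can be reproduced from the figures reported above. The following Python sketch is a worked arithmetic check only: the `sd_band` helper is a hypothetical name, and the sketch does not reproduce the reported p-values, whose underlying test is not described in the abstract.

```python
# GP safe-advice performance reported in the abstract (mean ± S.D., in percent).
GP_MEAN, GP_SD = 97.0, 2.5

# App safe-advice percentages reported in the abstract.
app_safety = {
    "Ada": 97.0,
    "Babylon": 95.1,
    "Buoy": 80.0,
    "K Health": 81.3,
    "Mediktor": 87.3,
    "Symptomate": 97.8,
    "Your.MD": 92.6,
}


def sd_band(value: float, mean: float = GP_MEAN, sd: float = GP_SD) -> str:
    """Classify a percentage by its distance from the GP mean in S.D. units."""
    distance = abs(value - mean) / sd
    if distance <= 1:
        return "within 1 S.D."
    if distance <= 2:
        return "within 2 S.D."
    return "outside 2 S.D."


bands = {app: sd_band(pct) for app, pct in app_safety.items()}
for app, band in sorted(bands.items()):
    print(f"{app}: {app_safety[app]:.1f}% ({band})")
```

With a GP mean of 97.0% and S.D. of 2.5%, the 1 S.D. band spans 94.5% to 99.5% and the 2 S.D. band spans 92.0% to 102.0%, which places Your.MD (92.6%) inside the wider band and Mediktor (87.3%) outside it.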

Conclusions The utility of digital symptom assessment apps relies upon coverage, accuracy, and safety. While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.

Article Summary

Strengths and limitations of this study

Strengths of the study include a large number of vignettes, peer-reviewed by independent and experienced primary care physicians to minimise bias.

Furthermore, GPs and apps were tested with vignettes in a manner that simulates real clinical consultations, based on mock telephone consultations, with detailed source data verification.

Vignette entry was conducted by professionals; a recent study found that laypeople are less reliable at entering vignettes for symptoms that they have never experienced.

Limitations include the lack of a rigorous and comprehensive selection process to choose the 8 apps and the lack of real patient experience assessment. Because software is constantly evolving, our findings may not hold for future versions of the apps. Future replication by independent researchers is needed.

Competing Interest Statement

Some of the authors are employees of/hold equity in the manufacturer of one of the tested apps (Ada Health GmbH). See author affiliations.

Funding Statement

This study was funded by Ada Health GmbH. HF has not received any compensation, financial or otherwise, from Ada Health.

Author Declarations

All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.

Yes

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

All data relevant to the study are included in the article or uploaded as supplementary information, with the exception of the case vignettes. These will not be uploaded because they will be reused in periodic updates of the study analysis, in order to monitor changes in comparative app performance over time. Publication would prevent this important ongoing scientific research. The vignettes will not be disclosed to the Ada medical intelligence team or to other app developers.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted May 13, 2020.
Subject Area

  • Health Systems and Quality Improvement