medRxiv

Comparison of Population Characteristics in Real-World Clinical Oncology Databases in the US: Flatiron Health, SEER, and NPCR

Xinran Ma, Lura Long, Sharon Moon, Blythe J.S. Adamson, Shrujal S. Baxi
doi: https://doi.org/10.1101/2020.03.16.20037143
Xinran Ma
Flatiron Health, Inc., New York, NY
MS
Lura Long
Flatiron Health, Inc., New York, NY
Sharon Moon
Flatiron Health, Inc., New York, NY
Blythe J.S. Adamson
Flatiron Health, Inc., New York, NY
PhD, MPH
Shrujal S. Baxi
Flatiron Health, Inc., New York, NY
MD, MPH
  • For correspondence: sbaxi@flatiron.com

ABSTRACT

Background and Objective The Surveillance, Epidemiology, and End Results (SEER) program and the National Program of Cancer Registries (NPCR) are authoritative sources for population cancer surveillance and research in the US. An increasing number of recent oncology studies are based on the electronic health record (EHR)-derived de-identified databases created and maintained by Flatiron Health. This report describes the differences in the originating sources and data development processes, and compares baseline demographic characteristics in the cancer-specific databases from Flatiron Health, SEER, and NPCR, to facilitate interpretation of research findings based on these sources.

Methods Patients with documented care from January 1, 2011 through May 31, 2019 in a series of EHR-derived Flatiron Health de-identified databases covering multiple tumor types were included. SEER incidence data (obtained from the SEER 18 database) and NPCR incidence data (obtained from the US Cancer Statistics public use database) for malignant cases diagnosed from January 1, 2011 to December 31, 2016 were included. Comparisons of demographic variables were performed across all disease-specific databases, for all patients and for the subset diagnosed with advanced-stage disease.

Results As of May 2019, a total of 201,570 patients with 19 different cancer types were included in Flatiron Health datasets. In an overall comparison to national cancer registries, patients in the Flatiron Health databases had similar sex and geographic distributions, but appeared to be diagnosed at later stages of disease and had a different age distribution. For variables such as stage and race, Flatiron Health databases had a greater degree of incompleteness. These trends varied by cancer type.

Conclusions These three databases show general similarities in demographic and geographic distribution, but there are overarching differences across the populations they cover. Differences in data sourcing (medical oncology EHRs vs cancer registries) and disparities in sampling approaches and rules of data acquisition may explain some of these divergences. Furthermore, unlike the steady information flow entered into registries, the availability of medical oncology EHR-derived information reflects the extent of involvement of medical oncology clinics at different points in the specialty management of individual diseases, resulting in inter-disease variability. These differences should be considered when interpreting study results obtained with these databases.

Competing Interest Statement

All authors are employees of Flatiron Health, Inc., which is an independent subsidiary of the Roche group, and own stock in Roche. S.S.B. and L.L. own equity in Flatiron Health.

Clinical Trial

Not applicable

Funding Statement

This study was sponsored by Flatiron Health, Inc., which is an independent subsidiary of the Roche group.

Author Declarations

All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript.

Yes

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Data supporting the findings of this study originated from Flatiron Health, Inc. These de-identified data can be made available upon request, and are subject to a license agreement with Flatiron Health; interested researchers should contact DataAccess@flatiron.com to determine licensing terms.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted March 18, 2020.

Subject Area

  • Oncology