Abstract
The recent wave of biobank repositories linking individual-level genetic data with dense clinical health history has introduced a paradigm shift in phenotyping for human genetic studies. The mechanisms by which biobanks recruit participants can vary considerably according to factors such as geographic catchment and sampling strategy. These enrollment differences leave an imprint on the cohort, shaping its demographics and the utility of the biobank for research purposes. Here we introduce the Michigan Genomics Initiative (MGI), a rolling-enrollment, single-health-system biobank currently consisting of >85,000 participants recruited primarily through surgical encounters at Michigan Medicine. Focusing recruitment on individuals in Southeast Michigan undergoing surgery introduces a strong ascertainment effect. MGI participants are, on average, less healthy than the general population, which produces a biobank enriched for cases of many disease outcomes and makes it well suited as a disease genetics cohort. A comparison to the much larger UK Biobank, which uses population-representative sampling, reveals that MGI has higher prevalence for nearly all diagnosis-code-based phenotypes and larger absolute numbers of cases for many phenotypes. GWAS of these phenotypes replicate many known findings, validating the genetic and clinical data and their proper linkage. Our results illustrate that single-health-system biobanks that recruit participants through opportunistic sampling, such as surgical encounters, produce distinct patient profiles that provide an ideal resource for exploring the genetics of complex diseases.
Competing Interest Statement
Goncalo Abecasis is currently employed by Regeneron Pharmaceuticals. Ellen M Schmidt is currently employed by Serqet Therapeutics. Both contributed to this work while employed at the University of Michigan.
Funding Statement
This study was funded by the Precision Health Initiative of the University of Michigan.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
MGI study participant consent forms and protocols were reviewed and approved by the University of Michigan Medical School Institutional Review Board (IRB IDs HUM00071298, HUM00148297, HUM00099197, HUM00097962, and HUM00106315).
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
Individual-level genetic and clinical data are not available due to patient privacy. However, summary statistics from genome-wide association studies of 1,547 clinical traits are publicly available through an interactive web tool described in the Resources section.