medRxiv

MetaSTAARlite: An all-in-one tool for biobank-scale whole-genome sequencing meta-analysis

Yohhan Kumarasinghe, Jacob Williams, Yuxin Yuan, Haoyu Zhang, Zilin Li, Xihao Li
doi: https://doi.org/10.1101/2025.06.05.25328973
Yohhan Kumarasinghe
1Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jacob Williams
2Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
Yuxin Yuan
3School of Mathematics and Statistics and KLAS, Northeast Normal University, Changchun, Jilin, China
Haoyu Zhang
2Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
Zilin Li
3School of Mathematics and Statistics and KLAS, Northeast Normal University, Changchun, Jilin, China
  • For correspondence: lizl{at}nenu.edu.cn xihaoli{at}unc.edu
Xihao Li
1Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
4Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  • For correspondence: lizl{at}nenu.edu.cn xihaoli{at}unc.edu

Abstract

Biobank-scale sequencing studies have enabled the analysis of rare variants contributing to complex traits. We introduce MetaSTAARlite, a scalable and resource-efficient summary statistics-based pipeline for functionally-informed rare variant meta-analysis in both the coding and noncoding genome, bypassing the data-sharing restrictions of pooled analysis using individual-level data across multiple biobanks. Using the sequencing data of UK Biobank and the All of Us Research Program, we demonstrate that MetaSTAARlite’s computation time, memory, and storage requirements scale linearly with sample size, while producing results highly concordant with those of a pooled analysis.
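
To illustrate why summary statistics can stand in for individual-level data, the following is a minimal sketch of a generic fixed-effects, score-based burden test for one variant set. It is not MetaSTAARlite's API or its actual method: the function name `meta_burden_test`, its arguments, and the toy numbers are hypothetical, and the sketch assumes every study reports a score vector and its covariance matrix over the same variants in the same order.

```python
import math
import numpy as np

def meta_burden_test(scores, covariances, weights):
    """Hypothetical helper: combine per-study summary statistics for one variant set.

    scores:      list of length-m score vectors U_k, one per study
    covariances: list of m x m covariance matrices V_k of those scores
    weights:     length-m variant weights w (e.g. from allele frequency or annotations)
    """
    U = np.sum(scores, axis=0)        # combined score vector across studies
    V = np.sum(covariances, axis=0)   # combined covariance matrix across studies
    w = np.asarray(weights, dtype=float)
    stat = float(w @ U) ** 2 / float(w @ V @ w)   # burden statistic, 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2.0))    # chi-square(1) upper-tail probability
    return stat, p_value

# Toy example with two studies and two variants (made-up numbers).
U1, V1 = [1.0, 2.0], [[2.0, 0.5], [0.5, 3.0]]
U2, V2 = [0.5, 1.5], [[1.0, 0.0], [0.0, 2.0]]
stat, p_value = meta_burden_test([U1, U2], [V1, V2], weights=[1.0, 1.0])
```

Only the per-study score vectors and covariance matrices cross site boundaries in this sketch, which is the general idea behind avoiding pooled individual-level data.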

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study was supported by the research start-up funds from the Department of Biostatistics at the University of North Carolina at Chapel Hill (Y.K. and X.L.).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

UK Biobank: Ethical approval was granted by the National Research Ethics Service Committee North West, Haydock (Reference Number: 11/NW/0382). Detailed information can be found here: https://www.ukbiobank.ac.uk/media/cs1h15s3/rtb-nwrec-application-and-approval-2011.pdf. This RTB approval was initially granted in 2011 and is renewed on a five-year cycle; hence UK Biobank successfully applied to renew it in 2016 and 2021. All of Us: The study was approved by the Institutional Review Board of the All of Us Research Program. Detailed information can be found here: https://allofus.nih.gov/about/who-we-are/institutional-review-board-irb-of-all-of-us-research-program.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data availability

The UK Biobank data used in the analyses were obtained under applications 52008, 91486 and 211447. The functional annotation data are publicly available at the Functional Annotation of Variant-Online Resource (FAVOR)7 site (https://favor.genohub.org) and the FAVOR database (https://doi.org/10.7910/DVN/1VGTJI). All of Us phenotype data and exome data can be accessed through the All of Us Researcher Workbench (https://workbench.researchallofus.org), which is available to registered researchers with controlled tier access.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted June 06, 2025.
Subject Area

  • Genetic and Genomic Medicine