Putative breast cancer risk variants from populations of South Asian ancestry are under-represented in public variant classification databases

Raveen Rony, Shenglong Deng, Sarah Yang, Ken Doig, David L Goode
doi: https://doi.org/10.1101/2025.02.13.25322245
Raveen Rony
1Peter MacCallum Cancer Centre, Melbourne VIC Australia
2University of Melbourne, Melbourne VIC Australia
3University of Huddersfield, United Kingdom
Shenglong Deng
2University of Melbourne, Melbourne VIC Australia
Sarah Yang
1Peter MacCallum Cancer Centre, Melbourne VIC Australia
4Monash University, Clayton VIC Australia
Ken Doig
1Peter MacCallum Cancer Centre, Melbourne VIC Australia
2University of Melbourne, Melbourne VIC Australia
David L Goode
1Peter MacCallum Cancer Centre, Melbourne VIC Australia
2University of Melbourne, Melbourne VIC Australia
4Monash University, Clayton VIC Australia
  • For correspondence: david.goode{at}monash.edu

Abstract

The majority of publicly available genomics data originates from populations of European ancestry. This limits understanding and detection of inherited genetic risk factors for breast cancer in other populations. To assess the extent to which deficits in knowledge of the genetics of breast cancer risk exist for populations of non-European ancestry, we compared data available on putative breast cancer risk variants in the ClinVar database for populations of different ancestry.

Protein-coding insertions and deletions (indels) and single-nucleotide polymorphisms (SNPs) private to populations of non-Finnish European (NFE), African (AFR), Admixed American (AMR), East Asian (EAS) and South Asian (SAS) ancestry were identified from the Genome Aggregation Database (gnomAD v4) for nine established breast cancer risk genes. The percentage of private protein-coding variants that gnomAD lists as ‘Unreported’ in ClinVar was then compared between populations.
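
The tallying step described above can be illustrated with a minimal sketch. It assumes a variant counts as "private" when its allele count is non-zero in exactly one of the five listed populations; the abstract does not spell out the exact definition. The record layout (`ac`, `in_clinvar`), the function names and the toy records are all hypothetical, invented for illustration, and are not taken from the study or from the gnomAD schema.

```python
from collections import defaultdict

# The five ancestry groups named in the abstract (lower-case labels are an assumption).
POPULATIONS = ("nfe", "afr", "amr", "eas", "sas")


def private_population(allele_counts):
    """Return the only population carrying the variant, or None if shared or absent."""
    carriers = [pop for pop in POPULATIONS if allele_counts.get(pop, 0) > 0]
    return carriers[0] if len(carriers) == 1 else None


def unreported_rates(variants):
    """Percentage of private variants per population that have no ClinVar entry."""
    totals = defaultdict(int)
    unreported = defaultdict(int)
    for variant in variants:
        pop = private_population(variant["ac"])
        if pop is None:
            continue  # shared between populations, so not private
        totals[pop] += 1
        if not variant["in_clinvar"]:
            unreported[pop] += 1
    return {pop: 100.0 * unreported[pop] / totals[pop] for pop in totals}


# Toy records, invented for illustration only (not data from the study).
toy_variants = [
    {"gene": "GENE_A", "ac": {"sas": 2}, "in_clinvar": False},
    {"gene": "GENE_A", "ac": {"sas": 1}, "in_clinvar": True},
    {"gene": "GENE_B", "ac": {"nfe": 3}, "in_clinvar": True},
    {"gene": "GENE_B", "ac": {"nfe": 1, "sas": 1}, "in_clinvar": True},  # shared
]
rates = unreported_rates(toy_variants)
```

The same tally could be broken down by gene by keying the counters on `(pop, gene)` instead of `pop`.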

The SAS population had the largest knowledge deficit: 43.4% of private SAS variants were not reported in ClinVar, compared with 20-30% for the other populations. Proportionally fewer SAS variants were reported for all nine genes, with the difference from NFE reaching an adjusted p < 0.05 for PALB2, ATM and BRCA2. In contrast, few genes had significantly lower ClinVar reporting rates for AFR, AMR or EAS than for NFE.
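
A per-gene comparison of this kind can be sketched as follows. The abstract does not state which statistical test or which multiple-testing correction produced the adjusted p-values, so this example substitutes a two-sided Fisher's exact test and a Benjamini-Hochberg adjustment purely as assumptions. The gene labels and counts are placeholders, not results from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(p for p in map(pmf, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Placeholder counts per gene, invented for illustration:
# (SAS unreported, SAS reported, NFE unreported, NFE reported)
toy_counts = {
    "GENE_A": (40, 60, 20, 80),
    "GENE_B": (15, 35, 12, 38),
    "GENE_C": (30, 30, 25, 75),
}
raw_p = [fisher_exact_two_sided(*counts) for counts in toy_counts.values()]
adjusted_p = dict(zip(toy_counts, benjamini_hochberg(raw_p)))
```

Each gene contributes one 2x2 table of unreported versus reported private variants in SAS and NFE, and the adjustment is applied across the genes tested.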

ClinVar reporting deficits in the SAS population were observed for both missense and protein-truncating variants. Unreported variants were usually very rare and largely absent from other public repositories. A substantial fraction of unreported variants were protein-truncating (17.2%) or missense with high predicted pathogenicity scores, and so represent novel candidate breast cancer risk alleles.

Our work demonstrates that putative breast cancer risk variants from populations of South Asian ancestry are less likely to be reported in ClinVar. Barriers to reporting potential breast cancer risk variants from South Asian populations need to be defined and removed to reduce this knowledge deficit.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study was funded by the Victorian Cancer Agency, the Peter MacCallum Cancer Foundation and the Laby Foundation.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Source data were openly available before the initiation of the study from the gnomAD database (https://gnomad.broadinstitute.org/)

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

Code and data files used in these analyses are available as a Git repository at https://github.com/RaveenRony/ClinVar-annotation-rates

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 14, 2025.
Subject Area

  • Genetic and Genomic Medicine