CausalCellInfer: Resolving cell-type-specific disease mechanisms from biobank-scale GWAS

Liangying Yin, Yujia Shi, Ruoyu Zhang, Yong Xiang, Jinghong Qiu, Pak-Chung Sham, Hon-Cheong So
doi: https://doi.org/10.1101/2024.10.17.24315646
Liangying Yin
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
2Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard
Yujia Shi
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
Ruoyu Zhang
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
Yong Xiang
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
Jinghong Qiu
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
Pak-Chung Sham
3Department of Psychiatry, The University of Hong Kong, Hong Kong
Hon-Cheong So
1School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
4KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, China
5Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong
6CUHK Shenzhen Research Institute, Shenzhen, China
7Margaret K.L. Cheung Research Centre for Management of Parkinsonism, The Chinese University of Hong Kong, Shatin, Hong Kong
8Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong SAR, China
9Hong Kong Branch of the Chinese Academy of Sciences Center for Excellence in Animal Evolution and Genetics, The Chinese University of Hong Kong, Hong Kong SAR, China
  • For correspondence: hcso{at}cuhk.edu.hk

Abstract

Integrating the cellular resolution of single-cell RNA sequencing (scRNA-seq) with the phenotypic depth of population-scale biobanks is essential for elucidating the cellular basis of complex diseases. However, this integration is often hindered by the limited sample sizes of scRNA-seq cohorts and the lack of cell-type resolution in massive biobank datasets.

We present CausalCellInfer, a scalable computational framework designed to bring cellular resolution to bulk and genotype-imputed transcriptomes. CausalCellInfer utilizes an invariant causal prediction-inspired procedure (scI-GCM) to identify environment-stable marker genes, employs a parsimonious deep neural network for robust cell-fraction deconvolution, and leverages regularized matrix completion to reconstruct cell-type-specific (CTS) expression profiles. This architecture is specifically optimized for biobank-scale data, where technical heterogeneity and limited gene overlap are prevalent.
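
The deconvolution step can be pictured with a much simpler stand-in. The sketch below is not the CausalCellInfer network: it swaps in plain non-negative least squares, solved by projected gradient descent, to estimate cell-type fractions from one bulk sample given a marker-gene signature matrix. The function name `deconvolve`, the signature values, and the cell-type fractions are all hypothetical and chosen only for illustration.

```python
def deconvolve(signature, bulk, n_iter=500):
    """Estimate cell-type fractions for one bulk sample.

    Illustrative stand-in only (assumption, not the paper's method):
    minimizes 0.5 * ||signature @ f - bulk||^2 subject to f >= 0 by
    projected gradient descent, then rescales f to sum to 1.

    signature: list of rows, one per marker gene, one column per cell type
    bulk:      list of bulk expression values, one per marker gene
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    f = [1.0 / n_types] * n_types  # start from uniform fractions

    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in signature for v in row)

    for _ in range(n_iter):
        # Residual of the current fit, one entry per gene.
        resid = [
            sum(s * fk for s, fk in zip(row, f)) - b
            for row, b in zip(signature, bulk)
        ]
        # Gradient S^T (S f - b), one entry per cell type.
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto f >= 0.
        f = [max(0.0, fk - step * gk) for fk, gk in zip(f, grad)]

    total = sum(f)
    return [fk / total for fk in f] if total > 0 else f


# Hypothetical signature: 4 marker genes (rows) x 3 cell types (columns).
signature = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
true_fractions = [0.5, 0.3, 0.2]

# Noise-free bulk profile: a fraction-weighted mixture of the signatures.
bulk = [sum(s * f for s, f in zip(row, true_fractions)) for row in signature]

estimated = deconvolve(signature, bulk)
```

In this noise-free toy case the estimate recovers the mixing fractions. Real bulk or genotype-imputed transcriptomes are noisy and share only part of the gene set with the reference, which is the setting the abstract says motivates environment-stable markers and a learned model rather than a linear fit like this one.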

Validated across simulated data, pseudo-bulk mixtures, and real PBMC datasets, CausalCellInfer was more accurate and more computationally efficient than existing methods. Applied to ∼500,000 UK Biobank participants, the framework enabled cell-resolved analyses for 29 traits. These analyses identified known pathological shifts, such as reduced pancreatic β-cell proportions in type 2 diabetes, and uncovered novel biological signals, including disrupted excitatory neuron and oligodendrocyte interactions in depression. Furthermore, inferred CTS differential expression patterns showed significant concordance with independent single-cell studies and were enriched for OpenTargets disease genes. Overall, CausalCellInfer bridges the gap between single-cell insights and population-scale genomics, providing a tool for systematic discovery of disease mechanisms at cellular resolution.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was partially supported by the Lo Kwee Seong Biomedical Research Fund from The Chinese University of Hong Kong and by the KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, China.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The North West - Haydock Research Ethics Committee gave ethical approval for this work.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

  • I have updated the introduction, methods, and results sections. Some content in the method section has been moved to the supplementary text. I have deleted one supplementary table.

Data availability

UK Biobank data are available to researchers who formally apply for access; they are not publicly available because of privacy concerns. Reference scRNA-seq datasets are publicly available at the following sites:
  • frontal cortex: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE144136
  • adipose tissue: https://singlecell.broadinstitute.org/single_cell/study/SCP1376/a-single-cell-atlas-of-human-and-mouse-white-adipose-tissue?cluster=Human%20WAT&spatialGroups=--&annotation=fat_type--group--study&subsample=100000#study-summary
  • pancreas: https://hpap.pmacs.upenn.edu/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 14, 2026.
CausalCellInfer: Resolving cell-type-specific disease mechanisms from biobank-scale GWAS
Liangying Yin, Yujia Shi, Ruoyu Zhang, Yong Xiang, Jinghong Qiu, Pak-Chung Sham, Hon-Cheong So
medRxiv 2024.10.17.24315646; doi: https://doi.org/10.1101/2024.10.17.24315646

Subject Area

  • Genetic and Genomic Medicine