Abstract
Purpose The detection of circulating tumor DNA, which allows non-invasive tumor molecular profiling and disease follow-up, promises optimal and individualized management of patients with cancer. However, detecting small fractions of tumor DNA released when the tumor burden is reduced remains a challenge.
Experimental Design We implemented a new highly sensitive strategy to detect base-pair resolution methylation patterns from plasma DNA and assessed the potential of hypomethylation of LINE-1 retrotransposons as a non-invasive multi-cancer detection biomarker. The DIAMOND (Detection of Long Interspersed Nuclear Element Altered Methylation ON plasma DNA) method targets 30,000 to 40,000 young LINE-1 (L1) elements scattered throughout the genome, covering about 100,000 CpG sites, and is based on a reference-free analysis pipeline.
Results The resulting machine learning-based classifiers achieved high correct classification rates when discriminating healthy from tumor plasmas across 6 types of cancer (colorectal, breast, lung, ovarian and gastric cancers and uveal melanoma, including localized stages) in two independent cohorts (AUC = 88% to 100%, N = 747). DIAMOND data can also be used to perform copy number alteration (CNA) analysis, which improves cancer detection.
Conclusions This should lead to the development of more efficient non-invasive diagnostic tests adapted to all cancer patients, based on the universality of these factors.
Statement of significance The DIAMOND assay is a new highly sensitive strategy to detect base-pair resolution methylation patterns of LINE-1 retrotransposons (L1) from plasma DNA. It targets 30,000 to 40,000 young L1 elements scattered throughout the genome, covering about 100,000 CpG sites, and is based on a reference-free analysis pipeline. This provided high-coverage data at an affordable sequencing depth, which is instrumental in achieving high sensitivity and working with minute amounts of cell-free DNA. The resulting machine learning-based classifiers showed strong discrimination between healthy and tumor plasmas across 6 types of cancer (colorectal, breast, lung, ovarian and gastric cancers and uveal melanoma, including localized stages) in two independent cohorts (AUC = 88% to 100%, N = 747). DIAMOND data can also be used to perform copy number alteration (CNA) analysis, which improves cancer detection.
Competing Interest Statement
CP, MM, MH, and CAA have an ongoing patent application relating to circulating tumor DNA analysis.
Funding Statement
The NGS facility was supported by ANR-10-EQPX-03 (Equipex) and ANR-10-INBS-09-08 (France Genomique Consortium) grants and by the Canceropole Ile-de-France. This research was supported by grants, of which C.P. was the recipient, from the European Research Council (ERC-StG EpiDetect), the Ligue contre le cancer (RS17-75-75), the prematuration program of the Centre National de la Recherche Scientifique (CNRS), the SiRIC 2 Curie program (INCa-DGOS-Inserm_12554) and the DEEP Strive funding (LABEX DEEP 11-LBX0044). CAA's research was supported in part by the French government under management of Agence Nationale de la Recherche as part of the Investissements d'avenir program, reference ANR-19-P3IA-0001 (PRAIRIE 3IA Institute).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Healthy white blood cells and healthy plasma were collected from blood of healthy donors through the French blood establishment (agreement #16/EFS/031) under French and European ethical practices. Blood samples from patients treated at the Institut Curie (Paris, France) were collected, after written informed consent, as part of the following studies: resectable metastatic colorectal cancers from the Prodige14 trial (approved by a French Personal Protection Committee: 'Comite de Protection des Personnes' (CPP) and registered in ClinicalTrials.gov under NCT01442935); non-small cell lung cancer and metastatic HR+ HER2- breast cancer from the ALCINA study (approved by a French Personal Protection Committee and registered in ClinicalTrials.gov under NCT02866149); treatment-naive ovarian cancer or triple-negative breast cancer patients eligible for surgery or neoadjuvant chemotherapy from the SCANDARE study (approved by the French National Agency for the Safety of Medicines and Health Products: 'Agence Nationale de Securite du Medicament' (ANSM), a French Personal Protection Committee and registered in ClinicalTrials.gov under NCT03017573); multiple types of metastatic cancers from the SHIVA02 study (approved by the French National Agency for the Safety of Medicines and Health Products: 'Agence Nationale de Securite du Medicament' (ANSM), a French Personal Protection Committee and registered in ClinicalTrials.gov under NCT03084757); non-metastatic operable gastric cancers and advanced uveal melanoma from the CTC-CEC-ADN study (approved by a French Personal Protection Committee and registered in ClinicalTrials.gov under NCT02220556). Additional archived samples were also retrieved from the biobank of the Institut Curie, patients having provided informed consent for research use. All samples were obtained in accordance with ethical guidelines, the principles of Good Clinical Practice and the Declaration of Helsinki.
This study was approved by the Internal Review Board and Clinical Research Committee of the Institut Curie.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
In this new version, we have now included:
– Analysis of paired plasma and tumor samples demonstrating very good correlation using methylation haplotype proportions (new Fig.S3)
– Comparison of methylation profiles of hormone-dependent and triple-negative breast cancers to explain the differences between the M0 and M+ subgroups (new Fig.S4F-H)
– The performances per cancer type and stage from the all-cancer model (new Fig.S4D-E and Fig.S8A)
– Analysis of publicly available plasma WGBS data, which shows that we can retrieve L1PA hypomethylation in cancer compared to healthy samples from genome-wide profiles (new Fig.S5)
– Analysis comparing an age-matched cohort and a non-age-matched cohort extracted from C2, demonstrating identical classification performances in both cohorts (new Fig.S7H-J)
– Survival analysis on the validation cohort demonstrating that higher hypomethylation is associated with shorter survival (new Fig.S8D-E)
– Controls demonstrating no effect of the cfDNA extraction method employed (new Fig.S10)
– Bisulfite and enzymatic conversion comparison (new Fig.S11)
– Data for the screening of ovarian cancer mutations (new Table S6)
– New data presenting a new integrated model, and tests in which cancer subgroups or types excluded from the training set are still well recognized as cancer by our models, highlighting the universality of our approach (new Fig.4G and Fig.S8A-C).
Data Availability
Data have been deposited as methylation matrices (CG% or haplotypes%) on the Zenodo database with the following accession code: https://zenodo.org/uploads/12206227 and as compressed FASTQ files at the European Genome-Phenome Archive at https://ega-archive.org/ under the accession code EGAD50000000646. WGBS sequencing data were downloaded from a publicly available database at https://zenodo.org/records/7779198 and from the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov) under the accession number PRJNA494975. The code used to analyze the data is available on GitHub: https://github.com/ProudhonLab.