Abstract
Motivation The amount of molecular profiling (-omics) data is growing enormously, yet integrating multi-omics data remains challenging. Moreover, human multi-omics data may be privacy-sensitive and can be misused to de-anonymize and (re-)identify individuals. Most biomedical data are therefore kept in secure, protected silos, and it remains a challenge to reuse these data without infringing the privacy of the individuals from whom they were derived. Federated analysis of Findable, Accessible, Interoperable, and Reusable (FAIR) data is a privacy-preserving solution for making optimal use of these multi-omics data and transforming them into actionable knowledge.
Results The Netherlands X-omics Initiative is a National Roadmap Large-Scale Research Infrastructure that aims for efficient integration of data generated within X-omics with external datasets. To facilitate this, we developed the FAIR Data Cube (FDCube), which adopts and applies the FAIR principles. The FDCube helps researchers to create FAIR data and metadata, to facilitate reuse of their data, and to make their data analysis workflows transparent, while ensuring data security and privacy.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was funded by a Dutch Research Council (NWO) grant to The Netherlands X-omics Initiative (project 184.034.019), a Horizon2020 grant to the European Joint Programme on Rare Diseases (grant agreement number 825575), a Horizon2020 grant to the EATRIS-Plus project (grant agreement number 871096), an NWO Open Science Fund grant (grant agreement number 17703), and an LSH HealthHolland grant to the Trusted World of Corona (TWOC) consortium.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Footnotes
The multi-omics analysis experiment was revised to demonstrate the FDCube's capability to facilitate integrated multi-omics data analysis, specifically the analysis of IL-10 protein level measurements and IL10 transcript level measurements in data from COVID-19 patients. During this process, several FAIR resources and pathway information collected from WikiPathways were also used. New figures were generated, and the axes and figure captions were made more detailed. Related work was revised.
Data Availability
All data produced in the present study are available upon reasonable request to the authors.
Abbreviations
- COVID-19: Coronavirus disease 2019
- DNA: Deoxyribonucleic Acid
- EMBL-EBI: EMBL’s European Bioinformatics Institute
- FAIR: Findable, Accessible, Interoperable, and Reusable
- FDCube: FAIR Data Cube
- FDP: FAIR Data Point
- ISA: Investigation, Study, Assay
- PHT: Personal Health Train
- RDF: Resource Description Framework
- RNA: Ribonucleic Acid
- SPARQL: SPARQL Protocol and RDF Query Language
- TWOC: Trusted World of Corona
- IL-10: Interleukin-10
- OmicsDI: Omics Discovery Index