TY - JOUR
T1 - Data management in substance use disorder treatment research: Implications from data harmonization of National Institute on Drug Abuse-funded randomized controlled trials
JF - medRxiv
DO - 10.1101/2020.04.28.20081935
SP - 2020.04.28.20081935
AU - Ryoko Susukida
AU - Masoumeh Aminesmaeili
AU - Ramin Mojtabai
Y1 - 2020/01/01
UR - http://medrxiv.org/content/early/2020/05/03/2020.04.28.20081935.abstract
N2 - Background: Secondary analysis of data from completed randomized controlled trials (RCTs) is a critical and efficient way to maximize the potential benefit from past research. De-identified primary data from completed RCTs have become increasingly available in recent years; however, the lack of standardized data products is a major barrier to further use of these valuable data. Pre-statistical harmonization of data structure, variables, and codebooks across RCTs would facilitate secondary data analysis, including meta-analysis and comparative effectiveness studies. We describe an initiative to harmonize de-identified primary data from substance use disorder (SUD) treatment RCTs funded by the National Institute on Drug Abuse (NIDA) and available on the NIDA Data Share website. Methods: Harmonized datasets with standardized data structures, variable names, labels, and definitions, together with harmonized codebooks, were developed for 36 completed RCTs. Common data domains were identified to bundle data files from individual RCTs according to relevant subject areas. Variables within the same instrument were harmonized if at least two RCTs used the same instrument. The structures of the harmonized data were determined based on feedback from clinical trialists and SUD research experts. Results: We have created a harmonized database of variables across 36 RCTs with a built-in label and a brief definition for each variable.
Data files from the RCTs have been consistently categorized into eight domains (enrollment, demographics, adherence, adverse events, physical health measures, mental-behavioral-cognitive health measures, self-reported substance use measures, and biologic substance use measures). Harmonized codebooks and instrument/variable concordance tables have also been developed to help identify instruments and variables of interest more easily. Conclusions: The harmonized data from RCTs of SUD treatments can potentially promote future secondary analysis of completed RCTs by allowing data from multiple RCTs to be combined, and can provide guidance for future RCTs in SUD treatment research. Competing Interest Statement: Dr. Susukida and Dr. Aminesmaeili have nothing to disclose. Dr. Mojtabai reports grants from the National Institute on Drug Abuse and the National Institute of Mental Health during the conduct of the study. Dr. Mojtabai has received research funding and consulting fees from Bristol-Myers Squibb and Lundbeck Pharmaceuticals. Funding Statement: This project was supported by a research award from Arnold Ventures. The content is solely the responsibility of the authors and does not necessarily represent the official views of Arnold Ventures. Author Declarations: All relevant ethical guidelines have been followed; any necessary IRB and/or ethics committee approvals have been obtained and details of the IRB/oversight body are included in the manuscript. Yes. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. The data referred to in the manuscript are publicly available from the National Institute on Drug Abuse (NIDA) Data Share website: https://datashare.nida.nih.gov/
ER -