RT Journal Article
SR Electronic
T1 Using Machine-Learning Techniques to Identify Responders vs. Non-responders in Randomized Clinical Trials
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2020.11.21.20232041
DO 10.1101/2020.11.21.20232041
A1 Vasiliki Nikolodimou
A1 Paul Agapow
YR 2020
UL http://medrxiv.org/content/early/2020/11/23/2020.11.21.20232041.abstract
AB Despite the expectation of heterogeneity in therapy outcomes, especially for complex diseases like cancer, differential response to experimental therapies in a randomized clinical trial (RCT) setting is typically analyzed by dividing patients into responders and non-responders, usually based on a single endpoint. Given the existence of biological and patho-physiological differences among metastatic colorectal cancer (mCRC) patients, we hypothesized that a data-driven analysis of an RCT population's outcomes can identify sub-types of patients based on differential response to Panitumumab, a fully human monoclonal antibody directed against the epidermal growth factor receptor. Outcome and response data of the RCT population were mined with heuristic, distance-based and model-based unsupervised clustering algorithms. The population sub-groups obtained by the best-performing clustering approach were then examined in terms of molecular and clinical characteristics. The utility of this characterization was compared against that of the sub-groups obtained by the conventional responders analysis and then contrasted with aetiological evidence around mCRC heterogeneity and biological functioning. The Partitioning Around Medoids clustering method identified seven sub-types of patients, statistically distinct from each other in survival outcomes, prognostic biomarkers and genetic characteristics.
Conventional responders analysis proved inferior in uncovering relationships between physical, clinical-history and genetic attributes and differential treatment-resistance mechanisms. Combined with improved characterization of the molecular subtypes of CRC, applying machine-learning techniques such as unsupervised clustering to the wealth of data already collected by previous RCTs can support the design of further, more targeted and more efficient RCTs and better identification of patient groups who will respond to a given intervention. Competing Interest Statement: The authors have declared no competing interest. Clinical Trial: NCT00113763. Funding Statement: Not applicable. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained: Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: N/A. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived: Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance): Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable: Yes. The data used for the conduct of this research are available through the DataSphere portal: https://www.datasphere.online/en/