PT - JOURNAL ARTICLE
AU - Leon Lufkin
AU - Marko Budišić
AU - Sumona Mondal
AU - Shantanu Sur
TI - A Bayesian Model for Prediction of Rheumatoid Arthritis from Risk Factors
AID - 10.1101/2020.07.09.20150326
DP - 2020 Jan 01
TA - medRxiv
PG - 2020.07.09.20150326
4099 - http://medrxiv.org/content/early/2020/07/11/2020.07.09.20150326.short
4100 - http://medrxiv.org/content/early/2020/07/11/2020.07.09.20150326.full
AB - Rheumatoid arthritis (RA) is a chronic autoimmune disorder that typically manifests as destructive joint inflammation but also affects multiple other organ systems. The pathogenesis of RA is complex: a variety of factors, including comorbidities and demographic and socioeconomic variables, are known to influence the incidence and progression of the disease. In this work, we aimed to predict RA from a set of 11 well-known risk factors and their interactions using Bayesian logistic regression. We considered up to third-order interactions between the risk factors and implemented factor analysis of mixed data (FAMD) to account for the mix of continuous and categorical variables. The predictive model was further optimized over the area under the receiver operating characteristic curve (AUC) using a genetic algorithm (GA). We used data from the National Health and Nutrition Examination Survey (NHANES). Our optimal predictive model has a smoothed AUC of 0.826 (95% CI: 0.801–0.850) on a validation dataset and 0.805 (95% CI: 0.781–0.829) on a holdout test dataset. Our model identified multiple second- and third-order interactions that demonstrate a strong association with RA, implying the potential role of risk factor interactions in the disease mechanism. Interestingly, we find that the inclusion of higher-order interactions in the model only marginally improves overall predictive ability.
Our findings on the contribution of RA risk factors and their interactions to disease prediction could be useful in developing strategies for early diagnosis of RA, thus opening potential avenues for improved patient outcomes and reduced healthcare burden to society.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: No external funding was received for this work.
Author Declarations:
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: The study was conducted using publicly available datasets from the National Health and Nutrition Examination Survey (NHANES) and no IRB approval was required to use the data.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes.
Datasets used in this work are publicly available for download from the NHANES website https://wwwn.cdc.gov/nchs/nhanes/
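
Illustrative note (not part of the original record): the abstract's "up to third-order interactions" among 11 risk factors and its use of AUC as the optimization target can be made concrete with a small sketch. The factor names `x1` to `x11`, the helper names `interaction_terms` and `auc`, and the toy labels and scores are all hypothetical. The sketch treats each risk factor as a single column, which may not match the authors' FAMD-based encoding, and it computes a plain empirical AUC rather than the smoothed AUC the abstract reports. It does not fit a Bayesian model or run a genetic algorithm.

```python
from itertools import combinations

# Hypothetical placeholder names; the abstract does not list the 11 risk factors.
factors = [f"x{i}" for i in range(1, 12)]


def interaction_terms(names, max_order=3):
    """Return every set of distinct factors of size 1..max_order as a tuple of names."""
    terms = []
    for order in range(1, max_order + 1):
        terms.extend(combinations(names, order))
    return terms


def auc(labels, scores):
    """Empirical AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


terms = interaction_terms(factors)
counts = {k: sum(1 for t in terms if len(t) == k) for k in (1, 2, 3)}
print(counts)  # {1: 11, 2: 55, 3: 165}

# Toy labels and scores, made up for illustration only.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under the single-column assumption, 11 factors give 11 main effects, 55 pairwise products, and 165 triple products, 231 candidate terms in total. A candidate set of that size is one plausible reason a search procedure over subsets of terms would be scored by AUC on validation data.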