KDClassifier: Urinary Proteomic Spectra Analysis Based on Machine Learning for Classification of Kidney Diseases

Wanjun Zhao, Yong Zhang, Xinming Li, Yonghong Mao, Changwei Wu, Lijun Zhao, Fang Liu, Jingqiang Zhu, Jingqiu Cheng, Hao Yang, Guisen Li
doi: https://doi.org/10.1101/2020.12.01.20242198
Wanjun Zhao
1Department of Thyroid Surgery, West China Hospital, Sichuan University, Chengdu 610041, China
Yong Zhang
2Key Laboratory of Transplant Engineering and Immunology, MOH; West China-Washington Mitochondria and Metabolism Research Center, West China Hospital, Sichuan University, Chengdu 610041, China
Xinming Li
3Computer Science College, Shandong University of Technology, Zibo of Shandong province 255000, China
Yonghong Mao
4Institute of Thoracic Oncology, West China Hospital, Sichuan University, Chengdu 610041, China
Changwei Wu
5Renal Department and Institute of Nephrology, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Sichuan Clinical Research Center for Kidney Diseases, Chengdu 611731, China
Lijun Zhao
6Division of Nephrology, West China Hospital, Sichuan University, Chengdu, 610041, China
Fang Liu
6Division of Nephrology, West China Hospital, Sichuan University, Chengdu, 610041, China
Jingqiang Zhu
1Department of Thyroid Surgery, West China Hospital, Sichuan University, Chengdu 610041, China
Jingqiu Cheng
2Key Laboratory of Transplant Engineering and Immunology, MOH; West China-Washington Mitochondria and Metabolism Research Center, West China Hospital, Sichuan University, Chengdu 610041, China
Hao Yang
2Key Laboratory of Transplant Engineering and Immunology, MOH; West China-Washington Mitochondria and Metabolism Research Center, West China Hospital, Sichuan University, Chengdu 610041, China
  • For correspondence: yanghao@scu.edu.cn guisenli@163.com
Guisen Li
5Renal Department and Institute of Nephrology, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Sichuan Clinical Research Center for Kidney Diseases, Chengdu 611731, China
  • For correspondence: yanghao@scu.edu.cn guisenli@163.com

Abstract

Background Extracting spectral features from urinary proteomics data, acquired on an advanced mass spectrometer and analyzed with machine learning algorithms, can make disease classification more accurate. We sought to establish a novel diagnostic model for kidney diseases by applying the extreme gradient boosting (XGBoost) machine learning algorithm to the complete mass spectrum information from urinary proteomics.

Methods We enrolled 134 patients (including those with IgA nephropathy, membranous nephropathy, and diabetic kidney disease) and 68 healthy participants as controls. A total of 610,102 mass spectra from their urinary proteomics, produced using high-resolution mass spectrometry, were used to train and validate the diagnostic model. We divided the mass spectrum data into a training dataset (80%) and a validation dataset (20%). The training dataset was used directly to build diagnostic models with XGBoost, random forest (RF), a support vector machine (SVM), and artificial neural networks (ANNs). Diagnostic accuracy was evaluated using a confusion matrix. We also constructed receiver operating characteristic, Lorenz, and gain curves to evaluate the diagnostic model.
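
The 80%/20% split and the confusion-matrix evaluation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the class labels, the toy data, and the stratified form of the split are all assumptions made for illustration, and the placeholder predictions stand in for the output of a fitted model such as XGBoost.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into training and validation sets per class.

    Stratifying by class is an assumption; the abstract only states 80%/20%.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = int(round(train_fraction * len(indices)))
        train_idx.extend(indices[:cut])
        valid_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(valid_idx)


def confusion_matrix(y_true, y_pred, classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    position = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, predicted in zip(y_true, y_pred):
        matrix[position[truth]][position[predicted]] += 1
    return matrix


# Hypothetical class labels: healthy control plus three kidney diseases.
classes = ["control", "IgAN", "MN", "DKD"]
# Toy labels only (50 spectra per class); not the study data.
labels = [classes[i % 4] for i in range(200)]

train_idx, valid_idx = stratified_split(labels)

y_true = [labels[i] for i in valid_idx]
# Placeholder predictions: every tenth validation spectrum is called "control".
# In practice these would come from a model fitted on the training indices.
y_pred = [classes[0] if k % 10 == 0 else t for k, t in enumerate(y_true)]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(name, row)
```

The validation indices are never used for fitting, so the confusion matrix reflects performance on spectra the model has not seen.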

Results Compared with the RF, SVM, and ANN models, the modified XGBoost model, named the Kidney Disease Classifier (KDClassifier), showed the best performance. The accuracy of the diagnostic XGBoost model was 96.03% (CI = 95.17%-96.77%; Kappa = 0.943; McNemar's test, P = 0.00027). The area under the curve of the XGBoost model was 0.952 (CI = 0.9307-0.9733). The Kolmogorov-Smirnov (KS) value of the Lorenz curve was 0.8514. The Lorenz and gain curves indicated that the developed model is robust.
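
For readers unfamiliar with the reported statistics, the sketch below shows one common way to compute accuracy and Cohen's kappa from a confusion matrix, and a KS value as the largest gap between the cumulative true-positive and false-positive rates. The abstract does not give the exact definitions the authors used, so these formulas are assumptions, and the toy numbers are invented for illustration rather than taken from the study.

```python
def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def ks_statistic(scores, is_positive):
    """Largest |TPR - FPR| over all score thresholds (binary case)."""
    pairs = sorted(zip(scores, is_positive), key=lambda p: -p[0])
    n_pos = sum(1 for flag in is_positive if flag)
    n_neg = len(is_positive) - n_pos
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before measuring the gap.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best


# Toy two-class confusion matrix (rows: true class, columns: predicted class).
toy_cm = [[45, 5], [10, 40]]
toy_accuracy = accuracy(toy_cm)
toy_kappa = cohen_kappa(toy_cm)

# Toy scores with binary ground truth (1 = disease, 0 = control).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_truth = [1, 1, 1, 0, 1, 0, 0, 0]
toy_ks = ks_statistic(toy_scores, toy_truth)

print(toy_accuracy, toy_kappa, toy_ks)
```

Kappa discounts the agreement expected by chance from the class frequencies, which is why it is reported alongside raw accuracy when class sizes differ.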

Conclusions This study presents the first XGBoost diagnostic model, the KDClassifier, that uses complete mass spectrum information from urinary proteomics to distinguish different kidney diseases. The KDClassifier achieves high accuracy and robustness, making it a potential tool for the classification of all types of kidney diseases.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This work was funded by grants from the National Natural Science Foundation of China (grant no. 31901038), the China Postdoctoral Science Foundation (2019M653438), the Post-Doctoral Research Foundation of West China Hospital of Sichuan University (2018HXBH062), and the 1.3.5 Project for Disciplines of Excellence, West China Hospital, Sichuan University (ZYGD18014, CJQ).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Diagnosis and pathological examinations of the kidney diseases were conducted at the Department of Nephrology, Sichuan Provincial People's Hospital. Informed consent was obtained from the patients. The study protocol was approved by the Medical Ethics Committee of the Sichuan Provincial People's Hospital and West China Hospital.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

The urinary proteomics mass spectrometry data have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository under the dataset identifier PXD018996.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 02, 2020.
Subject Area

  • Nephrology