Abstract
Diagnosis is a key step in patient management. Over the decades, refined decision algorithms and numerical scores based on conventional statistical tools have been developed to ensure optimal reliability. More recently, a number of machine learning tools have been developed and applied to increasingly extensive data sets, including up to millions of items, yielding sophisticated classification models. While this approach has proved impressively efficient in some cases, practical limitations stem from the high number of parameters that a model may require, which increases the cost and delay of decision making. In addition, information related to the specificity of local recruitment may be lost, hampering any simplification of universal models. Here, we explored the capacity of currently available artificial intelligence tools to classify patients seen at a single health center on the basis of a limited number of parameters. As a model, we studied the discrimination between systemic lupus erythematosus (SLE) and mixed connective tissue disease (MCTD) on the basis of thirteen biological parameters, using eight widely used classifiers. We conclude that classification performance may be significantly improved by a knowledge-based selection of discriminating parameters.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The authors declare no competing interest. We benefited only from institutional support, and our institutions did not receive any specific funding for the submitted work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The ethics committee of Marseille Hospitals (Assistance Publique - Hopitaux de Marseille) gave ethical approval for this work.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
All data produced in the present work are contained in the manuscript.