Abstract
Objective To develop a tool for early prediction of acute respiratory failure within 48 hours of hospital admission in patients with community-acquired pneumonia (CAP).
Method A retrospective cohort of 257 CAP patients (median age: 76.0 years, IQR: 68.0–84.0; 56.4% male) was analyzed, of whom 148 (57.6%) developed respiratory failure within 48 hours. Key predictors were selected from 55 clinical variables using LASSO regression. Predictive models were then constructed using multivariable logistic regression (MLR) and the machine learning algorithms XGBoost, LightGBM, and Random Forest. Platt scaling was applied to improve the probability calibration of the XGBoost model. A final ensemble model was built by weighted averaging of the predicted probabilities from the calibrated XGBoost, LightGBM, and MLR models. Feature importance was analyzed using SHAP (SHapley Additive exPlanations), and clinical utility was evaluated via decision curve analysis (DCA) and calibration plots.
Result Respiratory rate, TNF-α, IL-1β, heart rate, pleural effusion, and body temperature were identified as the most important predictors. Other key features included total bilirubin, serum calcium, albumin/globulin ratio, and platelet count. The weighted ensemble model outperformed individual models, achieving an AUC of 0.792 on the test set.
Conclusion We developed a predictive tool based on multi-model ensemble learning and interpretable machine learning techniques (SHAP), which provides a basis for early risk stratification and prevention of acute respiratory failure in hospitalized CAP patients.
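The calibration and ensembling steps named in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores and labels, and the ensemble weights below are all hypothetical, since the abstract does not report them. Platt scaling is shown in a simplified form (a one-variable logistic fit by gradient descent, without the target smoothing of the original method), and it uses only the Python standard library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss.

    Simplified Platt scaling: raw model scores are mapped to probabilities
    with a one-variable logistic regression. Hyperparameters are illustrative.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_transform(scores, a, b):
    """Apply the fitted calibration map to raw scores."""
    return [sigmoid(a * s + b) for s in scores]


def weighted_average(prob_lists, weights):
    """Combine per-model probabilities with a normalized weighted mean."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


# Hypothetical held-out raw scores (e.g. boosted-tree margins) and outcomes.
raw_scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
outcomes = [0, 0, 1, 0, 1, 1]

a, b = fit_platt(raw_scores, outcomes)
calibrated = platt_transform(raw_scores, a, b)

# Hypothetical probabilities from two other models, and hypothetical weights.
other_model_1 = [0.20, 0.30, 0.55, 0.45, 0.70, 0.85]
other_model_2 = [0.25, 0.35, 0.50, 0.40, 0.65, 0.80]
ensemble = weighted_average(
    [calibrated, other_model_1, other_model_2], weights=[0.4, 0.3, 0.3]
)
```

In practice the calibration map would be fitted on data not used to train the underlying model, and the ensemble weights would be chosen on validation data rather than fixed by hand as they are here.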
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The author(s) received no specific funding for this work.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
The study has been approved by the Ethics Committee of Beijing Chaoyang Hospital, affiliated with Capital Medical University. Based on the committee's review, the research uses only anonymized data and does not involve direct human participation. Therefore, the Ethics Committee has waived the requirement for written informed consent.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
Data Availability
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.





