Retrospective Machine Learning Approach for Forecasting In-Hospital Death in ICU Patients After Cardiac Arrest

Abstract

Accurate identification of patients at high risk of in-hospital mortality in intensive care units (ICUs) is vital for enhancing clinical decision-making and improving patient care strategies. As traditional statistical models often fall short in modeling nonlinear and multifactorial clinical variables, this study explores a machine learning (ML) approach to overcome these limitations.

We conducted a retrospective study using the MIMIC-IV database, focusing on 2,385 ICU patients who met predefined eligibility criteria. Numerical features were summarized through statistical aggregations (maximum, minimum, mean), while categorical attributes underwent structured encoding. The dataset was split into 70% for training and 30% for validation. For feature selection, we combined regularization techniques (LASSO, Ridge, ElasticNet) with Random Forest-based importance ranking.
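The preprocessing steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the column names (`stay_id`, `heart_rate`, `lactate`, `admission_type`) are hypothetical, one-hot encoding stands in for the unspecified "structured encoding", and L1-penalized logistic regression stands in for the LASSO step on a binary outcome.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic long-format measurements: five rows per ICU stay (hypothetical columns).
n_stays = 200
events = pd.DataFrame({
    "stay_id": np.repeat(np.arange(n_stays), 5),
    "heart_rate": rng.normal(85, 15, n_stays * 5),
    "lactate": rng.gamma(2.0, 1.0, n_stays * 5),
})
static = pd.DataFrame({
    "stay_id": np.arange(n_stays),
    "admission_type": rng.choice(["EMERGENCY", "URGENT", "ELECTIVE"], n_stays),
})

# Summarize each numerical variable per stay with max, min, and mean.
agg = events.groupby("stay_id").agg(["max", "min", "mean"])
agg.columns = [f"{var}_{stat}" for var, stat in agg.columns]

# One-hot encode the categorical attribute and join on stay_id.
cats = pd.get_dummies(static.set_index("stay_id"), dtype=float)
X = agg.join(cats)

# Synthetic outcome driven by mean lactate, purely for illustration.
logit = 1.5 * (X["lactate_mean"] - X["lactate_mean"].mean())
y = (rng.random(n_stays) < 1 / (1 + np.exp(-logit))).astype(int)

# 70% training / 30% validation, stratified on the outcome.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# L1-penalized logistic regression: features with nonzero coefficients are kept.
scaler = StandardScaler().fit(X_train)
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
lasso.fit(scaler.transform(X_train), y_train)
selected = [c for c, w in zip(X.columns, lasso.coef_[0]) if w != 0.0]
```

The scaler and the selector are fitted on the training portion only, so the validation data do not influence which features are retained. The penalty strength `C=0.1` is an arbitrary choice for this sketch.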

Multiple supervised ML algorithms, including CatBoost, XGBoost, and Support Vector Machines, were benchmarked using metrics such as AUC-ROC, calibration plots, and decision curve analysis. SHAP values were employed to enhance model explainability.
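As a rough illustration of two of the metrics named above, the sketch below computes AUC-ROC from its rank-based definition and the net benefit used in decision curve analysis. The labels, scores, and model names (`model_a`, `model_b`) are made up for the example and do not correspond to any result in this study.

```python
import numpy as np

def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Pairwise comparison; ties count as half a win.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def net_benefit(y_true, scores, threshold):
    """Decision-curve net benefit of flagging patients with score >= threshold."""
    y_true = np.asarray(y_true)
    flagged = np.asarray(scores) >= threshold
    n = len(y_true)
    tp = np.sum(flagged & (y_true == 1))
    fp = np.sum(flagged & (y_true == 0))
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy validation labels and predicted risks from two hypothetical models.
y_val = np.array([0, 0, 1, 1, 0, 1, 0, 1])
model_scores = {
    "model_a": np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.7]),
    "model_b": np.array([0.5, 0.6, 0.4, 0.7, 0.3, 0.45, 0.55, 0.65]),
}

results = {name: auc_roc(y_val, s) for name, s in model_scores.items()}
best = max(results, key=results.get)
```

Plotting `net_benefit` over a range of thresholds for each model, alongside the "flag everyone" and "flag no one" strategies, gives the decision curve. The pairwise AUC computation is quadratic in sample size and is meant for clarity rather than efficiency.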

The CatBoost algorithm achieved the most favorable results, with AUC scores of 0.904 and 0.868 on the training and test sets, respectively. These findings suggest that the proposed model offers a reliable, interpretable, and potentially integrable solution for ICU mortality risk prediction.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The author(s) received no specific funding for this work.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study was conducted using the MIMIC-IV database, a de-identified and publicly available dataset. Ethical approval was not required. Access was granted under the credentialed data use agreement (Certification Number: 50778029). Patient consent was waived as the dataset used is fully de-identified and publicly available.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Footnotes

Yong Si: Email: yongsi@usc.edu

Li Sun: Email: lsun4765@usc.edu

Shuheng Chen: Email: shuhengc@usc.edu

JunYi Fan: Email: junyifan@usc.edu

Kamiar Alaei: Email: kamiar.alaei@csulb.edu

Elham Pishgar: Email: dr.elhampishgar@gmail.com

Greg Placencia: Email: gvplacencia@cpp.edu

Maryam Pishgar: Email: pishgar@usc.edu

medRxiv - Cardiovascular Medicine