Document Type
Article
Publication Date
2019
Abstract
Sepsis costs US hospitals more each year than any other health condition. A majority of patients who suffer from sepsis are not diagnosed at the time of admission. Early detection and antibiotic treatment of sepsis are vital to improving outcomes for these patients, as each hour of delayed treatment is associated with increased mortality. In this study, our goal is to predict sepsis 12 hours before its diagnosis using vitals and blood tests routinely taken in the ICU. We investigated the performance of several machine learning algorithms, including XGBoost, CNN, CNN-LSTM, and CNN-XGBoost. Contrary to our expectations, XGBoost outperformed all of the sequential models and yielded the best hour-by-hour prediction, perhaps because the way we imputed missing values lost signal related to the time-series nature of the EHR data. We added feature engineering to detect change points in tests and vitals, resulting in a 5% improvement in XGBoost performance. Our team, USF-Sepsis-Phys, achieved a utility score of 0.22 (untuned threshold) and an average of the three reported AUCs (test sets A, B, C) of 0.82. As expected with this AUC, the same model with a tuned threshold (not run in the PhysioNet challenge) performed significantly better, as evaluated with 3-fold cross-validation on the entire PhysioNet training set.
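The abstract contrasts an untuned decision threshold with a tuned one. The sketch below shows what threshold tuning can look like in general; it is not the authors' code. The function names `f1_score` and `tune_threshold`, the candidate grid, and the use of F1 as the objective are all assumptions made for illustration. The challenge itself scored entries with its own utility function, which is not reproduced here.

```python
from typing import Callable, Optional, Sequence, Tuple


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Stand-in objective (assumption): F1, not the challenge utility score."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    metric: Callable[[Sequence[int], Sequence[int]], float] = f1_score,
    candidates: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return the (threshold, metric value) pair that maximizes the metric.

    Hypothetical helper: scans a fixed grid of cutoffs and keeps the first
    one that gives the highest metric value on the supplied data.
    """
    if candidates is None:
        # Assumed grid: 0.01, 0.02, ..., 0.99
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_m = 0.5, float("-inf")
    for t in candidates:
        # Predict positive when the model score reaches the cutoff
        preds = [1 if s >= t else 0 for s in scores]
        m = metric(labels, preds)
        if m > best_m:
            best_t, best_m = t, m
    return best_t, best_m


# Toy data (made up for illustration): hourly model scores and true labels
scores = [0.1, 0.2, 0.3, 0.35, 0.9]
labels = [0, 0, 1, 1, 1]
best_t, best_m = tune_threshold(scores, labels)
```

In practice the threshold would be chosen on held-out folds rather than on the data used for evaluation, for example inside each split of a 3-fold cross-validation, so that the tuned score is not optimistically biased.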
DOI
10.22489/CinC.2019.411
Recommended Citation
Sarafrazi, Soodabeh; Choudhari, Rohini; Mehta, Chiral; Mehta, Himanshi; Japalaghi, Omid K.; Han, Jie; Mehta, Kinjal A.; Han, Hyunyoung; and Francis-Lyon, Patricia A., "Cracking the “Sepsis” Code: Assessing Time Series Nature of EHR data, and Using Deep Learning for Early Sepsis Prediction" (2019). Nursing and Health Professions Faculty Research and Publications. 160.
https://repository.usfca.edu/nursing_fac/160
Comments
Originally published online by Computing in Cardiology 2019, Vol 46