TY - JOUR
T1 - Integration of feature vectors from raw laboratory, medication and procedure names improves the precision and recall of models to predict postoperative mortality and acute kidney injury
AU - Hofer, Ira S.
AU - Kupina, Marina
AU - Laddaran, Lori
AU - Halperin, Eran
N1 - Publisher Copyright:
© 2022, The Author(s).
PY - 2022/12
Y1 - 2022/12
N2 - Manuscripts that have successfully used machine learning (ML) to predict a variety of perioperative outcomes often use only a limited number of features selected by a clinician. We hypothesized that techniques leveraging a broad set of features for patient laboratory results, medications, and the surgical procedure name would improve performance as compared to a more limited set of features chosen by clinicians. Feature vectors for laboratory results included 702 features total derived from 39 laboratory tests, medications consisted of a binary flag for 126 commonly used medications, procedure name used the Word2Vec package for create a vector of length 100. Nine models were trained: baseline features, one for each of the three types of data Baseline + Each data type, (all features, and then all features with feature reduction algorithm. Across both outcomes the models that contained all features (model 8) (Mortality ROC-AUC 94.32 ± 1.01, PR-AUC 36.80 ± 5.10 AKI ROC-AUC 92.45 ± 0.64, PR-AUC 76.22 ± 1.95) was superior to models with only subsets of features. Featurization techniques leveraging a broad away of clinical data can improve performance of perioperative prediction models.
AB - Manuscripts that have successfully used machine learning (ML) to predict a variety of perioperative outcomes often use only a limited number of features selected by a clinician. We hypothesized that techniques leveraging a broad set of features for patient laboratory results, medications, and the surgical procedure name would improve performance as compared to a more limited set of features chosen by clinicians. Feature vectors for laboratory results included 702 features total derived from 39 laboratory tests, medications consisted of a binary flag for 126 commonly used medications, procedure name used the Word2Vec package for create a vector of length 100. Nine models were trained: baseline features, one for each of the three types of data Baseline + Each data type, (all features, and then all features with feature reduction algorithm. Across both outcomes the models that contained all features (model 8) (Mortality ROC-AUC 94.32 ± 1.01, PR-AUC 36.80 ± 5.10 AKI ROC-AUC 92.45 ± 0.64, PR-AUC 76.22 ± 1.95) was superior to models with only subsets of features. Featurization techniques leveraging a broad away of clinical data can improve performance of perioperative prediction models.
UR - http://www.scopus.com/inward/record.url?scp=85132419885&partnerID=8YFLogxK
U2 - 10.1038/s41598-022-13879-7
DO - 10.1038/s41598-022-13879-7
M3 - ???researchoutput.researchoutputtypes.contributiontojournal.article???
C2 - 35715454
AN - SCOPUS:85132419885
SN - 2045-2322
VL - 12
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 10254
ER -