TY - JOUR
T1 - Comparison of deep learning models for natural language processing-based classification of non-English head CT reports
AU - Barash, Yiftach
AU - Guralnik, Gennadiy
AU - Tau, Noam
AU - Soffer, Shelly
AU - Levy, Tal
AU - Shimon, Orit
AU - Zimlichman, Eyal
AU - Konen, Eli
AU - Klang, Eyal
N1 - Publisher Copyright:
© 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2020/10/1
Y1 - 2020/10/1
AB - Purpose: Natural language processing (NLP) can be used for automatic flagging of radiology reports. We assessed deep learning models for classifying non-English head CT reports. Methods: We retrospectively collected head CT reports (2011–2018). Reports were signed in Hebrew. Emergency department (ED) reports of adult patients from January to February of each year (2013–2018) were manually labeled. All other reports were used to pre-train an embedding layer. We explored two use cases: (1) a general labeling use case, in which reports were labeled as normal vs. pathological; (2) a specific labeling use case, in which reports were labeled as with or without intra-cranial hemorrhage. We tested long short-term memory (LSTM) and LSTM-attention (LSTM-ATN) networks for classifying reports. We also evaluated the improvement gained by adding Word2Vec word embeddings. Deep learning models were compared with a bag-of-words (BOW) model. Results: We retrieved 176,988 head CT reports for pre-training. We manually labeled 7784 reports as normal (46.3%) or pathological (53.7%); 7.1% had intra-cranial hemorrhage. For the general labeling task, LSTM-ATN-Word2Vec showed the best results (AUC = 0.967 ± 0.006, accuracy = 90.8% ± 1.0%). For the specific labeling task, all methods showed similar accuracies, between 95.0% and 95.9%. Both LSTM-ATN-Word2Vec and BOW had the highest AUC (0.970). Conclusion: For a general use case, pre-training word embeddings on a large cohort of non-English head CT reports, combined with an attention mechanism (ATN), improves NLP performance. For a more specific task, BOW and deep learning showed similar results. Models should be explored and tailored to the NLP task.
KW - Attention
KW - Deep learning
KW - Emergency service, hospital
KW - Natural language processing
KW - Tomography, X-ray computed
UR - http://www.scopus.com/inward/record.url?scp=85084135493&partnerID=8YFLogxK
DO - 10.1007/s00234-020-02420-0
M3 - Article
C2 - 32335686
AN - SCOPUS:85084135493
SN - 0028-3940
VL - 62
SP - 1247
EP - 1256
JO - Neuroradiology
JF - Neuroradiology
IS - 10
ER -