TY - CHAP
T1 - Overcoming Interpretability in Deep Learning Cancer Classification
AU - Teo, Yue Yang (Alan)
AU - Danilevsky, Artem
AU - Shomron, Noam
N1 - Publisher Copyright: © 2021, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2021
Y1 - 2021
N2 - Since its inception, deep learning has revolutionized the field of machine learning and data-driven science. One such data-driven science to be transformed by deep learning is genomics. In the past decade, numerous genomics studies have adopted deep learning, and its applications range from predicting regulatory elements to cancer classification. Despite its dominating efficacy in these applications, deep learning is not without drawbacks. A prominent shortcoming of deep learning is the lack of interpretability. Hence, the main objective of this study is to address this obstacle in deep learning cancer classification. Here we adopt a feature importance scoring methodology (Gradient-based class activation mapping, or Grad-CAM) on a quasi-recurrent neural network model that classifies cancer based on FASTA sequencing data. In this study, we formulate a nucleotide-to-genomic-region Grad-CAM scoring methodology and validate the use of this methodology for the chosen model. Consequently, this allows for the utilization of the Grad-CAM scoring methodology for feature importance in deep learning cancer classification. The results from our study identify potential novel candidate genes, genomic elements, and mechanisms for future cancer research.
AB - Since its inception, deep learning has revolutionized the field of machine learning and data-driven science. One such data-driven science to be transformed by deep learning is genomics. In the past decade, numerous genomics studies have adopted deep learning, and its applications range from predicting regulatory elements to cancer classification. Despite its dominating efficacy in these applications, deep learning is not without drawbacks. A prominent shortcoming of deep learning is the lack of interpretability. Hence, the main objective of this study is to address this obstacle in deep learning cancer classification. Here we adopt a feature importance scoring methodology (Gradient-based class activation mapping, or Grad-CAM) on a quasi-recurrent neural network model that classifies cancer based on FASTA sequencing data. In this study, we formulate a nucleotide-to-genomic-region Grad-CAM scoring methodology and validate the use of this methodology for the chosen model. Consequently, this allows for the utilization of the Grad-CAM scoring methodology for feature importance in deep learning cancer classification. The results from our study identify potential novel candidate genes, genomic elements, and mechanisms for future cancer research.
KW - Cancer classification
KW - Deep learning
UR - http://www.scopus.com/inward/record.url?scp=85102051477&partnerID=8YFLogxK
U2 - 10.1007/978-1-0716-1103-6_15
DO - 10.1007/978-1-0716-1103-6_15
M3 - Chapter
C2 - 33606264
AN - SCOPUS:85102051477
T3 - Methods in Molecular Biology
SP - 297
EP - 309
BT - Methods in Molecular Biology
PB - Humana Press Inc.
ER -