Dynamically Sacrificing Accuracy for Reduced Computation: Cascaded Inference Based on Softmax Confidence

Konstantin Berestizshevsky, Guy Even

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We study the tradeoff between computational effort and classification accuracy in a cascade of deep neural networks. During inference, the user sets the acceptable accuracy degradation, which then automatically determines confidence thresholds for the intermediate classifiers. As soon as a confidence threshold is met, inference terminates immediately without having to compute the output of the complete network. Confidence levels are derived directly from the softmax outputs of intermediate classifiers, as we do not train special decision functions. We show that using a softmax output as a confidence measure in a cascade of deep neural networks leads to a reduction of 15%–50% in the number of MAC operations while degrading the classification accuracy by roughly 1%. Our method can be easily incorporated into pre-trained non-cascaded architectures, as we exemplify on ResNet. Our main contribution is a method that dynamically adjusts the tradeoff between accuracy and computation without retraining the model.
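The early-exit mechanism described in the abstract can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration: `stages` stands in for the cascade's intermediate classifiers (each mapping an input to class logits), and the confidence thresholds are simply passed in, whereas the paper derives them automatically from the user's acceptable accuracy degradation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logits vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def cascaded_inference(x, stages, thresholds):
    """Run classifier stages in order, exiting as soon as the top
    softmax probability meets that stage's confidence threshold.

    stages:     list of callables mapping x -> logits (hypothetical API)
    thresholds: per-stage confidence thresholds in [0, 1]
    Returns (predicted class, confidence). If no intermediate stage is
    confident enough, the final stage's answer is used unconditionally.
    """
    for stage, tau in zip(stages[:-1], thresholds[:-1]):
        probs = softmax(stage(x))
        if probs.max() >= tau:  # early exit: skip the remaining stages
            return int(probs.argmax()), float(probs.max())
    probs = softmax(stages[-1](x))  # last stage always answers
    return int(probs.argmax()), float(probs.max())
```

Raising a threshold makes that stage defer more inputs to later (more expensive) stages, which is exactly the accuracy-vs-computation knob the method exposes.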

Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2019
Subtitle of host publication: Deep Learning - 28th International Conference on Artificial Neural Networks, Proceedings
Editors: Igor V. Tetko, Pavel Karpov, Fabian Theis, Vera Kurková
Publisher: Springer Verlag
Pages: 306-320
Number of pages: 15
ISBN (Print): 9783030304836
DOIs
State: Published - 2019
Event: 28th International Conference on Artificial Neural Networks, ICANN 2019 - Munich, Germany
Duration: 17 Sep 2019 – 19 Sep 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11728 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Artificial Neural Networks, ICANN 2019
Country/Territory: Germany
City: Munich
Period: 17/09/19 – 19/09/19

Keywords

  • Deep learning
  • Efficient inference
  • Neural networks
