TY - GEN
T1 - Deep ranking-based sound source localization
AU - Opochinsky, Renana
AU - Laufer-Goldshtein, Bracha
AU - Gannot, Sharon
AU - Chechik, Gal
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Sound source localization is a cumbersome task in challenging reverberation conditions. Recently, there has been growing interest in developing learning-based localization methods. In this approach, acoustic features are extracted from the measured signals and then given as input to a model that maps them to the corresponding source positions. Typically, a massive dataset of labeled samples from known positions is required to train such models. Here, we present a novel weakly-supervised deep-learning localization method that exploits only a few labeled (anchor) samples with known positions, together with a larger set of unlabeled samples, for which we only know their relative physical ordering. We design an architecture that uses a stochastic combination of triplet-ranking loss for the unlabeled samples and physical loss for the anchor samples, to learn a nonlinear deep embedding that maps acoustic features to an azimuth angle of the source. The combined loss can be optimized effectively using a standard gradient-based approach. Evaluating the proposed approach on simulated data, we demonstrate its significant improvement over two previous learning-based approaches for various reverberation levels, while maintaining consistent performance with varying sizes of labeled data.
AB - Sound source localization is a cumbersome task in challenging reverberation conditions. Recently, there has been growing interest in developing learning-based localization methods. In this approach, acoustic features are extracted from the measured signals and then given as input to a model that maps them to the corresponding source positions. Typically, a massive dataset of labeled samples from known positions is required to train such models. Here, we present a novel weakly-supervised deep-learning localization method that exploits only a few labeled (anchor) samples with known positions, together with a larger set of unlabeled samples, for which we only know their relative physical ordering. We design an architecture that uses a stochastic combination of triplet-ranking loss for the unlabeled samples and physical loss for the anchor samples, to learn a nonlinear deep embedding that maps acoustic features to an azimuth angle of the source. The combined loss can be optimized effectively using a standard gradient-based approach. Evaluating the proposed approach on simulated data, we demonstrate its significant improvement over two previous learning-based approaches for various reverberation levels, while maintaining consistent performance with varying sizes of labeled data.
KW - acoustic source localization
KW - deep embedding learning
KW - relative transfer function
KW - triplet-loss
UR - http://www.scopus.com/inward/record.url?scp=85075709769&partnerID=8YFLogxK
U2 - 10.1109/WASPAA.2019.8937159
DO - 10.1109/WASPAA.2019.8937159
M3 - Conference contribution
AN - SCOPUS:85075709769
T3 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
SP - 283
EP - 287
BT - 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2019
Y2 - 20 October 2019 through 23 October 2019
ER -