TY - JOUR
T1 - A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations
AU - Thomas, Mara
AU - Jensen, Frants H.
AU - Averly, Baptiste
AU - Demartsev, Vlad
AU - Manser, Marta B.
AU - Sainburg, Tim
AU - Roch, Marie A.
AU - Strandburg-Peshkin, Ariana
N1 - Publisher Copyright: © 2022 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
PY - 2022/8
Y1 - 2022/8
N2 - Background: The manual detection, analysis and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighbourhood-based dimensionality reduction of spectrograms to produce a latent space representation of calls stands out for its conceptual simplicity and effectiveness. Goal of the study/what was done: Using a dataset of manually annotated meerkat Suricata suricatta vocalizations, we demonstrate how this method can be used to obtain meaningful latent space representations that reflect the established taxonomy of call types. We analyse strengths and weaknesses of the proposed approach, give recommendations for its usage and show application examples, such as the classification of ambiguous calls and the detection of mislabelled calls. What this means: All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
KW - UMAP
KW - animal sounds
KW - animal vocalizations
KW - bioacoustics
KW - call classification
KW - dimensionality reduction
KW - spectrogram
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85131521640&partnerID=8YFLogxK
U2 - 10.1111/1365-2656.13754
DO - 10.1111/1365-2656.13754
M3 - Article
C2 - 35657634
AN - SCOPUS:85131521640
SN - 0021-8790
VL - 91
SP - 1567
EP - 1581
JO - Journal of Animal Ecology
JF - Journal of Animal Ecology
IS - 8
ER -