SepIt: Approaching a Single Channel Speech Separation Bound

Shahar Lutati, Eliya Nachmani, Lior Wolf

Research output: Contribution to journal, conference article, peer-review

8 Scopus citations

Abstract

We present an upper bound for the Single Channel Speech Separation task, which is based on an assumption regarding the nature of short segments of speech. Using the bound, we are able to show that while recent methods have made great progress for a few speakers, there is room for improvement for five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimation of the different speakers. At test time, SepIt uses a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms the state-of-the-art neural networks for 2, 3, 5, and 10 speakers.
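The per-sample stopping rule described in the abstract can be sketched as a loop that refines an estimate and halts once the mutual information gain plateaus. This is a minimal illustrative sketch, not the paper's actual criterion: `refine_step` and `gaussian_mi` are hypothetical names, the network is replaced by a toy refinement function, and MI is approximated under a joint-Gaussian assumption via the Pearson correlation.

```python
import numpy as np

def gaussian_mi(x, y):
    """MI between two 1-D signals under a joint-Gaussian assumption:
    I(X; Y) = -1/2 * log(1 - rho^2), with rho the Pearson correlation."""
    rho = float(np.clip(np.corrcoef(x, y)[0, 1], -0.999999, 0.999999))
    return -0.5 * np.log(1.0 - rho ** 2)

def separate_iteratively(mixture, refine_step, max_iters=10, mi_tol=0.05):
    """Apply refine_step repeatedly; stop once the MI between the
    mixture and the current estimate stops increasing appreciably."""
    estimate = np.zeros_like(mixture)
    prev_mi = 0.0
    for n_iters in range(1, max_iters + 1):
        estimate = refine_step(mixture, estimate)
        mi = gaussian_mi(mixture, estimate)
        if mi - prev_mi < mi_tol:
            break  # MI gain plateaued: no further iterations for this sample
        prev_mi = mi
    return estimate, n_iters

# Toy usage: each "refinement" halves the gap to the mixture and adds a
# little noise, so MI gains shrink and the loop halts early.
rng = np.random.default_rng(0)
mixture = rng.standard_normal(16000)  # 1 s at 16 kHz, stand-in for a mixture
step = lambda mix, est: est + 0.5 * (mix - est) + 0.01 * rng.standard_normal(mix.shape)
estimate, n_iters = separate_iteratively(mixture, step)
```

Because the stopping test compares successive MI estimates, different test samples naturally receive different numbers of iterations, which mirrors the varying-iteration behavior the abstract describes.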

Original language: English
Pages (from-to): 5323-5327
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2022-September
DOIs
State: Published - 2022
Event: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: 18 Sep 2022 - 22 Sep 2022

Funding

Funders (funder number):
European Research Council
Horizon 2020 (ERC CoG 725974)

Keywords

• deep learning
• single channel
• speech separation
