Blind source separation is addressed, using a novel data-driven approach, based on a well-established probabilistic model. The proposed method is specifically designed for separation of multichannel audio mixtures. The algorithm relies on spectral decomposition of the correlation matrix between different time frames. The probabilistic model implies that the column space of the correlation matrix is spanned by the probabilities of the various speakers across time. The number of speakers is recovered by the eigenvalue decay, and the eigenvectors form a simplex of the speakers' probabilities. Time frames dominated by each of the speakers are identified exploiting convex geometry tools on the recovered simplex. The mixing acoustic channels are estimated utilizing the identified sets of frames, and a linear umixing is performed to extract the individual speakers. The derived simplexes are visually demonstrated for mixtures of two, three, and four speakers. We also conduct a comprehensive experimental study, showing high separation capabilities in various reverberation conditions.
- Blind audio source separation (BASS)
- relative transfer function (RTF)
- spectral decomposition