We present a method for translating music across musical instruments and styles. This method is based on unsupervised training of a multi-domain WaveNet autoencoder, with a shared encoder and a domain-independent latent space that is trained end-to-end on waveforms. Because we employ a diverse training dataset and a large network capacity, the single encoder also allows us to translate from musical domains that were not seen during training. We evaluate our method on a dataset collected from professional musicians and achieve convincing translations. We also study the properties of the obtained translation and demonstrate translating even from a whistle, potentially enabling the creation of instrumental music by untrained humans.
|State|Published - 2019|
|Event|7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States (6 May 2019 → 9 May 2019)|