As soon as we learn how to create a compact representation of audio via dilated convolutions, we can play with these learnings and have fun. You will find very cool demos on the internet:
- For instance, you can see how the model learns the sound of different musical instruments: (https://magenta.tensorflow.org/nsynth)
- Then, you can see how one model learned in one context can be re-mixed in another context. For instance, by changing the speaker identity, we can use WaveNet to say the same thing in different voices (https://deepmind.com/blog/wavenet-generative-model-raw-audio/) .
- Another very interesting experiment is to ...