The following steps provide a recipe for transferring the voice of one speaker to the recording of another speaker. The code is structured in three parts: Voice Impersonation for CPU (main), a model, and utilities. We discuss how to run the main part and explain what it does. Whenever the main part refers to the model or the utilities, we give a high-level explanation of what the referenced method does, but omit the details for the sake of brevity.
The following code can be found in Voice Impersonation.ipynb:
- Import PyTorch utilities, the neural network model, and math for some basic computations:
import math
from torch.autograd import Variable
from voice_impersonation_utils import *
from voice_impersonation_model ...