Since I posted all those details, I figured I'd have a go at it myself.

  • I used Spleeter on a set of early songs by America (Daisy Jane, God of the Sun, Ventura Highway, To Each His Own), and split them into vocals and backing track.
  • In Audacity, I loaded the isolated vocals, and edited together Gerry Beckley's solo parts for 2:15 of medium/low quality vocals.
  • I used the previously mentioned Google Colab notebook to create a Gerry Beckley vocal model. It took about 25 minutes to run 200 epochs. A high-quality training run would be around 1000 epochs, depending on the training data. Audio upload/download times to Google Colab are fairly slow.
  • I saved the resulting model to my Google Drive (making sure the code didn't do anything malicious to my drive first!).
  • I used Spleeter to isolate the vocals and backing track on America's "Tin Man", which is sung by Dewey Bunnell.
  • I used the same Google Colab notebook with the Gerry Beckley model to convert Dewey Bunnell's isolated lead vocals from "Tin Man". The result was what you'd expect from dirty stems and a low number of epochs. The voice was clearly Gerry Beckley, but with relatively low sound quality and some poorly converted phonemes.
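
For anyone who would rather script the Spleeter step than type it by hand, here's a rough sketch of the vocals/backing split. It assumes the Spleeter 2.x command line (`spleeter separate` with the `spleeter:2stems` model) is installed. The function names, the file name and the `stems` folder are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def spleeter_command(song, out_dir="stems"):
    """Build a Spleeter CLI call that splits one song into two stems."""
    # spleeter:2stems is the pretrained vocals/accompaniment configuration.
    return ["spleeter", "separate", "-p", "spleeter:2stems", "-o", out_dir, song]


def split_song(song, out_dir="stems"):
    """Run Spleeter if it is installed; return the folder holding the stems."""
    if shutil.which("spleeter") is None:
        raise RuntimeError("spleeter is not on PATH (try: pip install spleeter)")
    subprocess.run(spleeter_command(song, out_dir), check=True)
    # Spleeter 2.x writes <out_dir>/<song name>/vocals.wav and accompaniment.wav.
    return Path(out_dir) / Path(song).stem


if __name__ == "__main__":
    # Hypothetical file name; print the command instead of running it.
    print(" ".join(spleeter_command("tin_man.mp3")))
```

Calling `split_song("tin_man.mp3")` should leave the isolated vocals in `stems/tin_man/vocals.wav`, ready to load into Audacity or upload to the notebook.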


Obviously I can't post the results. ;)

I also downloaded Replay. Recall that at heart, Replay is just a wrapper around RVC:

  • Since I don't have a supported Nvidia video card, it could only run in CPU mode, so it was slow.
  • I needed to watch the video to figure out how to use it. It didn't help that some of the options can't be seen without scrolling the screen.
  • I tried creating a voice model. It took Google Colab 25 minutes to create the low-quality model. In that same time, Replay completed only three epochs, so I cancelled that.
  • I downloaded the RVC voice model I'd created in Google Colab from Google Drive. It was a .gz file, so it had to be decompressed several times using 7-Zip. The .pth file contains the model, and is found in the \assets\weights folder.
  • It took about 17 minutes to convert the audio file using the voice model.
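
If you'd rather not click through 7-Zip several times, the unpacking step can be sketched in Python. This assumes the download is a gzip-compressed tar archive, which is my guess at why it needed more than one pass. The archive name and the `rvc_model` output folder are hypothetical. Only unpack archives you trust.

```python
import tarfile
from pathlib import Path


def extract_pth(archive, dest="rvc_model"):
    """Unpack a .tar.gz archive and return the .pth model files found inside."""
    # "r:gz" handles the gzip layer and the tar layer in one pass.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Search every subfolder (for example assets/weights) for model files.
    return sorted(Path(dest).rglob("*.pth"))


if __name__ == "__main__":
    import sys

    # Usage: python unpack_model.py <downloaded archive>
    if len(sys.argv) > 1:
        for model in extract_pth(sys.argv[1]):
            print(model)
```

The returned path is the file to point Replay at when it asks for a voice model.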


So using the Google Colab notebook is an easy way to try out RVC for free. The one I linked to is by no means the only or best one out there, but it is super easy and runs in the free tier of Google Colab. And if you've got the paid version of Google Colab, you have access to the RVC GUI, where you can see how the training is doing, or restart training on an existing model.

On the other hand, if you've got a supported Nvidia card, it looks pretty easy to run RVC on your machine, especially if you use something pre-packaged like Replay.
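
To put rough numbers on why the GPU matters, here's a back-of-the-envelope extrapolation from the timings I reported: 200 epochs in 25 minutes on Colab, and 3 epochs in 25 minutes with Replay in CPU mode. It assumes training time scales linearly with the number of epochs, which is only an approximation, and the function names are just for illustration.

```python
def minutes_per_epoch(minutes, epochs):
    """Average training time per epoch, in minutes."""
    return minutes / epochs


def estimate_minutes(rate, target_epochs):
    """Extrapolate total training time, assuming linear scaling."""
    return rate * target_epochs


colab = minutes_per_epoch(25, 200)     # free-tier Colab run
replay_cpu = minutes_per_epoch(25, 3)  # Replay in CPU mode

print(f"Colab, 1000 epochs: about {estimate_minutes(colab, 1000):.0f} minutes")
print(f"Replay on CPU, 200 epochs: about {estimate_minutes(replay_cpu, 200) / 60:.0f} hours")
```

By that estimate, a 1000-epoch run on Colab would take a bit over two hours, while matching even my 200-epoch model on the CPU would take more than a day.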

I hope this helps!


-- David Cuny
My virtual singer development blog

Vocal control, you say. Never heard of it. Is that some kind of ProTools thing?