Halfbakery: Autotune for Lyrics

Culture: Alternate Soundtracks
Autotune for Lyrics (+6)
Guided generative text, fewer mondegreens

Autotune allows people who can't sing to perform vocals. This idea goes a step further, allowing people who struggle to articulate a sentence to sound coherent.

It uses existing generative text software, combined with existing audio deepfake technology, so that, when you mumble something into the microphone, well-formed words come out of the speakers, in your voice and in tune.

This would make possible a new game, where contestants would try to sing prompts of which the software could make no sense.
-- pertinax, Oct 22 2023

How do you know this is not already in general use? If it’s good you could never tell. Watch the lips?

MacOS has a function that allows you to record your voice and use it for entered text. After a bit of training, it’s uncanny. Combine with existing Chat AI and you are baked, but not at the speed that would allow public speaking. That’s a minor speed bump, though. Soon come.
-- minoradjustments, Oct 22 2023

[a1] I found my recorded voice very accurate but totally enervated. I like it. I may record some responses to the disinterested Sri Lankan who picks up the Help Line phone. I’m going to try it out on people who know me, and wait for the accusations of drug use.

What would happen if you layered the Speak Text output with autotune? Do you get Milli Vanilli again?
-- minoradjustments, Oct 22 2023

I think there should be a feature which either curtails a lisp or replaces words with an "s" sound, if that's not homophobic.
-- 4and20, Oct 22 2023

//Do you get Milli Vanilli again?// If Milli Vanilli fall over in a forest, does someone else make a sound?
-- hippo, Oct 22 2023

[4and20] What would you replace the ‘s’ words with? Who are we being kind to here? I’ll help but I don’t know who this helps.

[hippo] Yes, I hear cheering and James Brown.
-- minoradjustments, Oct 25 2023

//MacOS has a function that allows you to record your voice and use it for entered text.//

Huh, very tempting to use that for remote PowerPoint presentations. If I'm tempted, it's already happening. Forewarned is forearmed: I'll have to think of ways to check that whoever appears to be presenting is actually presenting.
-- bs0u0155, Oct 25 2023

//you'll. know.//

Change your real speaking voice to match... or just wait. The simulation will get better, and all the while exposure to real speaking voices declines. Already, huge chunks of YouTube are generated voices. TikTok takes real voices, chops them up to remove gaps and changes the speed. Podcast apps do the same thing. What fraction of the speech that young people hear is real vs manipulated?
-- bs0u0155, Oct 25 2023

Once that MacOS ability is trained initially, it listens and refines as you use it. As I understand it, the work is done in the cloud IRT, but the data resides on your device and is not dependent upon a cloud connection to operate once socialized, and there’s no data retained in the cloud. Anyone who knows me could tell that it is a canned voice construct, but strangers might worry for my health yet accept the voice as real. It's pretty good as to timbre and pitch, gravelly-ness and noise, but bad at pacing and expression.
-- minoradjustments, Oct 25 2023
