Could this be made? Lectric Panda?

This forum is for discussing Rack Extensions. Devs are all welcome to show off their goods.
platzangst
Posts: 728
Joined: 16 Jan 2015

29 Sep 2020

Many years ago, there was an interesting and promising video put out about, well:



To try and condense: The software analyzes a chunk of sound, and breaks it into what we'd call "grains" today. Then, live audio is fed into the software, which compares the incoming signal with its previously analyzed grains, and using those grains, spits out as close an approximation as it can manage to the live signal.
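For anyone wondering what that could look like in code, here is a minimal sketch of the general idea (nearest-neighbour grain matching), not the method used in the video. Everything in it is an assumption made for illustration: the function names are hypothetical, the grains are fixed-size and non-overlapping, and each grain is described by only two cheap features (RMS level and zero-crossing rate). A real system would likely use spectral features, overlapping windows and crossfades.

```python
import math

def grain_features(grain):
    """Describe a grain by two cheap features: RMS level and zero-crossing rate."""
    rms = math.sqrt(sum(s * s for s in grain) / len(grain))
    crossings = sum(1 for a, b in zip(grain, grain[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(grain) - 1)
    return (rms, zcr)

def slice_grains(signal, grain_size):
    """Cut a signal into non-overlapping grains, dropping any short tail."""
    return [signal[i:i + grain_size]
            for i in range(0, len(signal) - grain_size + 1, grain_size)]

def build_corpus(source, grain_size):
    """Analysis pass: store each source grain alongside its features."""
    return [(grain_features(g), g) for g in slice_grains(source, grain_size)]

def resynthesize(live, corpus, grain_size):
    """Matching pass: replace each live grain with its nearest corpus grain."""
    out = []
    for grain in slice_grains(live, grain_size):
        target = grain_features(grain)
        # Pick the stored grain whose features are closest (Euclidean distance).
        _, best = min(corpus, key=lambda entry: math.dist(entry[0], target))
        out.extend(best)
    return out

def sine(freq, n, rate=8000, amp=0.5):
    """Test-signal helper: n samples of a sine wave."""
    return [amp * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# Pre-analyzed material: a low tone followed by a high tone.
source = sine(220, 1024) + sine(2200, 1024)
corpus = build_corpus(source, grain_size=256)

# "Live" input: a tone that is not in the corpus but is closer to the high one.
live = sine(1900, 512)
output = resynthesize(live, corpus, grain_size=256)
```

In this toy example the output is assembled entirely from the high-tone grains of the source, since those are the closest match to the live input. It runs offline on lists of samples; doing the same thing in real time inside a plugin would add latency and CPU constraints that the sketch ignores.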

14 years or so on, and this system, as near as I can tell, was never released. But recently I learned of some forays by Sony into creating software that could "deep fake" speech by taking an existing recording of someone talking, and using that to construct entirely new speech in the same voice.

My question/suggestion: could such a thing be done in the realm of Rack Extensions (or VST, I guess)? I'd think that, at the very least, hardware has improved enough since 2006 to make this kind of processing more feasible. Is there such a thing already out there and I just don't know about it? Lectric Panda has done granular-based REs, so might this be an interesting project to tackle? Or any other developer, for that matter? Such a thing would certainly have little competition at the moment...

miscend
Posts: 1955
Joined: 09 Feb 2015

29 Sep 2020

I've not watched the video yet, but it's probably possible with machine learning. There's this company that made a tube preamp emulation using machine learning. How the emulation actually works is a complete black box to the developers: they just fed it data and it figured out how to imitate the sound of a mic preamp.

https://www.accentize.com


Edit:
After watching the full vid, it seems you could train an AI to analyse a given audio input and match it to a database of sounds stored in the cloud somewhere. But I can't see how this would be useful as a Rack Extension. Maybe as an app aimed at consumers, though: some sort of karaoke app where your voice is matched to the sound of famous people.

As for the beatboxing DJ stuff at the end of the video: it's not quite the same, but there are already some voice-to-MIDI products out there.
https://vochlea.com
https://imitone.com

Loque
Moderator
Posts: 11173
Joined: 28 Dec 2015

29 Sep 2020

Isn't this just a REX player, playing short parts randomly or in some kind of order? I think a few REX players with some music loops, playing synced snippets, should do the job. Only the reconstruction of a source is missing.

A more modern approach is probably the AI thing. There are already things out there that categorize samples via AI, so beat detection and snippet slicing would just need to be added, and you'd get a "cloud" of speech, singing, drum hits, guitars and so on. Now put in a loop, and a new loop can be created out of it...

Sounds interesting... looking forward to a product... btw, I just take 50% licensing for the idea :-P
Reason12, Win10

PhillipOrdonez
Posts: 3732
Joined: 20 Oct 2017
Location: Norway

29 Sep 2020

For fuck sake, the tech to deep fake a voice should never be publicly available. For the little that is good that is left in this shit world, please... gah.

Dogcat
Posts: 29
Joined: 21 Sep 2019

29 Sep 2020

PhillipOrdonez wrote:
29 Sep 2020
For fuck sake, the tech to deep fake a voice should never be publicly available. For the little that is good that is left in this shit world, please... gah.
Too late... several deep fake AI engines for voice already exist. Here's just one example: https://vo.codes/

PhillipOrdonez
Posts: 3732
Joined: 20 Oct 2017
Location: Norway

29 Sep 2020

Dogcat wrote:
29 Sep 2020
PhillipOrdonez wrote:
29 Sep 2020
For fuck sake, the tech to deep fake a voice should never be publicly available. For the little that is good that is left in this shit world, please... gah.
Too late... several deep fake AI engines for voice already exist. Here's just one example: https://vo.codes/
Fucking everything. 😞

bxbrkrz
Posts: 3812
Joined: 17 Jan 2015

29 Sep 2020

757365206C6F67696320746F207365656B20616E73776572732075736520726561736F6E20746F2066696E6420776973646F6D20676574206F7574206F6620796F757220636F6D666F7274207A6F6E65206F7220796F757220696E737069726174696F6E2077696C6C206372797374616C6C697A6520666F7265766572

miscend
Posts: 1955
Joined: 09 Feb 2015

03 Oct 2020

bxbrkrz wrote:
29 Sep 2020
Soon you won't know what's real and what isn't.

jam-s
Posts: 3035
Joined: 17 Apr 2015
Location: Aachen, Germany

03 Oct 2020

miscend wrote:
03 Oct 2020
Soon you won't know what's real and what isn't.
If you think you know what's real now, you might not know what's real now. :twisted:

bxbrkrz
Posts: 3812
Joined: 17 Jan 2015

03 Oct 2020

I am watching the show "Raised by Wolves". Father is trying hard to make new jokes. Although it is a device to lighten a dark story, this is something no algorithm can do: create a funny new joke.
Fake Rogan is funny because the observers laugh at the jokes a human fed to the robot voice.

We are safe, until the Joke/Meme Singularity is a reality.
757365206C6F67696320746F207365656B20616E73776572732075736520726561736F6E20746F2066696E6420776973646F6D20676574206F7574206F6620796F757220636F6D666F7274207A6F6E65206F7220796F757220696E737069726174696F6E2077696C6C206372797374616C6C697A6520666F7265766572

teddymcw
Posts: 432
Joined: 13 May 2016

30 Oct 2020

PhillipOrdonez wrote:
29 Sep 2020
Dogcat wrote:
29 Sep 2020


Too late... several deep fake AI engines for voice already exist. Here's just one example: https://vo.codes/
Fucking everything. 😞
I think many would argue it's much, much safer for everyone to have access to deep learning tools and to be educated on what they are capable of, deep fakes included. The more distributed the knowledge and the tools themselves, the less tyrannical they can be.

This will be ubiquitous before very long, so there's no need to encapsulate it in RE format, hah. I can't really comprehend whether it will somehow be possible to write a detection algorithm that tells a GAN from a human, though. I'm mostly pondering this because at some point a GAN gets so close to the real thing that, in audio file form at least, it truly becomes indistinguishable in every quantifiable way from an original human speaking 'organically' : /
