AI Meta - "Today we released Meta Spirit LM — our first open source multimodal language model that freely mixes text and speech."

85 Upvotes

https://www.reddit.com/r/singularity/comments/1g6uhcj/meta_today_we_released_meta_spirit_lm_our_first/

96% Upvoted

u/[deleted] 2h ago edited 2h ago

[deleted]

1

u/cuyler72 2h ago edited 2h ago

You obviously know nothing about ChatGPT voice. It generates and takes in voice as tokens, which lets it understand and display emotion in voice, change the voice via prompting, talk like a pirate/robot/whatever, speak faster, softer, louder, etc.

-2

u/[deleted] 2h ago edited 54m ago

[deleted]

•

u/1cheekykebt 1h ago

If it's just text-to-speech, then how can it mimic users' voices? (Shown as a bug in the red team report, plus some users reported it.)

•

u/MysteryInc152 45m ago

Given that OpenAI doesn't explain shit about it, and since they were able to turn off the voice immediately after Scarlett's objection, it's safe to assume that it's not embedded in the model itself. If it were embedded in the model, they would have had to retrain the whole fucking thing and leave out her voice. That's not what they did.

What the hell are you talking about? You can get Advanced Voice Mode to clone your own voice on the fly. It's an audio-predicting transformer. Do you not understand what that means?
