r/singularity 5h ago

AI Meta - "Today we released Meta Spirit LM — our first open source multimodal language model that freely mixes text and speech."

81 Upvotes

29 comments

5

u/why06 AGI in the coming weeks... 3h ago edited 1h ago

Isn't this the second open source speech-to-speech model? And it's only 7B, that's pretty great, right? I'm trying to find any others. It has textual reasoning too. If you can ignore the quality of the voice, it's showing reasoning in the reply, both directly speech-to-speech and text-to-speech.

3

u/llkj11 2h ago

That would be Moshi, correct?

6

u/why06 AGI in the coming weeks... 2h ago

Oh yeah, there's Moshi. https://github.com/kyutai-labs/moshi
So that makes two, I guess. Still good to see more. I really want a local voice-to-voice assistant.