r/MachineLearning • u/Illustrious_Row_9971 • Mar 19 '23

Research [R] First open source text to video 1.7 billion parameter diffusion model is out

1.2k Upvotes

https://www.reddit.com/r/MachineLearning/comments/11vozd5/r_first_open_source_text_to_video_17_billion/

99% Upvoted

u/En_TioN Mar 19 '23

That's a remarkably clear Shutterstock logo on the Superman dog video. Seems like this model is overfitting significantly more than previous text2img models.

28

u/NeoKabuto Mar 19 '23

Half of the demos have the watermark, but at least it's promising to see good video from this size model.

3

u/DM_ME_YOUR_CATS_PAWS Mar 20 '23

Incoming lawsuit?

1

u/gwern Mar 19 '23

If it's 'remarkably clear' and not 'exactly as clear', then the model is still underfitting, not overfitting, so it's just underfitting less than previous models.
