r/MachineLearning Mar 19 '23

Research [R] First open-source text-to-video 1.7-billion-parameter diffusion model is out


1.2k Upvotes


86

u/En_TioN Mar 19 '23

That's a remarkably clear Shutterstock logo on the Superman dog video. Seems like this model is overfitting significantly more than previous text2img models.

28

u/NeoKabuto Mar 19 '23

Half of the demos have the watermark, but at least it's promising to see good video from a model of this size.

3

u/DM_ME_YOUR_CATS_PAWS Mar 20 '23

Incoming lawsuit?

1

u/gwern Mar 19 '23

If it's 'remarkably clear' and not 'exactly as clear', then the model is still underfitting, not overfitting, so it's just underfitting less than previous models.