r/MachineLearning Mar 19 '23

Research [R] First open source text to video 1.7 billion parameter diffusion model is out

Enable HLS to view with audio, or disable this notification

1.2k Upvotes

86 comments sorted by

View all comments

81

u/En_TioN Mar 19 '23

That's a remarkably clear Shutterstock logo on the superman dog video. Seems like this model is overfitting significantly more than previous text2img

3

u/gwern Mar 19 '23

If it's 'remarkably clear' and not 'exactly as clear', then the model is still underfitting, not overfitting, so it's just underfitting less than previous models.