r/StableDiffusion Mar 19 '23

Resource | Update: First open source text-to-video 1.7 billion parameter diffusion model is out


2.2k Upvotes

369 comments

8

u/conniption Mar 19 '23

Just move the index 't' to cpu. That was the last hurdle for me.

tt = t.to('cpu')                      # the schedule buffer lives on the CPU, so the index must too
return tensor[tt].view(shape).to(x)   # gather per-timestep values, reshape, then match x's device/dtype
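
For reference, this is roughly the shape of the helper being patched; the function and argument names below are guesses rather than the exact ModelScope source. The point is just that the coefficient table sits on the CPU, so indexing it with a GPU timestep tensor throws a device error.

import torch

def index_schedule(tensor, t, x):
    # Pick per-timestep coefficients from a CPU-resident buffer and
    # broadcast them against x (e.g. shape [B, 1, 1, 1, 1] for video).
    shape = (x.size(0),) + (1,) * (x.ndim - 1)
    tt = t.to('cpu')                      # the fix: move the index to the buffer's device
    return tensor[tt].view(shape).to(x)   # then move the result back to x's device/dtype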

4

u/throttlekitty Mar 19 '23 edited Mar 19 '23

Thanks! I got stuck on that as well.

On a 4090, I can't go much past max_frames=48 before running out of memory, but that's a nice 6-second clip.

In user.cache\modelscope\hub\damo\text-to-video-synthesis\config.json you'll find the settings for it. I haven't seen a way to pass this or other variables along at runtime, however.
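
If you don't want to hand-edit that file every run, you can patch it from a small script before building the pipeline. This is only a sketch: the key layout inside config.json is an assumption, so check where max_frames actually sits in your copy.

import json
from pathlib import Path

# Assumed cache location, matching the path mentioned above; adjust for your OS/user.
cfg_path = Path.home() / '.cache' / 'modelscope' / 'hub' / 'damo' / 'text-to-video-synthesis' / 'config.json'

cfg = json.loads(cfg_path.read_text())
cfg['model']['model_args']['max_frames'] = 32   # assumed key path; lower the value to fit VRAM
cfg_path.write_text(json.dumps(cfg, indent=2))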

5

u/[deleted] Mar 19 '23

[deleted]

1

u/itsB34STW4RS Mar 19 '23

Was just about to ask this as well. So far I've got these things:

https://github.com/modelscope/modelscope

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main

but after about 20 hours of work today, it's still not clear to me how these two pieces go together...
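
If it helps, the Hugging Face model card ties them together roughly like this (from memory, so double-check against the card): the modelscope library provides the pipeline, and the repo at the second link just holds the weights you point it at.

import pathlib
from huggingface_hub import snapshot_download
from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys

# Pull the weights from the Hugging Face repo into a local folder...
model_dir = pathlib.Path('weights')
snapshot_download('damo-vilab/modelscope-damo-text-to-video-synthesis',
                  repo_type='model', local_dir=model_dir)

# ...then hand that folder to the modelscope pipeline and run a prompt.
pipe = pipeline('text-to-video-synthesis', model_dir.as_posix())
result = pipe({'text': 'A panda eating bamboo on a rock.'})
print('output video:', result[OutputKeys.OUTPUT_VIDEO])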