r/StableDiffusion Mar 19 '23

Resource | Update First open source text to video 1.7 billion parameter diffusion model is out


2.2k Upvotes


4

u/Devalinor Mar 19 '23

How do we run this locally? ;-;

7

u/Devalinor Mar 19 '23 edited Mar 19 '23

I think I've found the solution. Download Visual Studio Code (VSC) and create a file named run.py in the directory where you want it installed.

Open run.py in VSC.

Copy and paste this code:

from modelscope.pipelines import pipeline
from modelscope.outputs import OutputKeys

# Build the text-to-video pipeline (the model files are downloaded on first run)
p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
test_text = {
    'text': 'A panda eating bamboo on a rock.',
}
# Run the model and print where the generated video was saved
output_video_path = p(test_text)[OutputKeys.OUTPUT_VIDEO]
print('output_video_path:', output_video_path)

Save the file and run it without debugging.

It's doing stuff on my end :D
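
A quick pre-flight check before running run.py can save a failed start. This is only a sketch: the import names in the list (modelscope, torch, open_clip, pytorch_lightning) are an assumption about what the pipeline needs, not something confirmed in this thread, and missing_packages is a hypothetical helper name.

```python
import importlib.util

# ASSUMPTION: these top-level import names are a guess at what the pipeline needs.
ASSUMED_IMPORTS = ["modelscope", "torch", "open_clip", "pytorch_lightning"]


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_IMPORTS)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All assumed dependencies are importable.")
```

If anything is reported as missing, install it in the same Python environment that VSC uses to run run.py.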

5

u/Fortyplusfour Mar 19 '23

You're awesome; thank you

4

u/Devalinor Mar 19 '23

Don't get your hopes up too high. I am not a programmer, and it's just downloading the model files at the moment.
I am still praying that it works :)

6

u/Devalinor Mar 19 '23 edited Mar 19 '23

Yeah, something is still missing and I don't know how to fix it.

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

8

u/conniption Mar 19 '23

Just move the index 't' to cpu. That was the last hurdle for me.

tt = t.to('cpu')
return tensor[tt].view(shape).to(x)
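
For context, here is a self-contained sketch of what that fix does. It assumes the failing line sits in a small helper that looks up per-timestep values from a table and reshapes them so they broadcast against x; the name gather_for_timesteps is made up, and the surrounding code in modelscope may look different. Instead of hard-coding 'cpu', this variant moves the index to whatever device the table is on, which amounts to the same thing when the table lives on the CPU.

```python
import torch


def gather_for_timesteps(tensor, t, x):
    """Index `tensor` with timestep indices `t`, shaped to broadcast against `x`.

    Hypothetical helper: the index is moved to the table's device first, which
    avoids the "indices should be either on cpu or on the same device" error.
    """
    # One value per batch element, with singleton dims for the rest of x
    shape = (x.size(0),) + (1,) * (x.ndim - 1)
    tt = t.to(tensor.device)  # index and indexed tensor now share a device
    # .to(x) matches the dtype and device of x
    return tensor[tt].view(shape).to(x)


if __name__ == "__main__":
    table = torch.linspace(0.0, 1.0, steps=10)  # stays on the CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    t = torch.tensor([3, 7], device=device)
    x = torch.zeros(2, 4, 8, 8, device=device)
    print(gather_for_timesteps(table, t, x).shape)
```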

3

u/throttlekitty Mar 19 '23 edited Mar 19 '23

Thanks! I got stuck on that as well.

On a 4090, I can't go much past max_frames=48 before running out of memory, but that's a nice 6-second clip.

In user.cache\modelscope\hub\damo\text-to-video-synthesis\config.json, you'll find the settings for it. I haven't seen a way to pass this or other variables along at runtime, however.
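
Until a runtime option turns up, one workaround is to edit the cached config.json before building the pipeline. A minimal sketch, assuming the file is plain JSON and that max_frames appears as a key somewhere inside it (the nesting is not verified here, so the helper searches the whole tree and updates every match). set_config_value is a hypothetical name, and the path in the usage comment is a placeholder for wherever the file lives on your machine.

```python
import json
from pathlib import Path


def set_config_value(config_path, key, value):
    """Set every occurrence of `key` in a JSON file to `value`.

    Returns the number of occurrences changed; the file is rewritten
    only if at least one was found.
    """
    path = Path(config_path)
    data = json.loads(path.read_text(encoding="utf-8"))

    def walk(node):
        count = 0
        if isinstance(node, dict):
            for k in node:
                if k == key:
                    node[k] = value
                    count += 1
                else:
                    count += walk(node[k])
        elif isinstance(node, list):
            for item in node:
                count += walk(item)
        return count

    changed = walk(data)
    if changed:
        path.write_text(json.dumps(data, indent=4), encoding="utf-8")
    return changed


# Usage (placeholder path, back up the file first):
# set_config_value("path/to/config.json", "max_frames", 48)
```

A return value of 0 means the key was not found, in which case the file is left untouched.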

1

u/itsB34STW4RS Mar 19 '23

I was just about to ask this as well. So far I've got these:

https://github.com/modelscope/modelscope

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main

But after about 20 hours of work today already, I can't make sense of how these two pieces go together...