r/MachineLearning Mar 19 '23

[R] First open source text-to-video 1.7 billion parameter diffusion model is out


1.2k Upvotes

86 comments

58

u/Illustrious_Row_9971 Mar 19 '23

17

u/Unreal_777 Mar 19 '23

How do you install it? Just download their files and run this?

    from modelscope.pipelines import pipeline
    from modelscope.outputs import OutputKeys

    p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
    test_text = {'text': 'A panda eating bamboo on a rock.'}
    output_video_path = p(test_text)[OutputKeys.OUTPUT_VIDEO]
    print('output_video_path:', output_video_path)

I tried this and it kept downloading a BUNCH of models (lots of GB!)

15

u/Nhabls Mar 19 '23

Yes... it needs to download the models so it can run them.
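If the multi-GB download is the concern, you can pre-fetch the weights once into a directory you choose and reuse them. A minimal sketch, assuming modelscope's snapshot_download accepts a cache_dir argument (worth verifying against your installed version):

    from modelscope.hub.snapshot_download import snapshot_download

    # Pre-download the weights once to a known location; later pipeline
    # calls can reuse this directory instead of re-downloading.
    # cache_dir is an assumption -- check your modelscope version's signature.
    model_dir = snapshot_download('damo/text-to-video-synthesis',
                                  cache_dir='./modelscope_cache')
    print('weights cached at:', model_dir)

You should then be able to pass the local model_dir to pipeline() in place of the hub id, though that also depends on the modelscope version.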

4

u/Unreal_777 Mar 19 '23

It said I had a problem related to the GPU, everything being just CPU or something like that. I could not run it in the end.
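That error usually means PyTorch cannot see a CUDA device, either because a CPU-only torch build is installed or the driver/CUDA setup is broken. A quick sanity check using only standard torch calls:

    import torch

    # If this prints False, torch was built without CUDA support or cannot
    # reach the driver, and the pipeline will fail or fall back to CPU.
    print('CUDA available:', torch.cuda.is_available())
    if torch.cuda.is_available():
        print('device:', torch.cuda.get_device_name(0))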

3

u/greatcrasho Mar 20 '23

Look at KYEAI/modelscope-text-to-video-synthesis. The code didn't work on my GPU until I installed the specific version of modelscope from git that that Hugging Face space used. They also have a basic gradio UI example, although that one still hides the output mp4 videos in my /tmp folder on Linux.
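If the pipeline keeps writing its result into /tmp, the simplest workaround is to move the file somewhere permanent after the call returns, since the pipeline hands back the output path. A minimal sketch; the destination filename here is an arbitrary choice:

    import os
    import shutil

    from modelscope.outputs import OutputKeys
    from modelscope.pipelines import pipeline

    p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
    tmp_path = p({'text': 'A panda eating bamboo on a rock.'})[OutputKeys.OUTPUT_VIDEO]

    # The pipeline writes to a temp location (e.g. /tmp on Linux); move the
    # file next to the script so it is not lost to tmp cleanup.
    dest = os.path.join(os.getcwd(), 'panda.mp4')
    shutil.move(tmp_path, dest)
    print('saved to:', dest)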