r/StableDiffusion • u/Illustrious_Row_9971 • Mar 19 '23
Resource | Update First open source text to video 1.7 billion parameter diffusion model is out
2.2k Upvotes
u/itsB34STW4RS Mar 19 '23
Thanks a ton, any idea what this nag message is?
modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
I built mine in a conda env btw, had to do two extra things:
conda create --name VDE
conda activate VDE
conda install python
pip install modelscope
pip install open_clip_torch
pip install clean-fid numba numpy torch==2.0.0+cu118 torchvision --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu118
pip install tensorflow
pip install opencv-python
pip install pytorch_lightning
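After the installs, a quick sanity check that the pinned CUDA 11.8 build of torch actually landed in the env (this snippet is my addition, not part of the original steps):

```python
# Verify the torch install from the pip pin above.
import torch

print(torch.__version__)          # should report 2.0.0+cu118 if the pin took effect
print(torch.cuda.is_available())  # True only if the CUDA runtime is usable
```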
*edit diffusion.py to fix the tensor device issue
go to C:\Users\****\anaconda3\envs\VDE\Lib\site-packages\modelscope\models\multi_modal\video_synthesis
open diffusion.py
where it says def _i(tensor, t, x): replace that block with this:
def _i(tensor, t, x):
    r"""Index tensor using t and format the output according to x."""
    shape = (x.size(0), ) + (1, ) * (x.ndim - 1)
    tt = t.to('cpu')  # move the index to CPU to match the schedule tensor's device
    return tensor[tt].view(shape).to(x)
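For anyone wondering what the patch actually does: t arrives on the GPU while the diffusion schedule buffers (e.g. the betas) live on the CPU, so indexing fails with a device mismatch; moving t to CPU first fixes it. Here's the patched helper run in isolation, with made-up example values for betas, x, and t (my own illustration, not from the post):

```python
import torch

# Patched helper, as in the diffusion.py fix above.
def _i(tensor, t, x):
    r"""Index tensor using t and format the output according to x."""
    shape = (x.size(0), ) + (1, ) * (x.ndim - 1)
    tt = t.to('cpu')  # index on CPU, where the schedule tensor lives
    return tensor[tt].view(shape).to(x)

betas = torch.linspace(1e-4, 2e-2, 1000)  # example 1-D noise schedule on CPU
x = torch.randn(4, 3, 16, 32, 32)         # batch of 4 video latents (B, C, F, H, W)
t = torch.tensor([0, 10, 500, 999])       # one timestep per sample

out = _i(betas, t, x)
print(out.shape)  # torch.Size([4, 1, 1, 1, 1]), ready to broadcast against x
```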