Longer videos and textual inversions and fp16 autocast #25
dajes wants to merge 4 commits into guoyww:main
Conversation
Thank you dajes for your contribution! I've tested the fp16 autocast on my 4090 and the speedup is almost 4x: from 55 s per GIF down to 15 s per GIF.
Any chance you would be interested in figuring out how to add embeddings or the context stride to this repo? https://github.com/neggles/animatediff-cli
Actually, I was able to get it to work, never mind!
Just a reminder for users looking at this who have an old Maxwell card like the Tesla M40: fp16 mode actually causes a 3x slowdown instead of a 3x speedup, so use fp32 on Maxwell cards. Maxwell doesn't have dedicated fp16 hardware. Found this out the hard way, haha.
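Not from this PR, but one hedged way to guard against that: Maxwell GPUs report CUDA compute capability 5.x, so checking the capability before enabling fp16 avoids the slowdown. Requiring 7.0+ (Volta and newer, which have tensor cores) is a conservative cutoff.

```python
import torch

# Maxwell reports compute capability 5.x and lacks fast fp16 hardware.
major, _minor = torch.cuda.get_device_capability()
use_fp16 = major >= 7  # conservative: Volta (7.0) and newer
```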
Is this PR merged anywhere?
AFAIK this technique is used in https://github.com/neggles/animatediff-cli and https://github.com/magic-research/magic-animate
Added the ability to choose the length of the video and the size of the temporal attention module's context separately. By using a sliding window of attention, it is now possible to generate arbitrarily long GIFs.
Sliding-window parameters:

- `--L`: the length of the generated animation.
- `--context_length`: the length of the sliding window (limited by the motion module's capacity); defaults to `L`.
- `--context_overlap`: how much neighbouring contexts overlap; defaults to `context_length / 2`.
- `--context_stride`: `2^context_stride` is the maximum stride between two neighbouring frames; defaults to 0.

A rough sketch of how such windows could be generated is shown after this list.
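The following is illustrative, not the PR's actual code: it yields overlapping, optionally strided, lists of frame indices, each of which becomes one temporal-attention context. The function name `sliding_contexts` is hypothetical.

```python
def sliding_contexts(video_length, context_length=16,
                     context_overlap=8, context_stride=0):
    """Yield lists of frame indices; each list is one temporal-attention context."""
    for s in range(context_stride + 1):
        step = 2 ** s                 # stride between neighbouring frames in a window
        span = context_length * step  # total frames covered by one window
        start = 0
        while start < video_length:
            # Wrap indices around the end so every window is exactly context_length long.
            yield [(start + i * step) % video_length for i in range(context_length)]
            if start + span >= video_length:
                break
            start += (context_length - context_overlap) * step

# Example: 32 frames covered by windows of 16 that overlap by 8.
for ctx in sliding_contexts(32, context_length=16, context_overlap=8):
    print(ctx)
```

In schemes like this, predictions for frames that appear in several windows are typically averaged over the overlap region, which is what keeps neighbouring contexts consistent with each other.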
Added support for `.pt` textual inversions from civit.ai, which should be put in the `models/embeddings` directory. I'm not entirely sure this implementation is fully correct, but it works fine for me. A sketch of the usual loading pattern is shown below.
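This is not the PR's implementation, just a sketch of the common pattern for A1111-style `.pt` embeddings, which usually store their vectors under a `string_to_param` key; the function name and trigger-token scheme are assumptions.

```python
import torch

def load_pt_embedding(path, trigger, tokenizer, text_encoder):
    """Register an A1111-style .pt textual inversion under `trigger`."""
    data = torch.load(path, map_location="cpu")
    # Multi-vector embeddings have shape (n_vectors, embedding_dim).
    vectors = next(iter(data["string_to_param"].values()))
    tokens = [trigger if i == 0 else f"{trigger}_{i}"
              for i in range(vectors.shape[0])]
    tokenizer.add_tokens(tokens)
    text_encoder.resize_token_embeddings(len(tokenizer))
    token_ids = tokenizer.convert_tokens_to_ids(tokens)
    with torch.no_grad():
        weight = text_encoder.get_input_embeddings().weight
        for token_id, vector in zip(token_ids, vectors):
            weight[token_id] = vector
    # Splice this string into the prompt wherever the concept is wanted.
    return " ".join(tokens)
```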
Inference now automatically uses `torch.autocast` to run in fp16 unless `--fp32` is specified. It sped things up by 100% (about 2x) in my tests.
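A minimal sketch of that behaviour; `run_inference` and the `pipeline` call are placeholders, and only `torch.autocast` and the `--fp32` flag come from the PR description.

```python
import torch

def run_inference(pipeline, prompt, fp32=False):
    # Autocast runs matmuls and convolutions in fp16 on CUDA;
    # enabled=False falls back to full fp32 (see the Maxwell caveat above).
    with torch.autocast("cuda", dtype=torch.float16, enabled=not fp32):
        return pipeline(prompt)
```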