>>719166773
My GPU is an RTX 5070 Ti.
A 480p gen like that takes about 100 seconds for a 5-second video, or 120 seconds for a 6-second video. Lowering the width makes it even faster (640x480 takes about 70 seconds to complete).
Right now I am generating 5-second videos at 720x832. They take about 175 seconds to complete.
>and does it often take much reworking to get the output to play ball?
It depends on the input image. These are Image2Video generations, and the more complicated the image or text prompt, the less likely you are to get a good result.
That being said, the new wan2.2 model was a HUGE upgrade in proompt adherence. In 2.1, it was virtually impossible to get certain things working well (like declothing/stripping). In 2.2 stuff like that is easy.
I still end up generating several (sometimes dozens of) videos until I get a result I like.
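If you want to ballpark what that means in wall-clock time, here is a rough sketch using only the timings quoted above. The attempt counts (6, 12, 24) are made-up examples standing in for "several" and "dozens", and the helper names are just for illustration. It also assumes every attempt takes the same time, which will not be exactly true in practice.

```python
# Timings quoted in this post: (label, generation seconds, video seconds)
timings = [
    ("480p, 5 s clip", 100, 5),
    ("480p, 6 s clip", 120, 6),
    ("720x832, 5 s clip", 175, 5),
]


def gen_seconds_per_video_second(gen_s, video_s):
    """Seconds of generation time spent per second of output video."""
    return gen_s / video_s


def batch_minutes(gen_s, attempts):
    """Total minutes for a number of attempts, assuming each takes gen_s seconds."""
    return gen_s * attempts / 60


for label, gen_s, video_s in timings:
    rate = gen_seconds_per_video_second(gen_s, video_s)
    print(f"{label}: {rate:.0f} s of generation per second of video")

# Hypothetical attempt counts, chosen only for illustration
for attempts in (6, 12, 24):
    print(f"{attempts} tries at 720x832: {batch_minutes(175, attempts):.1f} min")
```

So at 720x832, a dozen tries is roughly 35 minutes of GPU time, and two dozen is a bit over an hour.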