>>106323162
>>106323222
I'm doing this. It's easiest if you remove the background from your starting image, replace it with a green screen, and then chroma key the resulting frames. Rembg and other background removal tools aren't consistent enough across huge sets of frames, so some are gonna come out messed up.
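Rough sketch of the chroma key step, in case it helps. This is plain Python on a tiny fake frame just to show the idea, not what any particular editor or node does internally. KEY_COLOR, TOLERANCE, and the function names are all made up for the example, and a real pipeline would use an image library and probably a softer edge than this hard cutoff.

```python
# Minimal chroma key sketch. A frame is a list of rows of (r, g, b) tuples.
# KEY_COLOR and TOLERANCE are assumed values, tune them for your footage.
KEY_COLOR = (0, 255, 0)
TOLERANCE = 120  # max Euclidean RGB distance still treated as background


def is_background(pixel, key=KEY_COLOR, tolerance=TOLERANCE):
    """Return True if the pixel is close enough to the key color."""
    dist_sq = sum((c - k) ** 2 for c, k in zip(pixel, key))
    return dist_sq <= tolerance ** 2


def chroma_key(frame, background):
    """Swap green-screen pixels in frame for the matching background pixels."""
    return [
        [bg if is_background(px) else px for px, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]


# Tiny 2x2 example: three greenish pixels and one foreground pixel.
frame = [
    [(0, 255, 0), (200, 150, 120)],
    [(10, 240, 15), (0, 255, 0)],
]
background = [[(30, 30, 60)] * 2 for _ in range(2)]
keyed = chroma_key(frame, background)
```

Since the green is a flat known color in every frame, the same threshold works across the whole set, which is the consistency you don't get from running a segmentation model per frame.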
Another trick I mentioned in a past post: create the motion you want, run YOLO face detection on all the frames to generate a mask video, then run that mask video + the motion video through a VACE + Multitalk workflow at like 0.6-0.7 strength to do the lip syncing.
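For the mask video part, here's a sketch of turning per-frame face boxes into black/white mask frames. It assumes you already have the detector output as (x1, y1, x2, y2) pixel boxes per frame. The detector call itself isn't shown, and boxes_to_mask, pad, and the sample boxes are hypothetical names and values for illustration only.

```python
def boxes_to_mask(width, height, boxes, pad=0):
    """Build one mask frame: 255 inside each padded face box, 0 elsewhere.

    boxes is a list of (x1, y1, x2, y2) pixel coordinates, assumed to come
    from whatever face detector you ran on this frame.
    """
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in boxes:
        # Grow the box by pad pixels and clamp it to the frame bounds.
        left, top = max(0, x1 - pad), max(0, y1 - pad)
        right, bottom = min(width, x2 + pad), min(height, y2 + pad)
        for y in range(top, bottom):
            for x in range(left, right):
                mask[y][x] = 255
    return mask


# Hypothetical detections for two 6x4 frames: one face, then no face.
detections_per_frame = [[(2, 1, 4, 3)], []]
mask_frames = [
    boxes_to_mask(6, 4, boxes, pad=1) for boxes in detections_per_frame
]
```

A bit of padding around the box keeps the jaw and chin inside the mask. Frames with no detection come out fully black, so you might want to reuse the previous frame's box instead of leaving a gap.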