>>106400394
The chunk of video that isn't being altered would be part of the context when it performs the inpainting. It's similar to image inpainting: if you, say, inpaint one hand while making sure the other hand is in the part of the image that gets converted to latent space, it's much more likely to keep the nail polish consistent on both hands.

If you can do this, then couldn't you, say, inpaint a second at a time to grow the video? Always pass in five seconds of video with only the last second being inpainted, then cut a second off the front for the next pass. And then update the prompt as you go to gradually steer the video where you want it to go, so it doesn't drift randomly. Seems like it'd be painstaking, but I don't know why it wouldn't work.
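
Rough sketch of the loop I mean, just to make the windowing concrete. Everything here is made up for illustration: `inpaint_fn` is a hypothetical stand-in for whatever model or workflow actually does the video inpainting (not a real API), and the fps and window sizes are placeholders.

```python
from typing import Any, Callable, List, Sequence

# Hypothetical signature: (window_frames, mask, prompt) -> frames of the same length.
# mask[i] is True for frames the model should generate, False for frames to keep.
InpaintFn = Callable[[List[Any], List[bool], str], List[Any]]


def extend_video(
    frames: Sequence[Any],
    prompts: Sequence[str],
    inpaint_fn: InpaintFn,
    fps: int = 24,
    window_s: int = 5,
    new_s: int = 1,
) -> List[Any]:
    """Grow a video by new_s seconds per prompt, using a sliding window."""
    ctx_len = (window_s - new_s) * fps  # frames kept as unaltered context
    new_len = new_s * fps               # frames generated on each pass
    if len(frames) < ctx_len:
        raise ValueError("need at least (window_s - new_s) seconds of video to start")

    out = list(frames)
    for prompt in prompts:  # one prompt per new second, so you can steer as you go
        # Taking only the tail is the "cut a second off the front" step.
        context = out[-ctx_len:]
        window = context + [None] * new_len          # None = empty slot to fill
        mask = [False] * ctx_len + [True] * new_len  # only the last second is masked
        result = inpaint_fn(window, mask, prompt)
        if len(result) != len(window):
            raise ValueError("inpaint_fn must return as many frames as it was given")
        out.extend(result[-new_len:])                # keep only the newly made frames
    return out
```

You'd call it with the starting clip and a list of prompts, one per second you want to add. Whether a given model actually accepts a window like this is an assumption on my part.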