>>8967702
There are two parts to this, and both are made to sound much better than they are in practice.
First part is the automatic masking. Cool idea, but you can draw the mask yourself in fifteen seconds without needing a 20-page tutorial. Look at what was actually achieved: it only swapped one dress for another in a different color but a very similar cut and the same shape. If you wanted to give her jeans and a shirt instead, or make her nude, it would sometimes fail to blend with the rest of the image. And yes, we can already do this with regular inpainting, picrel.
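To show how trivial the manual route is, here's a sketch of drawing a mask by hand with Pillow. The image size and box coordinates are made up for illustration; white = repaint, black = keep, which is the convention standard inpainting pipelines expect.

```python
from PIL import Image, ImageDraw

# Hand-drawn mask stand-in: white = region to repaint, black = keep as-is.
# 512x512 canvas and the box coordinates are arbitrary placeholders.
mask = Image.new("L", (512, 512), 0)            # start all-black (keep everything)
draw = ImageDraw.Draw(mask)
draw.rectangle([128, 200, 384, 500], fill=255)  # white box roughly over the dress

mask.save("dress_mask.png")
```

You then hand this file to whatever inpainting setup you already use (e.g. as the `mask_image` input in a typical diffusers-style pipeline) and you're done, no auto-masking node graph required.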
The other part is IPAdapter, using a reference image instead of prompting for what you want. In theory IPAdapter can replicate anything a zen prompt master could prompt for, if he knew all the ins and outs and little quirks of the checkpoint by heart. Information from the reference image is encoded the same way a text prompt would be. Look at the dress in her result: it's not the same color or pattern, just similar. Now, the problem is you're also getting the shading, her specific skin tone, some pose influence, her hair... there's no way to separate out just one concept, and all that excess information makes the blending issue I mentioned much worse. It's better suited for generating whole images than for inpainting. Compare prompts in the top row to references in the bottom row.
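If it helps to see why the concepts can't be separated: the IPAdapter idea, boiled down, is that a small projection turns the reference image's whole-image embedding into a handful of extra "prompt tokens" sitting next to the text tokens. Toy numpy sketch below, with all dimensions and the random projection made up; it only illustrates the data flow, not the real trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions standing in for CLIP/SD sizes (placeholder values).
img_feat_dim, token_dim, n_img_tokens, n_txt_tokens = 1024, 768, 4, 77

# IPAdapter-style idea: a learned projection maps the reference image's
# embedding into a few extra prompt-like tokens for cross-attention.
proj = rng.standard_normal((img_feat_dim, n_img_tokens * token_dim)) * 0.02

image_embedding = rng.standard_normal(img_feat_dim)             # whole-image encoding
image_tokens = (image_embedding @ proj).reshape(n_img_tokens, token_dim)

text_tokens = rng.standard_normal((n_txt_tokens, token_dim))    # from the text encoder

# The model attends to both sets. The image tokens carry everything the
# encoder baked into that single embedding (color, shading, pose, hair),
# which is why you can't cherry-pick just the dress downstream.
conditioning = np.concatenate([text_tokens, image_tokens], axis=0)
print(conditioning.shape)  # (81, 768)
```

The whole reference gets squeezed through one embedding before the model ever sees it, so "just the dress" was never a separable thing to begin with.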