>>8865242
It can do it, but it's not perfect, as in it doesn't get it in one shot. Here is the prompt:
https://files.catbox.moe/exs0d4.png
With some prompt engineering, you can probably get it more consistently, see
https://files.catbox.moe/8q4wgr.png
That is what I got after testing a prompt generated from a stock image with a VLM, plus some extra prompt engineering.
Success rate is probably higher with plain selfies than with dynamic images.
Compared to Flux it's a lot more flexible with gestures though: with some engineering you can do middle fingers, the "OK" sign, the "L" hand gesture, etc.