>>11409716
thanks. unfortunately the answer is effort. it requires a lot of inpainting in krita on top of a pretty large workflow for the base gens.
currently my base workflow uses the rouwei t5 gemma TE (text encoder) instead of clip. it's supposed to have better prompt comprehension but is honestly pretty hit or miss and knows fewer styles overall, so i would still recommend regular clip for most people since it's much less of a hassle to set up.
for krita inpaints i mainly use noobai checkpoints and some shitmixes when i want to trade style for detail. i just copy over the same positive and negative prompt from my workflow and remove the parts that don't apply to the area being inpainted (for example remove feet tags when inpainting hands). i also had to learn to use the paintbrush because it seems raw dogging inpaints can't fix everything. for example i had to manually redraw some of the cars and her entire hind leg to make the pose make sense, then run refining passes to get the style correct.
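if you'd rather script the prompt trimming than do it by hand, here is a rough sketch. `drop_tags` is a made-up helper and the example tags are placeholders, not my actual prompt. it assumes a comma separated tag prompt and uses plain substring matching, so a short keyword can catch more tags than you meant.

```python
def drop_tags(prompt, unwanted):
    """Return the prompt without any tag that contains an unwanted keyword.

    Hypothetical helper: assumes tags are separated by commas and matches
    keywords as case-insensitive substrings.
    """
    unwanted_lower = [word.lower() for word in unwanted]
    kept = []
    for tag in prompt.split(","):
        tag = tag.strip()
        if not tag:
            continue  # skip empty entries left by trailing or doubled commas
        if any(word in tag.lower() for word in unwanted_lower):
            continue  # this tag describes something outside the inpaint area
        kept.append(tag)
    return ", ".join(kept)


# placeholder prompt, trimmed for a hands-only inpaint
full_prompt = "1girl, barefoot, toes, detailed hands, night city"
hands_prompt = drop_tags(full_prompt, ["foot", "feet", "toes"])
print(hands_prompt)
```

the same call works for the negative prompt.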
the cool thing with krita is that you can set the refining to happen at up to 1.5x native resolution: it upscales that part, refines it, then downscales it again, so you get way less visible vae artifacts. the other cool thing is that lower refining strength requires fewer steps. i ended up copying the same logic into my hiresfix subgraphs.
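the arithmetic behind that looks roughly like the sketch below. this is my own approximation, not code from the krita plugin: `refine_plan`, the 20 step baseline, the 4 step floor and the rounding to multiples of 8 are all assumptions. the step scaling follows the usual img2img convention where only about strength x total steps actually get sampled.

```python
import math


def refine_plan(width, height, strength, scale=1.5,
                base_steps=20, min_steps=4, multiple=8):
    """Plan an upscale -> refine -> downscale pass for one region.

    Hypothetical helper: every default here is an assumed value,
    not a setting read from krita or comfyui.
    """
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must be in (0, 1]")
    # keep the scale between 1x and the 1.5x cap
    scale = max(1.0, min(scale, 1.5))
    # round the working size up to a multiple of 8 so it maps cleanly to latents
    up_w = math.ceil(width * scale / multiple) * multiple
    up_h = math.ceil(height * scale / multiple) * multiple
    # lower strength means less noise to remove, so fewer steps are needed
    steps = max(min_steps, round(base_steps * strength))
    return {
        "refine_size": (up_w, up_h),     # resolution the region is refined at
        "output_size": (width, height),  # downscaled back to this afterwards
        "steps": steps,
    }


plan = refine_plan(512, 768, strength=0.4)
print(plan)
```

with these assumed defaults, a 512x768 region at 0.4 strength gets refined at 768x1152 in 8 steps and is then scaled back down to 512x768.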
for reference here is the base comfyui workflow: https://files.catbox.moe/n3tl1v.webp
and here is the krita file with all the inpaints: https://files.catbox.moe/pryuyg.kra