8/9/2025, 9:33:57 AM
>Trying to calculate how many samples I need to conclude that a new embedding/VAE/text encoder/etc. makes gens of comparable or better quality than not using it at least 65% of the time (arbitrary threshold I picked to deem some method/model worthwhile for making slop).
>With the bog-standard 95% confidence level and 5% margin of error I need THREE HUNDRED FUCKING FIFTY sample pairs to conclude that with reasonable confidence.
>Even lowering the confidence level to an unusually low 85% and increasing the margin of error to a sloppy 10%, realistically as far as I can stretch before the whole experiment becomes borderline worthless, I need 48 sample pairs to test any random embedding, VAE, prompting method or whatever I see on Civitai.
So how do you test stuff?
I tried the "just roll with a dozen samples and call it a day" approach in the past, and it actually misled me into doing some BS that I later realized was not in fact making gens better. That's why I think more rigorous testing is necessary, though it demands more time and effort than this $0/h hobby warrants, I think.
I am in a conundrum.
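For reference, the 350 and 48 figures above are what you get from the usual normal-approximation sample size formula for a proportion, n = z^2 * p * (1 - p) / e^2, rounded up. A minimal sketch, assuming that is the formula being used; the function name `pairs_needed` and its parameters are made up for illustration:

```python
from math import ceil
from statistics import NormalDist


def pairs_needed(p: float, confidence: float, margin: float) -> int:
    """Sample pairs needed to estimate a win rate p within +/- margin.

    Assumes the normal-approximation formula n = z^2 * p * (1 - p) / margin^2
    with a two-sided confidence level (hypothetical helper, not from the post).
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z * z * p * (1 - p) / margin ** 2)


print(pairs_needed(0.65, 0.95, 0.05))  # 350
print(pairs_needed(0.65, 0.85, 0.10))  # 48
```

The margin is squared in the denominator, so doubling it cuts the required pairs to roughly a quarter, which is where most of the drop from 350 to 48 comes from.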