fp8 e4m3fn is garbage doodoo caca and should never be used ever under any circumstance. Q8_0 all the way: e4m3fn only keeps 3 mantissa bits per weight, while Q8_0 keeps a full signed int8 per weight plus a shared fp16 scale for every block of 32 weights
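
a minimal sketch of why, assuming gaussian-ish weights: round-trip both formats in pure PyTorch and compare the error. the Q8_0 block layout below (32 weights per block, scale = absmax/127) follows the GGUF spec, but the numbers are illustrative, not a substitute for the rentry:

```python
import torch

torch.manual_seed(0)
w = torch.randn(1024, 1024)  # stand-in for a weight tensor

# fp8 e4m3fn round trip: cast down, cast back up
w_fp8 = w.to(torch.float8_e4m3fn).to(torch.float32)

# Q8_0-style round trip: blocks of 32, one scale per block, int8 payload
blocks = w.reshape(-1, 32)
scale = (blocks.abs().amax(dim=1, keepdim=True) / 127.0).clamp_min(1e-12)
q = (blocks / scale).round().clamp(-127, 127)
w_q8 = (q * scale).reshape(w.shape)

print("fp8 e4m3fn rmse:", (w - w_fp8).pow(2).mean().sqrt().item())
print("Q8_0-style rmse:", (w - w_q8).pow(2).mean().sqrt().item())
```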

the fact that OP recommends it by default by linking https://comfyanonymous.github.io/ComfyUI_examples/wan22/ is stinky

the fact that we use the fp8 scaled checkpoint for the WAN model itself at all instead of a Q8_0 GGUF is stinky. i will make an opinionated t2v guide shortly

bonus slop: https://rentry.org/QUANTIZATION_ANALYSIS


oh and Q8_0 and the fp8 quants still use the full vram of fp16 at compute time, since the weights get "unpacked" back into fp16 before the matmuls run. read the rentry to understand why
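
a hedged sketch of what that "unpacking" looks like, assuming a toy Q8_0-packed linear layer — the names here (dequant_q8_0, the block shapes) are illustrative, not ComfyUI-GGUF's actual internals:

```python
import torch

def dequant_q8_0(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # q: (n_blocks, 32) int8 payload, scale: (n_blocks, 1) fp16 scale
    return q.to(torch.float16) * scale  # full-size fp16 tensor materializes here

# toy layer: a 64x64 weight stored as Q8_0
w16 = torch.randn(64, 64, dtype=torch.float16)
blocks = w16.float().reshape(-1, 32)
scale = (blocks.abs().amax(dim=1, keepdim=True) / 127.0).half()
q = (blocks / scale.float()).round().clamp(-127, 127).to(torch.int8)

x = torch.randn(1, 64, dtype=torch.float16)
w_unpacked = dequant_q8_0(q, scale).reshape(64, 64)  # fp16 again, int8 savings gone
y = x @ w_unpacked.T  # the matmul only ever sees plain fp16 weights
```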