Anonymous /g/105866229#105866414
7/11/2025, 5:56:45 AM
Can anyone explain how these tensor shapes work in a VAE?
For SD 1.5 and SDXL models:
[[1, 4, 160, 106]] for the compressed latent and [[1, 1280, 848, 3]] for the output.
I fully understand the output: batch size, height, width, color channel.
As for the compressed image: 1 is the batch size, and 160 and 106 are the compressed height and width (the output resolution divided by 8). But I can't make sense of the "4". (3 colors plus alpha, maybe?)
It gets even weirder with SD3, SD3.5 and Flux, which have [[1, 16, 160, 106]]. Why did it become 16 now?
The VAEs of video models like Hunyuan and Wan have [[1, 16, 1, 160, 106]]. I assume the 1 in the middle is the frame count (I am testing with images), though it's interesting that they still output [[1, 1280, 848, 3]]. But I still don't know what the 4 and 16 are.
Also, do the "4"s in SD 1.5 and SDXL represent the same thing, considering that forcing the SD 1.5 VAE on SDXL doesn't give great results? (Pic related)
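
For reference, the shape arithmetic above can be written out as a rough sketch. It only reproduces the numbers quoted in this post; it does not load any model. The function name `latent_shape` and its parameters are made up for illustration, and treating the second axis as a per-model latent channel count (4 or 16) rather than RGB plus alpha is an assumption here, not something checked against the VAE code.

```python
def latent_shape(batch, height, width, latent_channels, downscale=8, frames=None):
    """Return the compressed (latent) shape for an image of height x width.

    Hypothetical helper: `latent_channels` is assumed to be a per-model
    setting (4 for the SD 1.5 / SDXL shapes quoted above, 16 for the
    SD3 / Flux ones), and `downscale` is the divide-by-8 factor.
    """
    if height % downscale or width % downscale:
        raise ValueError("height and width must be divisible by the downscale factor")
    h, w = height // downscale, width // downscale
    if frames is None:
        # Image VAE layout: [batch, channels, height/8, width/8]
        return [batch, latent_channels, h, w]
    # Video VAE layout: an extra frame axis sits after the channel axis
    return [batch, latent_channels, frames, h, w]


# The output tensor in the post is [1, 1280, 848, 3] (batch, height, width, RGB).
print(latent_shape(1, 1280, 848, latent_channels=4))             # [1, 4, 160, 106]
print(latent_shape(1, 1280, 848, latent_channels=16))            # [1, 16, 160, 106]
print(latent_shape(1, 1280, 848, latent_channels=16, frames=1))  # [1, 16, 1, 160, 106]
```

Note the axis order differs between the two tensors: the latent puts channels second, while the output puts the color channel last.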