Science goes on!
Trying to optimize 720p. I'm on a lowly 5060 Ti with 16GB VRAM.
Same setup each gen, only changed model and blockswaps.
>Q5 GGUF is 12.7GB, takes 25 blockswaps to work
>Q8 GGUF is 18.1GB, takes 35 blockswaps to work
Gens are roughly 30% faster with the Q5, and I can't see any difference in the output, other than that little strappy doohickey hanging from the ceiling moving differently.
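
For anyone repeating this, "30% faster" can mean two different things, so here's a rough sketch of how the comparison could be computed. The function name and the timings are placeholders I made up for illustration, not measurements from these runs; plug in your own seconds per gen with the same seed and settings.

```python
def speed_comparison(slow_seconds: float, fast_seconds: float) -> dict:
    """Compare two generation times for the same prompt, seed, and settings."""
    if slow_seconds <= 0 or fast_seconds <= 0:
        raise ValueError("timings must be positive")
    return {
        # How much shorter the fast run is, as a share of the slow run.
        "time_saved_pct": (slow_seconds - fast_seconds) / slow_seconds * 100,
        # How much more work per second the fast run gets done.
        "throughput_gain_pct": (slow_seconds / fast_seconds - 1) * 100,
    }


# Placeholder timings (hypothetical, NOT from the runs above).
result = speed_comparison(slow_seconds=1000.0, fast_seconds=700.0)
print(result)
```

With those placeholder numbers the gen takes 30% less time, which works out to about 43% more throughput, so it's worth saying which one you mean when posting results.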