Search Results
6/29/2025, 9:29:18 PM
>>105746664
>There shouldn't be any difference because Q8 GGUF is 8 bit floating point, just different format.
another retarded take from debo
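For reference, a rough sketch of why Q8 GGUF isn't 8-bit float. As far as I know, ggml's Q8_0 layout stores blocks of 32 signed 8-bit integers plus one fp16 scale per block, which is integer quantization and not an FP8 format. The helper names below (quantize_q8_0, dequantize_q8_0, to_fp16) are made up for illustration and are not the llama.cpp API.

```python
import struct

BLOCK = 32  # weights per block (assumption: matches ggml's Q8_0 block size)

def to_fp16(x):
    # Round a Python float to the nearest half-precision value.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def quantize_q8_0(weights):
    """Return a list of (fp16 scale, list of int8 codes), one entry per block."""
    assert len(weights) % BLOCK == 0
    blocks = []
    for i in range(0, len(weights), BLOCK):
        chunk = weights[i:i + BLOCK]
        amax = max(abs(w) for w in chunk)
        scale = to_fp16(amax / 127.0)  # one shared scale per block
        if scale == 0.0:
            codes = [0] * BLOCK
        else:
            # Integer codes clamped to the symmetric int8 range.
            codes = [max(-127, min(127, round(w / scale))) for w in chunk]
        blocks.append((scale, codes))
    return blocks

def dequantize_q8_0(blocks):
    # Each weight is reconstructed as scale * integer code.
    return [scale * q for scale, codes in blocks for q in codes]

# Storage cost: 32 one-byte codes plus a 2-byte scale per block.
BITS_PER_WEIGHT = (BLOCK * 8 + 16) / BLOCK
```

Every weight in a block shares one scale, so the representable values are evenly spaced within that block. An 8-bit float has its own exponent bits per value, so its spacing is not uniform.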
6/29/2025, 4:56:19 PM
6/28/2025, 8:24:55 AM
>>105731041
>what would be the best way to run it with minimal (preferably no) quality loss?
Q8 is really close to bf16, and it only uses ~15 GB of memory during inference
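Back-of-the-envelope for the weight memory, assuming the Q8 in question is ggml's Q8_0 layout (32 int8 codes plus one fp16 scale per block, so 8.5 bits per weight). The function name and the 1B parameter count are placeholders, not figures from the thread, and this counts weights only, not activations or other buffers.

```python
def weight_bytes(n_params, bits_per_weight):
    # Bytes needed to store the weights alone.
    return n_params * bits_per_weight / 8

Q8_0_BPW = (32 * 8 + 16) / 32  # 8.5 bits per weight (assumed Q8_0 layout)
BF16_BPW = 16

# Placeholder size: per one billion parameters.
per_billion_q8 = weight_bytes(1_000_000_000, Q8_0_BPW)
per_billion_bf16 = weight_bytes(1_000_000_000, BF16_BPW)
ratio = per_billion_q8 / per_billion_bf16
```

Under that assumption, Q8_0 weights take a bit over half the space of bf16 weights.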
6/19/2025, 1:42:05 AM
>>105635932
>Because Q8 is negligible quality difference to the full weights while allowing me to generate 8 images at a time easily, speeding up the generation by granting a free 1 image for each 8 time wise
what? Q8 and bf16 have the same speed though?
6/16/2025, 1:09:26 AM