8/7/2025, 2:54:08 PM
>unsloth Q2_K, offloading about 25 layers to the 5090, rest on DDR5, 42k context with q8 cache on oobabooga
>1.4 tokens per second after painfully processing 25k tokens of context
>switch to iq4_kss from ubergarm + ik_llama.cpp, bump context to 64k, 20 layers on GPU, same context batch size as oobabooga
>same exact prompt now runs at 4.3 tokens per second after quickly processing context
>will probably get it faster with q3 and playing with command flags
ik_llama.cpp gods... I kneel...
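The ik_llama.cpp setup described above could look roughly like the launch below. This is a hedged sketch, not the poster's actual command: the model filename is a placeholder, and the flags assume ik_llama.cpp follows the usual llama.cpp conventions (`-ngl` for GPU layers, `-c` for context, `-ctk`/`-ctv` for KV-cache quantization).

```shell
# Hypothetical ik_llama.cpp server launch approximating the post's settings.
# The .gguf path is a placeholder; substitute the actual ubergarm quant file.
./llama-server \
  -m ./model-IQ4_KSS.gguf \
  -ngl 20 \
  -c 65536 \
  -ctk q8_0 -ctv q8_0

# -ngl 20        : offload 20 layers to the 5090, keep the rest in system RAM
# -c 65536       : 64k context window
# -ctk/-ctv q8_0 : q8-quantized KV cache, matching the oobabooga run
```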