7/24/2025, 9:42:26 AM
>>716294227
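A higher quant (more bits per weight, e.g. Q8 over Q4) typically means better outputs. The downside is that the higher you go, the bigger the file gets and the harder it is to fit on your GPU. Most of the time you want the biggest quant that still fits entirely in your VRAM, with a bit of room left for context, if you want it to be as fast as possible.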
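To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes weight size is about parameters × bits per weight / 8, and that you reserve some VRAM for context and buffers. The function names, the 1.5 GiB overhead, and the bits-per-weight table are illustrative assumptions, not anything Kobold reports. Check the actual file size of the model you download.

```python
# Rough sketch: pick the highest quant whose weights fit in VRAM.
# All numbers here are ballpark assumptions, not measured values.

# Approximate bits per weight for a few common GGUF quant types (assumed).
QUANTS = {"Q4_0": 4.5, "Q5_0": 5.5, "Q8_0": 8.5}


def estimate_model_gib(params_billions: float, bits_per_weight: float) -> float:
    """Estimate weight size in GiB as parameters * bits / 8 bytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(model_gib: float, vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if the weights plus an assumed overhead for context fit in VRAM."""
    return model_gib + overhead_gib <= vram_gib


def best_quant(params_billions: float, vram_gib: float, quants=QUANTS,
               overhead_gib: float = 1.5):
    """Return the name of the highest-bit quant that fits, or None if none do."""
    fitting = [
        (bits, name)
        for name, bits in quants.items()
        if fits_in_vram(estimate_model_gib(params_billions, bits), vram_gib, overhead_gib)
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for vram in (4, 8, 12):
        print(f"7B model, {vram} GiB VRAM -> {best_quant(7, vram)}")
```

If nothing fits, the choice is a smaller model or offloading only part of it to the GPU, which is slower.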
>>716294391
GPU layers is something you need to experiment with. Kobold can only guess a value, but you can set it manually and test whether your outputs get faster or slower as you go up or down. Move a few layers at a time until you dial in something that seems reasonable. If it's still slow as shit, make sure your model actually fits in your GPU's VRAM.
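The trial-and-error above is basically a small hill climb, and it can be sketched like this. `measure_tps` is a hypothetical callback you'd write yourself (load the model with that many GPU layers, generate a bit, return tokens per second). It is not a Kobold function. The step size and the toy speed curve are made up for illustration.

```python
# Sketch of "go up/down a few layers at a time and keep whatever is faster".
from typing import Callable, Dict, Tuple


def tune_gpu_layers(measure_tps: Callable[[int], float], start: int,
                    max_layers: int, step: int = 4) -> Tuple[int, float]:
    """Hill-climb the GPU layer count; return (best_layers, best_tokens_per_sec)."""
    cache: Dict[int, float] = {}

    def timed(layers: int) -> float:
        # Measure each setting only once so the search always terminates.
        if layers not in cache:
            cache[layers] = measure_tps(layers)
        return cache[layers]

    best_layers = max(0, min(start, max_layers))
    best_tps = timed(best_layers)
    improved = True
    while improved:
        improved = False
        # Try one step up, then one step down, clamped to the valid range.
        for candidate in (best_layers + step, best_layers - step):
            candidate = max(0, min(candidate, max_layers))
            if candidate == best_layers:
                continue
            tps = timed(candidate)
            if tps > best_tps:
                best_layers, best_tps = candidate, tps
                improved = True
                break
    return best_layers, best_tps


if __name__ == "__main__":
    # Toy stand-in for a real benchmark: speed peaks at 28 layers (made up).
    def fake_measure(layers: int) -> float:
        return 20 - abs(layers - 28) * 0.5

    print(tune_gpu_layers(fake_measure, start=20, max_layers=40))
```

Real measurements are noisy, so run each setting a couple of times before trusting a small difference.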