Anonymous
7/8/2025, 12:32:41 AM
No.8654028
>>8654010
Textgen is pretty easy. You only need KoboldCPP (they have precompiled executables) and a model. Once you find a model you like, look for its GGUF files on HF and download a quant according to https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
As for settings, eh, there's a lot of bullshit around but with how fried these new models are you only need temperature 1, repetition penalty 1, top k 35, and top p 0.90.
Tavern hooks up to it and you're done.
Although if you want pure chat with no weird memories etc. you can also use LMStudio, it's a llama.cpp frontend but it's sleek and normiefriendly.
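If you'd rather script it than use a frontend, the sampler settings above map straight onto KoboldCPP's generate API. A minimal sketch (the port, prompt, and max_length are illustrative assumptions; field names follow the KoboldAI-style /api/v1/generate endpoint KoboldCPP exposes):

```python
import json

# Sampler settings from the post as a KoboldCPP /api/v1/generate payload.
# Prompt and max_length are placeholder values, not recommendations.
payload = {
    "prompt": "Once upon a time,",
    "max_length": 200,
    "temperature": 1.0,  # temp 1
    "rep_pen": 1.0,      # repetition penalty 1 (effectively disabled)
    "top_k": 35,
    "top_p": 0.90,
}

# With KoboldCPP running (default port 5001), you would POST it like:
#   requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(json.dumps(payload))
```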