>>106996499
If the model can't perform with a basic min-p, or maybe nsigma (TBD)... Temp just skews the model's distribution (logits = old / t, before softmax); there is no concept of temperature in training. If you're interested in temperature, try dynamic temp and mod your inference stack to log the params at each sample, ideally in a format you can easily graph. There's too much woo-woo around sampling. Get data.
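Rough sketch of what I mean by logging, stdlib only, not tied to any real inference stack. Assumptions: the entropy-based dynamic temp rule here is just one common formulation, the function names and defaults (`min_temp`, `max_temp`, `exponent`, `min_p`) are made up for illustration, and applying temp before min-p is an arbitrary ordering choice.

```python
import json
import math
import random


def softmax(logits, temp=1.0):
    # Temperature only rescales the logits before softmax: logit / temp.
    scaled = [x / temp for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    # Shannon entropy in nats; zero-probability tokens contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def dynamic_temp(logits, min_temp=0.5, max_temp=1.5, exponent=1.0):
    # Assumed rule: flat distribution -> temp near max_temp,
    # peaked distribution -> temp near min_temp.
    if len(logits) < 2:
        return min_temp
    h = entropy(softmax(logits))
    ratio = min(1.0, h / math.log(len(logits)))  # normalized entropy in [0, 1]
    return min_temp + (max_temp - min_temp) * ratio ** exponent


def min_p_filter(probs, min_p=0.05):
    # Keep tokens with prob >= min_p * top prob, then renormalize.
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_and_log(logits, rng, log_file=None, min_p=0.05):
    # One sampling step that also records the params it actually used.
    temp = dynamic_temp(logits)
    probs = min_p_filter(softmax(logits, temp), min_p)
    token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    record = {
        "temp": temp,
        "entropy": entropy(softmax(logits)),
        "kept": sum(1 for p in probs if p > 0.0),
        "token": token,
        "token_prob": probs[token],
    }
    if log_file is not None:
        log_file.write(json.dumps(record) + "\n")  # one JSON object per line
    return token, record
```

Pass an open file as `log_file` and you get JSONL, one line per sampled token, which most plotting tools can load directly. Then graph temp against entropy or kept-token count and see whether the sampler is doing what you think it is.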
>>106996592
Have you done anything new or interesting with your LLMs recently? Not cooming, silly boy!