Anonymous
7/23/2025, 2:38:52 PM
No.105998317
>>105997931
>They can't because there is no way to quantify what "good" is for RP.
I would start with output variety, determinism, and maybe checking whether some key words are there. Even if you can't tell whether it's good, at least let me reroll and get wildly different things that pass some basic coherence check. I think the problem is that when the majority of training is about finding a single correct answer to a problem, you will never get a really good RP model.
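A rough sketch of what those checks could look like over a batch of rerolls for the same prompt. Everything here is an assumption for illustration: the function names are made up, tokenization is plain whitespace splitting, variety is approximated with a distinct-n-gram ratio, and determinism with average word-set overlap (Jaccard) between rerolls. It says nothing about whether an output is "good", only whether rerolls actually differ and mention what they should.

```python
from itertools import combinations


def tokens(text):
    # Naive whitespace tokenization; an assumption, not a real tokenizer.
    return text.lower().split()


def distinct_n(outputs, n=2):
    """Fraction of n-grams across all rerolls that are unique (higher = more varied)."""
    grams = []
    for out in outputs:
        toks = tokens(out)
        grams.extend(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(set(grams)) / len(grams) if grams else 0.0


def mean_pairwise_jaccard(outputs):
    """Average word-set overlap between rerolls (1.0 = rerolls use identical vocabulary)."""
    sets = [set(tokens(o)) for o in outputs]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    scores = [len(a & b) / len(a | b) if a | b else 1.0 for a, b in pairs]
    return sum(scores) / len(scores)


def has_keywords(output, keywords):
    """True if every expected key word appears somewhere in the output (case-insensitive)."""
    text = output.lower()
    return all(k.lower() in text for k in keywords)
```

Low `distinct_n` together with high `mean_pairwise_jaccard` would suggest the rerolls are near-copies of each other; `has_keywords` is only a crude stand-in for a coherence check.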