6/18/2025, 8:29:53 PM
>>105633018
>e.g. [["Hello, World!",-0.5]] will reduce the likelihood of all the individual tokens
>all the individual tokens
As long as it doesn't interfere with forming multi-token words, sure. It's hard to notice until you make it generate that word specifically. On smollm360 it ends up banning "h", "double", and "check", and that will change with different tokenizers, so it's not going to be consistent between models.
Picrel are the words the paper suggests banning. Some of those will need variations with commas, periods and/or spaces ("wait" and " wait" are different tokens).
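Rough sketch of what I mean, in case it helps. Everything here is made up for illustration: TOY_VOCAB and toy_tokenize are a hypothetical stand-in for a real tokenizer (not smollm's actual vocabulary), and the banned strings are just examples, not the paper's list. It only shows how a string-level bias turns into per-token biases, and how you'd generate the space/punctuation variants.

```python
# Hypothetical toy vocabulary; a real model's tokenizer will split differently.
TOY_VOCAB = ["double", "check", "wait", " wait", "h", "mm", "-", " ", ",", "."]

def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split, standing in for a real tokenizer."""
    tokens = []
    i = 0
    while i < len(text):
        candidates = [t for t in vocab if text.startswith(t, i)]
        # Unknown characters become single-character tokens.
        match = max(candidates, key=len) if candidates else text[i]
        tokens.append(match)
        i += len(match)
    return tokens

def per_token_bias(entries):
    """Expand [[string, bias], ...] into {token: bias} for every token in each string."""
    bias = {}
    for text, value in entries:
        for tok in toy_tokenize(text):
            bias[tok] = value
    return bias

def variants(word):
    """Lower/capitalized forms, with and without a leading space or trailing comma/period."""
    out = []
    for base in (word, word.capitalize()):
        for prefix in ("", " "):
            for suffix in ("", ",", "."):
                out.append(prefix + base + suffix)
    return out

# Biasing two strings ends up biasing every piece they split into,
# including short pieces like "h" that other words also need.
banned = per_token_bias([["hmm", -0.5], ["double-check", -0.5]])
print(banned)

# "wait" and " wait" map to different tokens in this toy vocabulary,
# so each variant has to be listed separately.
print(toy_tokenize("wait"), toy_tokenize(" wait"))
print(variants("wait"))
```

With a real model you'd swap toy_tokenize for the model's own tokenizer and check which pieces come out before applying the bias, since the split is what decides what actually gets banned.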