Search Results

Found 2 results for "0a62c398a911a66a989ade3b0e817903" across all boards searching md5.

Anonymous /g/105909674#105910874
7/15/2025, 6:04:40 AM
>>105910857

Can anyone explain how it is possible to run a 1T model at 200-300 tokens/second without quantizing it to death? Even on LPUs.

(see >>105910860, it did actually make greentext, there was just markdown mode enabled)

Anonymous /g/105906067#105910857
7/15/2025, 6:02:03 AM
>>105910665
>>105910689
this is fucking illegal btw. I tested it on my programming questions and it's basically the same as normal K2; they didn't quantize it much. That speed is insane, and on the API I'm seeing up to 350 tokens/s in some cases.
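
On the question of how a 1T model can decode at 200-350 tokens/s, a rough bandwidth estimate helps. The sketch below assumes "K2" means Kimi K2, a mixture-of-experts model with roughly 1T total parameters but only about 32B active per token, and that single-stream decoding is limited by how fast the active weights can be read from memory. It ignores KV-cache reads, batching, and speculative decoding, so treat the numbers as an order-of-magnitude guide, not a description of any provider's actual setup. The function names are made up for illustration.

```python
def bytes_per_token(active_params: float, bits_per_weight: float) -> float:
    """Weight bytes read to decode one token at batch size 1."""
    return active_params * bits_per_weight / 8


def required_bandwidth(active_params: float, bits_per_weight: float,
                       tokens_per_second: float) -> float:
    """Aggregate memory bandwidth (bytes/s) needed to sustain the given speed."""
    return bytes_per_token(active_params, bits_per_weight) * tokens_per_second


TOTAL_PARAMS = 1.0e12   # "1T model" from the thread
ACTIVE_PARAMS = 32e9    # assumption: ~32B parameters active per token (MoE)

if __name__ == "__main__":
    for label, params in (("dense 1T", TOTAL_PARAMS), ("MoE, 32B active", ACTIVE_PARAMS)):
        for bits in (16, 8, 4):
            for tps in (200, 300, 350):
                bw = required_bandwidth(params, bits, tps)
                print(f"{label:>16} | {bits:>2}-bit | {tps} tok/s | {bw / 1e12:8.1f} TB/s")
```

Under these assumptions the MoE rows need about 30x less bandwidth than the dense rows at the same precision, since only the active experts are read per token. That is the main reason the speed does not by itself imply heavy quantization.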