7/2/2025, 6:06:10 PM
So that's why sometimes I see the context get reprocessed for seemingly no reason.
What the hell, how is this not a priority bug?
I get that there are only so many hands that can actually fix something like this, but still.
Maybe they could cache the plain text alongside the KV cache and the corresponding logits, then match each new prompt against that cached text instead of re-tokenizing the whole prompt every time.
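To make that idea concrete, here is a rough sketch of what I mean. It assumes a single cached prefix, and every name in it (`CachedPrefix`, `TextKeyedCache`, `plan`, `toy_tokenize`) is made up for illustration, not taken from any real backend. The `kv_state` field is just a placeholder for whatever handle the backend uses for its KV cache.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class CachedPrefix:
    text: str                 # plain text the cached state was computed from
    tokens: List[int]         # token ids that actually went through the model
    kv_state: object = None   # placeholder for the backend's KV cache handle


class TextKeyedCache:
    """Hypothetical cache that matches prompts on plain text, not on token ids."""

    def __init__(self, tokenize: Callable[[str], List[int]]):
        self.tokenize = tokenize
        self.entry: Optional[CachedPrefix] = None

    def plan(self, prompt: str) -> Tuple[List[int], List[int]]:
        """Return (reused_tokens, new_tokens) for this prompt."""
        entry = self.entry
        if entry is not None and prompt.startswith(entry.text):
            # Same text prefix: keep the tokens the model already saw and
            # tokenize only the part of the prompt that was appended.
            reused = list(entry.tokens)
            new = self.tokenize(prompt[len(entry.text):])
        else:
            # No usable prefix: everything has to be processed from scratch.
            reused = []
            new = self.tokenize(prompt)
        self.entry = CachedPrefix(text=prompt, tokens=reused + new)
        return reused, new


def toy_tokenize(text: str) -> List[int]:
    # Stand-in tokenizer (one token per character), only for the demo.
    return [ord(ch) for ch in text]


cache = TextKeyedCache(toy_tokenize)
first = cache.plan("hello")          # nothing cached yet, all tokens are new
second = cache.plan("hello world")   # "hello" is reused, only " world" is new
third = cache.plan("bye")            # different text, cache cannot be reused
```

The catch is that tokenizing only the appended text can give different tokens at the boundary than tokenizing the full prompt would. The cached state still matches the tokens the model actually processed, but whether that is acceptable for output quality is a separate question.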