6/23/2025, 6:16:57 AM
>>105661997
>It forgets to use the <think> tags and just shits out its reasoning as-is
What I see LongWriter-Zero do is shit out its reasoning as-is, then include more thoughts inside <think> tags, then an answer in <answer> tags (with a colon after the closing tag: </answer>:), then more thoughts inside <think> tags, then more output inside <answer> tags, and so forth, all within a single message.
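If you want to pull the actual output out of a message shaped like that, here is a rough sketch in Python. It assumes the tags are always properly closed and never nested, which may not hold for every generation. The names split_blocks and answers_only are made up for this example and are not part of any tool.

```python
import re

# One <think>...</think> or <answer>...</answer> block. The backreference \1
# forces the closing tag to match the opening one, and the optional ":"
# swallows the stray colon seen after the closing tag.
BLOCK_RE = re.compile(r"<(think|answer)>(.*?)</\1>:?", re.DOTALL)


def split_blocks(message):
    """Return (kind, text) tuples in order of appearance.

    kind is "think", "answer", or "untagged" for text outside any tag pair.
    """
    parts = []
    pos = 0
    for m in BLOCK_RE.finditer(message):
        loose = message[pos:m.start()].strip()
        if loose:
            parts.append(("untagged", loose))
        parts.append((m.group(1), m.group(2).strip()))
        pos = m.end()
    tail = message[pos:].strip()
    if tail:
        parts.append(("untagged", tail))
    return parts


def answers_only(message):
    """Join just the <answer> blocks, dropping reasoning and untagged text."""
    return "\n\n".join(
        text for kind, text in split_blocks(message) if kind == "answer"
    )
```

An unclosed tag simply ends up in the "untagged" bucket, so it is worth eyeballing the split before trusting answers_only on a long generation.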
>The model page recommends this format <|user|>: {question} <|assistant|> but that gave me totally schizo (and chinese) responses. Using the qwen2 format is better imo.
To sidestep this bullshit I used llama.cpp's OpenAI-style chat completion endpoint and the jinja template. No system prompt or anything other than what the template itself adds.
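For reference, a minimal sketch of that kind of request in Python, standard library only. Assumptions that are not from the post: llama-server was started with --jinja and is listening on localhost:8080, and the max_tokens value is arbitrary. build_payload and chat are illustrative names.

```python
import json
import urllib.request

# Assumption: llama-server running locally with --jinja on port 8080.
URL = "http://localhost:8080/v1/chat/completions"


def build_payload(prompt, max_tokens=2048):
    # A single user message and no system prompt, so the only extra text
    # the model sees is whatever the chat template itself adds.
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # arbitrary value chosen for this sketch
    }


def chat(prompt, url=URL):
    """POST the prompt to the chat completion endpoint and return the reply text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Since the server applies the template, the client never has to hand-write the <|user|> / <|assistant|> markers or guess at the qwen2 format.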
>it repeats itself a lot
Yes.