>>105654616
For Magistral you can take the official prompt and modify it a bit so the non-<think> part of the reply contains only the character's responses; this works fine for chat use.

>>105655036
>>105655147
>You can train and optimize an LLM all you want and it's not gonna be AGI.
>he is stuck in an infinite loop because LLMs are not conscious.
The consciousness problem likely isn't that hard to solve, yet very few seem to try. They may still get to it once they realize it's needed for good performance, the same way evolution got there much earlier.
The "solution" to AGI with LLMs likely looks like implementing a few fixes, such as:
- online learning plus long- and mid-term memory, either:
a. a way to put the context into weights directly (self-distill hidden states or logits), learning from single samples
b. indirectly, by training something that compresses activations at various layers and injects them back when certain early-layer activations trigger them: a way to remember and "recall" that goes beyond RAG
- a way to see its own thoughts, to achieve self-consciousness, for example:
a. a loop where late latents are fed back to early ones, e.g. by cross-attention into an early layer, introducing recurrence. SGD doesn't play very well with recurrence, but I would guess there are many workarounds that would work.
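To make the memory idea in (b) concrete, here's a minimal numpy sketch. Everything in it is a stand-in: `W_down`/`W_up` play the role of a trained compressor/decompressor, and a cosine-similarity threshold on early-layer activations plays the role of whatever learned trigger mechanism you'd actually want.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 16, 4  # hidden size, compressed memory code size

# Stand-ins for a trained autoencoder over mid-layer activations.
W_down = rng.normal(size=(D, M)) / np.sqrt(D)
W_up = rng.normal(size=(M, D)) / np.sqrt(M)

memory_keys, memory_vals = [], []

def remember(mid_activation):
    """Compress a mid-layer activation and store it, keyed by itself."""
    code = mid_activation @ W_down                  # compress
    memory_keys.append(mid_activation / np.linalg.norm(mid_activation))
    memory_vals.append(code @ W_up)                 # injected on recall

def recall(early_activation, threshold=0.3):
    """If an early-layer activation resembles a stored key, add the
    decompressed memory into the residual stream; otherwise pass through."""
    if not memory_keys:
        return early_activation
    q = early_activation / np.linalg.norm(early_activation)
    sims = np.stack(memory_keys) @ q                # cosine similarities
    best = int(np.argmax(sims))
    if sims[best] > threshold:
        return early_activation + memory_vals[best]
    return early_activation
```

The point is just the shape of the mechanism: compression happens once at write time, and recall is a cheap similarity lookup that modifies activations in-place rather than pasting tokens into the context the way RAG does.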
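And a toy sketch of the recurrence loop in (a): late latents from one forward pass are cross-attended by the early layer on the next pass. All weight matrices here are random stand-ins for trained layers; the sketch only shows the wiring, not a workaround for the SGD issues.

```python
import numpy as np

rng = np.random.default_rng(1)
D, T = 8, 5  # hidden size, sequence length

# Random stand-ins for trained layer weights and attention projections.
W_early = rng.normal(size=(D, D)) / np.sqrt(D)
W_late = rng.normal(size=(D, D)) / np.sqrt(D)
Wq = rng.normal(size=(D, D)) / np.sqrt(D)
Wk = rng.normal(size=(D, D)) / np.sqrt(D)
Wv = rng.normal(size=(D, D)) / np.sqrt(D)

def layer(h, W):
    """A toy 'transformer layer': linear map plus nonlinearity."""
    return np.tanh(h @ W)

def cross_attend(h_early, h_late):
    """Early-layer states attend over the previous pass's late latents."""
    q, k, v = h_early @ Wq, h_late @ Wk, h_late @ Wv
    scores = q @ k.T / np.sqrt(D)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return h_early + attn @ v                       # residual injection

def forward(x, passes=3):
    """Run the stack repeatedly; from the second pass on, late latents
    are fed back into the early layer, introducing recurrence."""
    h_late = None
    for _ in range(passes):
        h = layer(x, W_early)
        if h_late is not None:
            h = cross_attend(h, h_late)
        h_late = layer(h, W_late)
    return h_late
```

With `passes=1` this is an ordinary feed-forward stack; with more passes the model effectively "sees" its own earlier computation, which is the self-observation property being argued for.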

continues