>>105611563
>never
but that PR was closed with the note that it could be revisited after refactoring, and they seem to have later done the refactor here:
https://github.com/ggml-org/llama.cpp/pull/12181
>Introduce llm_memory_i concept that will abstract different cache/memory mechanisms. For now we have only llama_kv_cache as a type of memory
and it looks like work has since picked up on other models with competing cache mechanisms (mamba etc.):
https://github.com/ggml-org/llama.cpp/pull/13979
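for the curious, the llm_memory_i idea is basically: make the model code talk to a generic memory interface instead of assuming a KV cache, so recurrent state (mamba) or hybrid setups can slot in as alternative implementations. rough C++ sketch of the shape of it (names and method signatures are illustrative guesses, not the actual llama.cpp headers):

// rough sketch only - illustrative, not the real llama.cpp interface
#include <cstdint>

struct llm_memory_i {
    virtual ~llm_memory_i() = default;

    // remove cached state for sequence seq_id over positions [p0, p1)
    virtual bool seq_rm(int32_t seq_id, int32_t p0, int32_t p1) = 0;

    // drop all cached state
    virtual void clear() = 0;
};

// the classic attention KV cache becomes just one implementation...
struct llm_kv_cache : llm_memory_i {
    bool seq_rm(int32_t seq_id, int32_t p0, int32_t p1) override {
        // evict the KV cells covering [p0, p1) for seq_id
        return true;
    }
    void clear() override {
        // reset all cells
    }
};

// ...and a mamba-style recurrent state can be another, with its own rules
struct llm_recurrent_state : llm_memory_i {
    bool seq_rm(int32_t seq_id, int32_t p0, int32_t p1) override {
        // a recurrent state can't be partially rewound, so only
        // removal from position 0 (a full wipe of the sequence) succeeds
        return p0 <= 0;
    }
    void clear() override {
        // zero the hidden states
    }
};

the hybrid work presumably composes an attention cache and a recurrent state side by side behind that interface, which is exactly the kind of plumbing a hybrid-attention model like minimax needs.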

now we just need someone with motivation, a vibe coding client, and good enough prompt engineering skills to revisit minimax, and we're fucking IN