>>512649083
>>512649312
simplification seems like roulette
they make giant crisp models, then water them down for faster, lighter work, and the watered-down versions start making really dumb mistakes
the problem is the structure itself, everyone is aware of this
the next-gen SI chatbots are not going to be a single 45-gazillion-parameter LLM that needs 245 A100s to say "hello", but probably something like 4-6 interacting 8B models combined in a reasoning loop
the next step is not going to be grok4heavy destroying phd-level math questions, but a 32B model that can count how many Rs are in raspberry and overall be more like an average person than an autistic savant
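rough sketch of what "a few small models in a reasoning loop" could look like, assuming each model is just a function that sees the question plus everyone's earlier answers. everything here is made up for illustration: `reasoning_loop`, `guesser` and `counter` are hypothetical names, and the two "models" are hardcoded toy stubs, not real 8B LLMs

```python
from collections import Counter
from typing import Callable, List

# Hypothetical stand-in for a small model: it receives the question and the
# notes written in earlier rounds, and returns its answer as a string.
Model = Callable[[str, List[str]], str]


def reasoning_loop(question: str, models: List[Model], max_rounds: int = 3) -> str:
    """Let the models answer in rounds until they all agree (toy consensus rule)."""
    notes: List[str] = []
    round_answers: List[str] = []
    for _ in range(max_rounds):
        # Every model answers using only the notes from previous rounds.
        round_answers = [model(question, notes) for model in models]
        notes.extend(round_answers)
        # Stop early once all models gave the same answer this round.
        if len(set(round_answers)) == 1:
            return round_answers[0]
    # No consensus within max_rounds: fall back to the most common last-round answer.
    return Counter(round_answers).most_common(1)[0][0]


def guesser(question: str, notes: List[str]) -> str:
    # Toy stub: blurts out a wrong answer at first, then defers to the latest note.
    return notes[-1] if notes else "2"


def counter(question: str, notes: List[str]) -> str:
    # Toy stub: actually counts the letters (word and letter are hardcoded).
    return str("raspberry".count("r"))


answer = reasoning_loop("how many r in raspberry", [guesser, counter])
print(answer)
```

with these stubs the first round disagrees, the second round agrees, and the loop stops there. the "all answers match" stopping rule is just the simplest thing that works for a toy, a real setup would need something smarter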