>>715671258
it's been known that they fake the benchmarks, like the Devin scandal where they didn't even run trials, just straight up posted fake scores

there was in fact some very legit AI that did score high on genuine graduate-level math tests in like late 2023, but it was symbolic and very limited in the type of math it could do (i.e. it wasn't an LLM or even a transformer model)