8/8/2025, 12:38:26 PM
>>717578429
Opus 4.1 benchmarks several percent better across the board at agentic tasks and planning compared to Opus 4. Even small improvements start compounding the longer the task runs.
There's around a 10% benchmark gap between Sonnet 3.7 and Opus 4.0, and a 4-5% gap between 4.0 and 4.1.
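To put rough numbers on the compounding point, here's a minimal sketch. It assumes a task is a chain of steps that each succeed independently with the same probability, which is a simplification and not how the benchmarks are scored. The 0.90 and 0.94 per-step rates are made up for illustration and are not benchmark figures.

```python
def task_success(per_step: float, steps: int) -> float:
    """Chance of completing `steps` consecutive steps with no failure,
    assuming each step succeeds independently with probability `per_step`."""
    return per_step ** steps


# Hypothetical per-step success rates (illustrative only, not benchmark scores).
base, improved = 0.90, 0.94

for steps in (1, 10, 50):
    a = task_success(base, steps)
    b = task_success(improved, steps)
    # The advantage ratio equals (improved / base) ** steps, so it grows with task length.
    print(f"{steps:>3} steps: {a:.3f} vs {b:.3f} (ratio {b / a:.2f}x)")
```

Under that assumption the relative advantage is (improved / base) raised to the number of steps, so a gap that looks small on a single step gets bigger the longer the task is.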