Anonymous /v/717577275#717578796
8/8/2025, 12:38:26 PM
>>717578429
Opus 4.1 benchmarks several percent better across the board at agentic tasks and planning compared to Opus 4. Even small improvements start compounding the longer the task runs.

There's around a 10% benchmark gap between Sonnet 3.7 and Opus 4.0, and a 4-5% gap between 4.0 and 4.1.
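
Rough sketch of the compounding point, assuming a task is a chain of independent steps that all have to succeed. The per-step rates below are made-up illustrative numbers, not the benchmark figures above (a benchmark gap is not a per-step success rate), and `task_success` is just a hypothetical helper name.

```python
def task_success(per_step: float, steps: int) -> float:
    """Chance of finishing a task when every step must succeed independently."""
    return per_step ** steps


# Hypothetical per-step success rates, NOT measured benchmark numbers.
baseline, improved = 0.95, 0.97

for steps in (1, 10, 50):
    a = task_success(baseline, steps)
    b = task_success(improved, steps)
    # The ratio b / a equals (improved / baseline) ** steps, so it grows with task length.
    print(f"{steps:>3} steps: {a:.3f} vs {b:.3f} (ratio {b / a:.2f}x)")
```

Under that independence assumption, a two-point per-step edge barely shows on a single step but becomes a much larger relative difference over a long chain. Real agentic runs can recover from mistakes, so treat this as an upper-bound intuition rather than a prediction.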