- underperformed on ARC-AGI
- in line with expectations on METR
- 2024 knowledge cutoff, which is really bad
- 400k context window, in line with expectations
- no multimodal video input
- VERY competitive pricing
- solid improvements on hallucinations