>>106350992
OK so this is what I am seeing.
Left is V3.1 early in the session. It's a worse version of Claude here and actually depressing/painful/tedious to use.
Right is "Dipsy mode" where you can ask for a change and it does what you want, not just what you say, and will run with it.
I hate to say it but the coomers' impressions were correct: the initial autism mode is bad. It's IA not AI haha. But if you give it time (or context? or turns? not sure) you can break into Dipsy mode, and V3.1 is actually extremely capable.
It's very different from Kimi, Qwen, and even Claude in my experience.
>>106350034
Maybe I'll set up a neocities site or something to keep track of all of this. All of my agentic coding tests are single HTML files with no external dependencies.
It's actually fun seeing the models set up their own landing page with agentic tools. I don't give a lot of specific UI tasks, so the outcome is sometimes pretty unique.