>but wait, didn't...
>but wait, wasn't...
>but wait...
Is this common for MoE thinking models, or unrelated? Anyway, GLM-4.5V with reasoning fucking sucks. I built my own vision benchmark out of my business docs and it failed horribly in every way possible. I ran the same benchmark through Gemini 2.5 Pro with thinking and it aced it; it even made me realize one of my gold answers was wrong, because I had missed critical information in one of the rows. Maybe it's not fair to pit a 108B model against whatever Gemini 2.5 Pro is (800B?), but since this is one of the top opensauce VLMs right now, why even fucking bother with local. Seriously.
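For anyone curious, the harness was nothing fancy, just a loop over (image, question, expected answer) cases. A minimal sketch, with a hypothetical `ask_model` standing in for whatever API or local inference call you actually use; the nice side effect is that collecting the failure tuples is exactly how I caught the bad gold answer (a strong model "failing" one case suspiciously cleanly):

```python
def ask_model(image_path: str, question: str) -> str:
    # Placeholder: wire this up to GLM-4.5V, Gemini, or any VLM you test.
    raise NotImplementedError

def score(cases, model=ask_model):
    """cases: list of (image_path, question, expected_answer) tuples.

    Returns (accuracy, failures). Each failure keeps the model's raw
    answer next to the expected one, so a wrong gold label is easy to
    spot when reviewing misses by hand.
    """
    failures = []
    for image_path, question, expected in cases:
        answer = model(image_path, question).strip().lower()
        # Loose substring match; swap in stricter scoring if you need it.
        if expected.strip().lower() not in answer:
            failures.append((image_path, question, expected, answer))
    n = len(cases)
    accuracy = (n - len(failures)) / n if n else 0.0
    return accuracy, failures
```

Substring matching is crude (a verbose model gets credit for burying the answer in a paragraph), but for eyeballing one model against another on your own docs it's plenty.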