>>105620944
>so why are you even surprised that only cloud models are able to do that?
Because I was under the impression that the largest open source models would BTFO the -mini and Flash commercial models in all tasks. I was hoping somebody would prove me wrong, but it seems that the vision capabilities of local models are just worse.
>Model A is a cloud model and model B is running on my own computer, model B is superior by default.
That is the case when you are using them for NSFW stuff. For automating boring rote work, not so much.