>>106292715
>>106292729

The local SOTA for VLMs is InternVL3. You can run it on Ollama if you have a good enough GPU, plus enough system RAM if you offload part of the model to RAM.

https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
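
If you go the Ollama route, captioning from a script looks roughly like this. It's a minimal sketch, not a tested setup: it assumes Ollama is serving on its default port 11434, and the model tag `internvl3` is a placeholder I made up (use whatever name `ollama list` shows for the model you actually pulled). The helper names `build_payload` and `caption` are just for illustration.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Hypothetical tag: replace with the name `ollama list` reports on your machine.
MODEL_TAG = "internvl3"


def build_payload(image_bytes, prompt, model=MODEL_TAG):
    """Build the JSON body: images are sent as base64 strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def caption(image_path, prompt="Describe this image in detail."):
    """Send one image to the local Ollama server and return the text reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Then `print(caption("some_image.jpg"))` with the server running. Loop it over a folder if you're captioning a dataset.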


>JoyCaption
This meme has to die. That thing is only useful if you are captioning porn; it's not suitable for anything else.
>Florence
If you are OK with low-quality captions, sure.