7/3/2025, 4:50:11 AM
What's the minimum model size for somewhat accurate image-to-text completion?
I want to send my chatbots images and upgrade from the over-a-year-old Mixtral 8x7B model I have been using, but the newish 7-12B models I have tried either crash when I try to launch them or think
>image related
is somewhere between a pancake breakfast, an exotic butterfly, or a man resting on a bed,
but can't tell that it's a cat.
I dread what bullshit these models might try to pull if anything explicit is sent their way. Are small image-to-text models just useless, and do I need to shell out for more VRAM? (Tried both local and multimodal in ST.)
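For the VRAM part of the question, a rough back-of-the-envelope check is that the weights alone take about (parameter count × bits per weight / 8) bytes. The sketch below is only that estimate: the helper name `weights_gib` is made up for illustration, and it ignores the vision encoder, context/KV cache, and runtime overhead, so real usage will be higher.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every weight is stored at the same bit width. Vision
    encoder, KV cache, and runtime overhead are NOT included.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Compare the 7B and 12B sizes mentioned above at common bit widths.
    for name, params in [("7B", 7), ("12B", 12)]:
        for bits in (16, 8, 4):
            print(f"{name} at {bits}-bit: ~{weights_gib(params, bits):.1f} GiB")
```

If the estimate is already near the card's VRAM, crashes on launch are plausibly an out-of-memory problem rather than a broken model, though that is a guess without the actual error message.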