as someone who is retarded and has a shitter laptop that can't run any llm locally, is this my best option for something that's free? or should i just kms
Anonymous
8/2/2025, 6:01:46 PM
No.106117381
>>106117422
>>106117339 (OP)
I'm sure you can run the small ones...
Anonymous
8/2/2025, 6:05:02 PM
No.106117406
>>106117537
Why is the logo of every AI company a 6-pointed star?
Anonymous
8/2/2025, 6:07:15 PM
No.106117422
>>106117450
>>106117381
hell no, i got a Pentium Silver in this shit Lenovo IdeaPad, i can barely install ollama
Anonymous
8/2/2025, 6:09:55 PM
No.106117450
>>106118200
>>106117422
also 4GB memory, so TinyLlama and Phi-3 chug like goddamn Thomas
Anonymous
8/2/2025, 6:13:26 PM
No.106117487
>>106117458
blocked at my work unfortunately, but i get around that by running shit R1 distills on WebLLM
Anonymous
8/2/2025, 6:18:49 PM
No.106117537
>>106118538
>>106117406
It's either goatse or that jewish swastika star. OpenAI's logo is clearly an asshole.
Anonymous
8/2/2025, 6:21:49 PM
No.106117555
>>106117696
>>106117339 (OP)
Kimi K2 is also decent and the web chat is free.
Anonymous
8/2/2025, 6:22:21 PM
No.106117558
>>106117728
>>106117339 (OP)
Why does it have to be free? Just make an account at any of the 999999 LLM API companies and top it up with $20 (enough to fuck around and get an idea of what your usage is likely to be).
Anonymous
8/2/2025, 6:34:43 PM
No.106117696
>>106117555
But Kimi needs registration, Qwen doesn't.
Anonymous
8/2/2025, 6:35:45 PM
No.106117709
Diddling don rapes little kids.
Anonymous
8/2/2025, 6:36:52 PM
No.106117728
>>106117759
>>106117558
why? i'm stingy as fuck and don't want to pay for some dumbass api, sorry buddy
Anonymous
8/2/2025, 6:39:07 PM
No.106117759
>>106117728
Bro just throw twenty bucks their way, you can't be that cheap??
Anonymous
8/2/2025, 6:40:56 PM
No.106117781
>>106118100
surely you can run qwen3 0.6b
Anonymous
8/2/2025, 7:12:19 PM
No.106118100
>>106117781
Also Qwen3 4B at Q4 fits in about 2.5 gigs
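napkin math if you want to sanity-check that (a sketch: the 4.5 bits/weight is a rough average for Q4-style quants once you count scales and metadata, and it ignores KV cache):

params = 4e9            # Qwen3 4B weight count
bits_per_weight = 4.5   # Q4-ish quants average a bit over 4 bits/weight
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: ~{weights_gb:.2f} GB")  # ~2.25 GB, before KV cache and runtime overhead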
Anonymous
8/2/2025, 7:21:36 PM
No.106118188
>>106118200
>>106117339 (OP)
Depending on how much RAM you have, you can run it locally. I have 32GB on a mini PC and I run quantized qwen3:30b with ollama. With 8GB you should be able to run qwen3:8b.
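once the model is pulled you can hit ollama's local REST API from anything. a minimal sketch in Python, assuming the default port and that you've pulled qwen3:8b:

import requests

# minimal sketch: ollama's local generate endpoint (default port 11434)
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3:8b",   # swap in whatever fits your RAM
        "prompt": "Explain MoE in one sentence.",
        "stream": False,       # one JSON blob instead of a token stream
    },
    timeout=300,
)
print(resp.json()["response"])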
Anonymous
8/2/2025, 7:44:06 PM
No.106118389
>>106118483
create an openrouter account, minimally fund it, only use the free models.
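something like this once it's funded (a sketch: OpenRouter speaks the OpenAI chat-completions format, and free models are the IDs ending in :free; the list rotates, so treat the one below as a placeholder):

import os
import requests

# sketch of the free-models-only route via OpenRouter's
# OpenAI-compatible endpoint; needs OPENROUTER_API_KEY in the env
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-r1:free",  # placeholder free-tier model ID
        "messages": [{"role": "user", "content": "hello"}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])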
Anonymous
8/2/2025, 7:56:31 PM
No.106118483
>>106118389
this or litellm seems like the only real solution here. and are they still trying to get wllama out of demo? that might be another avenue as well
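the litellm route is basically the same thing with one call signature across providers. a sketch, again assuming OPENROUTER_API_KEY is set in the env and the model ID is a placeholder:

from litellm import completion

# litellm routes the openrouter/ prefix to OpenRouter and
# mimics the OpenAI response object
resp = completion(
    model="openrouter/deepseek/deepseek-r1:free",  # placeholder model ID
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)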
Anonymous
8/2/2025, 8:10:45 PM
No.106118597
>>106117339 (OP)
I have a 3090 and run qwen-coder-14b; it eats 18GB of VRAM and still sucks ass
Anonymous
8/2/2025, 9:35:06 PM
No.106119289
I like Qwen better than DeepSeek in my limited experience
Is DeepSeek actually better tho?
Anonymous
8/3/2025, 1:45:03 AM
No.106121403
>>106117339 (OP)
can't you just get a laptop/pc that has AVX-512? ollama supports that well now, and it's not too slow to run some models that way; that's probably the cheapest way to run something local.
Anonymous
8/3/2025, 2:07:53 AM
No.106121607
>>106117339 (OP)
Nothing you can run on a laptop is worth running.
Anonymous
8/3/2025, 2:13:50 AM
No.106121647
>>106121668
>want to run local models
>have an amd gpu
Anonymous
8/3/2025, 2:16:44 AM
No.106121668
>>106121647
>>have an amd gpu
so you have lotsa video ram and can run better models. lmao even. my gemma3 27b go brrrr
Anonymous
8/3/2025, 3:36:56 AM
No.106122249
Qwen3 30B A3B Instruct and Thinking on llama.cpp at IQ4_NL quantization is the best I've found yet, with no GPU, a 6-core i7, and 16 GB of RAM. It runs fine but lacks some knowledge, especially pop culture. Good at reasoning, good at non-role-play chat, good at returning JSON for ad-hoc function calling from shell scripts. Local is getting better. Also, wait to see what OpenAI releases for local soon; if it's meant for mobile or edge devices, it will probably work on no-GPU laptops too.
Some don't know this, but A3B means 3 billion active parameters: only a fraction of the experts fire per token, so it runs at 3B speed but has intelligence closer to a 30 billion parameter model. MoE is working well for those of us without a GPU, at least on Linux with llama.cpp.
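for the shell-script function calling bit, llama.cpp's llama-server exposes an OpenAI-compatible endpoint and can constrain output to valid JSON. a sketch, assuming the default port 8080 and a made-up schema in the system prompt:

import json
import requests

# sketch: ask a local llama-server for JSON only;
# response_format json_object forces parseable JSON output
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system",
             "content": 'Reply only with JSON like {"cmd": "...", "args": ["..."]}'},
            {"role": "user",
             "content": "list the five largest files in /var/log"},
        ],
        "response_format": {"type": "json_object"},
    },
    timeout=300,
)
print(json.loads(resp.json()["choices"][0]["message"]["content"]))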