Thread 105945105

64 posts 34 images /g/
Anonymous No.105945105 [Report] >>105945133 >>105945235 >>105945334 >>105945767 >>105945861 >>105947755 >>105947765 >>105948911 >>105949518 >>105949543 >>105952898
LLMs are a deadend
Trillions of dollars spent, trained on all the information that has ever existed, and they can't reliably multiply numbers together?
Anonymous No.105945118 [Report] >>105945235 >>105945569 >>105947755 >>105952536 >>105953200
they predict the likelihood of a token appearing next
>oh heres a bunch of text where letter 4 came after 2 plus 2 so its given higher value than all other letters

they dont do math
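
A toy sketch of that idea, assuming a plain count table over whitespace-split tokens. Real LLMs use a neural network over subword tokens, so `train` and `predict` here are hypothetical illustration names, not how any actual model is built:

```python
from collections import Counter, defaultdict

def train(corpus, context_size=4):
    """Count which token follows each context of `context_size` tokens."""
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = line.split()
        for i in range(context_size, len(tokens)):
            context = tuple(tokens[i - context_size:i])
            counts[context][tokens[i]] += 1
    return counts

def predict(counts, context):
    """Return the most frequently seen next token, or None if the context is unseen."""
    seen = counts.get(tuple(context))
    if not seen:
        return None
    return seen.most_common(1)[0][0]

# Made-up training text: "4" follows "2 + 2 =" more often than "5" does.
corpus = ["2 + 2 = 4", "2 + 2 = 4", "2 + 2 = 5", "3 + 3 = 6"]
counts = train(corpus)

seen_answer = predict(counts, ["2", "+", "2", "="])    # picked by frequency, not by adding
unseen_answer = predict(counts, ["7", "+", "8", "="])  # never seen, so no prediction at all
print(seen_answer, unseen_answer)
```

The table never adds anything. It only reports what followed the same context most often, which is the point being made above.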
Anonymous No.105945133 [Report] >>105945151 >>105945330 >>105945457 >>105945856
>>105945105 (OP)
give them a calculator like you would to a person and then try again
>NO NOT LIKE TH-AAACK
Anonymous No.105945151 [Report] >>105945163 >>105945177 >>105945232 >>105945856
>>105945133
Humans don’t need a calculator for 10x20. Or at least, the people who understand the limits of LLMs don’t.
Anonymous No.105945163 [Report] >>105945185 >>105945350 >>105952212
>>105945151
5510887582 x 22919720567438908647
hurry up little buddy
Anonymous No.105945177 [Report]
>>105945151
>the people who understand the limits of LLMs don’t.
the people who understand the limits of LLMs know what tokenization is and how it affects math performance, kruger
Anonymous No.105945185 [Report] >>105945194 >>105945208
>>105945163
The meltdown of a porn addict who is already feeling less and less each time he uses ai.
Anonymous No.105945194 [Report] >>105945230
>>105945185
Are you projecting again?
Anonymous No.105945208 [Report]
>>105945185
i consider this response a concession
Anonymous No.105945230 [Report] >>105945244
>>105945194
Got any comebacks that don’t come from an elementary school playground? lmao AI brainlets are something else
Anonymous No.105945232 [Report] >>105945283
>>105945151
What about the people that are too retarded to understand the image in the OP and realize llms can do arithmetic better than a human now? This has nothing to do with math btw, it's true that they suck at math.
Anonymous No.105945235 [Report] >>105950928
>>105945105 (OP)
>>105945118
My dick didn't grow longer after installing mistral!
Fuck this, I'm moving to Thailand.
Anonymous No.105945244 [Report] >>105950844
>>105945230
Most humans can't even do 7x4
6048337 x 6258
Maybe some mathematicians or tricksters in the 1800s but not even those people care nowadays.
A system that has a bunch of little useful programs accessible via natural language to everyone is the death of the midwit LARPer.
Anonymous No.105945283 [Report] >>105951461
>>105945232
Ahaha sorry I sometimes forget this board is flooded with subhuman jeets who can’t do 5x5. I can see why some of you might think the AI is very lifelike, but to actual human beings the times tables are no big deal.
Anonymous No.105945330 [Report] >>105945372 >>105945856
>>105945133
>Alright now just supervise them on how to use every tool ever
Anonymous No.105945334 [Report]
>>105945105 (OP)
Because it's not "intelligence" and will never become such.
Anonymous No.105945350 [Report] >>105945391
>>105945163
So like, what if you need to multiply 10 and 20 specifically?
Anonymous No.105945372 [Report]
>>105945330
>
Anonymous No.105945391 [Report]
>>105945350
Then going by the OP it can do that 100% of the time for however many tests they did.
Anonymous No.105945457 [Report] >>105945856 >>105949584
>>105945133
>give the computer a calculator
Anonymous No.105945465 [Report] >>105945538
OK FAGGOT TELL ME THE ANSWER TO 12345678901234567890*100
OR THIS BIRD WILL FUCKING STAB YOU IN THE EYE!
Anonymous No.105945538 [Report] >>105945572
>>105945465
If you multiply by 100 you can just add two zeroes at the end of the number, anon-kun!
Anonymous No.105945549 [Report]
>this talking unicorn does a very poor job when asked to direct a feature film
Anonymous No.105945569 [Report] >>105945597
>>105945118
It's a language model, not a math model
If you actually wrote it out like twenty three plus seventy you might have more luck
Anonymous No.105945572 [Report] >>105945586 >>105945609
>>105945538
OH YOU THINK YOU'RE FUCKING SMART HUH?
NOW DO 12345678901234567890*1000
Anonymous No.105945586 [Report] >>105952571
>>105945572
I... I don't know.
Hordu appu! You can solve it by just adding three free zeroes at the end of the number!
Anonymous No.105945597 [Report]
>>105945569
>If you actually wrote it out like twenty three plus seventy you might have more luck
It still doesn't 'know' how to do math, and it has tokens for numbers, so you don't need to spell them out for it. If anything that might reduce accuracy, because there's probably way less data where math is done using written-out numbers instead of the symbols.
Anonymous No.105945609 [Report]
>>105945572
12345678901234567890000
Anonymous No.105945767 [Report]
>>105945105 (OP)
who cares they write all the boilerplate code for me
Anonymous No.105945856 [Report]
>>105945133
>>105945151
>>105945330
>>105945457
That's what they are doing right now with <tool_call>. They are training them to use existing tools such as calculators so they don't have to store that information inside the model, freeing up a ton of space for other stuff.
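
A minimal sketch of what the host side of such a tool call could look like. The JSON payload shape, the `calculator` tool name, and the helper names `safe_eval` and `handle_model_output` are all assumptions for illustration; actual models and frameworks each define their own format:

```python
import ast
import json
import operator

# Only plain integer arithmetic is allowed; everything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}

def safe_eval(expr):
    """Evaluate an integer arithmetic expression without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def handle_model_output(text):
    """If the text contains a (hypothetical) calculator tool call, run it; else pass the text through."""
    start, end = "<tool_call>", "</tool_call>"
    if start not in text:
        return text
    payload = text.split(start, 1)[1].split(end, 1)[0]
    call = json.loads(payload)
    if call.get("name") != "calculator":
        raise ValueError("unknown tool")
    return str(safe_eval(call["arguments"]["expression"]))

model_output = (
    '<tool_call>{"name": "calculator", '
    '"arguments": {"expression": "12345678901234567890 * 1000"}}</tool_call>'
)
print(handle_model_output(model_output))
```

The model only has to emit the expression correctly; the exact arithmetic is done by ordinary integer code outside the weights.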
Anonymous No.105945861 [Report]
>>105945105 (OP)
(You) can?
Anonymous No.105947755 [Report] >>105947863
>>105945105 (OP)
>>105945118
this is the dumbest possible use of a neural network, you're literally running a complex multi-dimensional math algorithm to simulate a retard and then asking it to do math
how about make it in pieces and have one piece determine that a maths problem is happening and parse that to fucking python and return the result
it's not that hard
Anonymous No.105947765 [Report]
>>105945105 (OP)
Anonymous No.105947863 [Report]
>>105947755
Or you could just use your brain, like a non-retard.
Anonymous No.105948911 [Report] >>105948994
>>105945105 (OP)
This is such an easy problem to fix. Just have the LLM recognize a math calculation question and then have it delegate that bit to a calculator service. Why in the holy fuck hasn't this been done yet? LLMs can Google search shit and even Google has a calculator service automatically answer basic calculation queries
Anonymous No.105948994 [Report]
>>105948911
>Why in the holy fuck hasn't this been done yet?
It has. It doesn't stop people from trying to joggle pure models to improve.
Anonymous No.105949518 [Report] >>105949531 >>105950552
>>105945105 (OP)
How do you cope? o3-mini is not even close to first place on the benchmarks nowadays
Anonymous No.105949531 [Report] >>105949545
>>105949518
The cope is the same.
>Why don't model weights build the correct algorithm and loop it.
Ignoring the fact that human brains don't work like this.
Anonymous No.105949543 [Report]
>>105945105 (OP)
Ok, now show the benchmark of regular humans multiplying two random 20-digit numbers in their head.
Anonymous No.105949545 [Report] >>105949611
>>105949531
They do if you add consciousness and the desire to follow the algorithm.
For true intelligence you need consciousness.
Anonymous No.105949584 [Report]
>>105945457

I think LLMs have lost the plot if we are sincerely asking this question.
Anonymous No.105949611 [Report] >>105949680
>>105949545
It does try to follow the algorithm, but the problem is it can't. It's not seeing the manual multiplication, it's just seeing something like "106,041,768\n+ 795,313,260\n+ 2,651,044,200\n+39,765,663,000". How many numbers can you multiply if you're only allowed to think with text?
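
For reference, the quoted lines look like the partial products of schoolbook long multiplication. A short sketch, assuming the operands are 13255221 and 3268 (inferred from the quoted sums, not stated in the post); `partial_products` is just an illustrative helper name:

```python
def partial_products(a, b):
    """Schoolbook long multiplication: one partial product per nonzero digit of b."""
    parts = []
    place = 1
    while b > 0:
        b, digit = divmod(b, 10)  # peel off the lowest digit of b
        if digit:
            parts.append(a * digit * place)
        place *= 10
    return parts

# Operands inferred from the partial sums quoted above (an assumption).
parts = partial_products(13255221, 3268)
print(parts)
print(sum(parts) == 13255221 * 3268)
```

Each partial product is easy on its own. The error-prone step is the final column-by-column addition with carries, which is exactly the part that has to be carried out "in text".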
Anonymous No.105949680 [Report] >>105950373
>>105949611
It does not try to follow any algorithm you are familiar with. The text it produces may appear to if you prompt it to, but the actual functioning doesn't.
An LLM doesn't think with text; it takes text as input to a neural network. The text prediction algorithm it follows when asked to multiply is likely very complicated and specific to each input. Hence why it doesn't work in general, and why there is such a discrepancy between neighboring boxes in the plot.
Anonymous No.105950373 [Report] >>105950784
>>105949680
If they weren't "thinking" even in a more abstract sense, then how do you explain chain of "thought" models being so much smarter?
Anonymous No.105950552 [Report]
>>105949518
>Look now my calculator is only wrong 40% of the time instead of 90%
Still a shit calculator that no reasonable person should be using. Now think of all the other ways it can be wrong that are far harder to catch than by double checking with a calculator. Failed tool.
Anonymous No.105950784 [Report]
>>105950373
>how do you explain chain of "thought" models being so much smarter?
Non-CoT models often outperform CoT models in "simple tasks".

This is why Zed and Claude Code default to Sonnet 4 instead of Sonnet 4 Thinking. Less token usage prevents context from exploding, and you can selectively enable CoT for complex sub-tasks within a prompt.
Anonymous No.105950844 [Report]
>>105945244
There is actually a system to add, subtract and multiply large numbers in your head. I read a book on it when I was younger and learned the addition but never finished the book. All the nerds at these maths things use the same system.
Anonymous No.105950928 [Report] >>105951427
>>105945235
why are burgers so obsessed with cuckshit?
Anonymous No.105951427 [Report] >>105952885
>>105950928
It's a coping mechanism to the environment men find themselves in.

Modern dating is basically cuckoldry.
You will make vows of lifelong love and fidelity at the altar to a woman that has opened her legs to several men before you. She has sucked penises of, gotten anally defiled by, swallowed semen from, and even likely rimmed assholes of, other men that put much less effort into fucking her than you did into marrying her. All that so that she will later think of these men fondly while you try to please her.
Anonymous No.105951461 [Report] >>105951753
>>105945283
I know this is bait, but the OP shows how long the numbers are. The element in the fifth row and fifth column corresponds to something like 12345×98760, not 5×5.
Anonymous No.105951753 [Report] >>105952160
>>105951461
As someone else explained, that chart is showing it fails this every single time:
1234567890x100

Computing is the thing COMPUTERS are supposed to be best at, so it’s noteworthy that LLMs fail so badly. And it’s no wonder if you know the absolute basics about LLMs and how they work. But AI coomerbros want to have it both ways.
Anonymous No.105952160 [Report] >>105952548
>>105951753
(10 digits)x(3 digits) is above 90%, actually
Anonymous No.105952212 [Report]
>>105945163
>5510887582 x 22919720567438908647
1,263,272,981,903,155,902,903,879,654
chatgpt 4o
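
A quick way to check an answer like this, using Python's arbitrary-precision integers. A 10-digit number times a 20-digit number must have 29 or 30 digits, while the figure quoted above has only 28, so it cannot be the exact product. The variable names are just for illustration:

```python
a = 5510887582
b = 22919720567438908647

# The answer quoted in the post, with the thousands separators removed.
claimed = int("1,263,272,981,903,155,902,903,879,654".replace(",", ""))

exact = a * b  # Python ints do not overflow, so this is the exact product

print(f"{exact:,}")
print("digits in exact product:", len(str(exact)))
print("digits in claimed answer:", len(str(claimed)))
print("claimed answer is correct:", claimed == exact)
```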
Anonymous No.105952370 [Report] >>105952501 >>105952550
ai can code
so why doesnt ai just realize it should code a solution
Anonymous No.105952501 [Report]
>>105952370
Isn't giving AI the ability to code itself how the world ends?
Anonymous No.105952521 [Report]
you've been able to give complex math problems to wolfram alpha in plain english for years
just stick a llm on top of that
Anonymous No.105952536 [Report]
>>105945118
Yeah, this board is retarded as fuck. How does a tech board not understand what an LLM is
Anonymous No.105952548 [Report] >>105952839
>>105952160
> reading is hard
maybe ask gpt to interpret the charts for you
Anonymous No.105952550 [Report]
>>105952370
chatgpt has been doing this under the hood with python sandboxes for quite some time now. im sure others do it too
Anonymous No.105952571 [Report]
>>105945586
Anonymous No.105952839 [Report]
>>105952548
It gets it right:
-96.2% of the time in row 10 column 3
-84.6% of the time in row 3 column 10

So yeah, not always above 90%, fair enough, I only read the first value. My statement is still closer to the truth than "it fails this every single time", however. If I had to guess, I would also assume that multiplying by 100 has a way higher success rate than multiplying by an arbitrary 3-digit number.
Anonymous No.105952885 [Report]
>>105951427
Sounds like you're just really into cuck shit
Anonymous No.105952898 [Report]
>>105945105 (OP)
give it a tool call. get with the times
Anonymous No.105953200 [Report]
>>105945118
Something is off.
These models can actually reason and even argue.
That should not happen with a mere text generator.