LLMs are a deadend - /g/ (#105945105) [Archived: 178 hours ago]

Anonymous
7/18/2025, 10:32:12 AM No.105945105
GXsps41W4AEAY0c
md5: 4ddfeff6e054d5fa0114800e1d70f18a
Trillions of dollars spent, trained on all the information that has ever existed, and they still can't reliably multiply numbers together?
Replies: >>105945133 >>105945235 >>105945334 >>105945767 >>105945861 >>105947755 >>105947765 >>105948911 >>105949518 >>105949543 >>105952898
Anonymous
7/18/2025, 10:34:40 AM No.105945118
they predict the likelihood of a token appearing next
>oh heres a bunch of text where letter 4 came after 2 plus 2 so its given higher value than all other letters

they dont do math
Replies: >>105945235 >>105945569 >>105947755 >>105952536 >>105953200
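To make the "likelihood of the next token" point concrete, here is a minimal sketch of a count-based next-token predictor. It is a toy lookup table, not how a transformer works internally; the corpus and the `predict_next` name are made up for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus (made up): the "model" only ever sees text, never arithmetic.
corpus = [
    "2 + 2 = 4",
    "2 + 2 = 4",
    "2 + 2 = 5",  # noisy training data
    "3 + 3 = 6",
]

# Count which token follows each context (everything before the last token).
counts = defaultdict(Counter)
for line in corpus:
    tokens = line.split()
    context, target = " ".join(tokens[:-1]), tokens[-1]
    counts[context][target] += 1

def predict_next(context):
    """Return the most frequent continuation, or None for unseen contexts."""
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

print(predict_next("2 + 2 ="))  # most frequent continuation in the corpus
print(predict_next("7 + 8 ="))  # unseen context: a lookup table cannot generalize
```

A real LLM generalizes far beyond a lookup table like this, but its output is still a probability distribution over tokens rather than the result of an arithmetic routine.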
Anonymous
7/18/2025, 10:38:09 AM No.105945133
>>105945105 (OP)
give them a calculator like you would to a person and then try again
>NO NOT LIKE TH-AAACK
Replies: >>105945151 >>105945330 >>105945457 >>105945856
Anonymous
7/18/2025, 10:41:24 AM No.105945151
IMG_0834
md5: 54100f956da143e5ee127557f60623ca
>>105945133
Humans don't need a calculator for 10x20. Or at least, the people who understand the limits of LLMs don't.
Replies: >>105945163 >>105945177 >>105945232 >>105945856
Anonymous
7/18/2025, 10:44:42 AM No.105945163
>>105945151
5510887582 x 22919720567438908647
hurry up little buddy
Replies: >>105945185 >>105945350 >>105952212
Anonymous
7/18/2025, 10:46:18 AM No.105945177
>>105945151
>the people who understand the limits of LLMs don't.
the people who understand the limits of LLMs know what tokenization is and how it affects math performance, kruger
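As a rough illustration of why tokenization matters for arithmetic: many tokenizers split long digit strings into multi-digit chunks, so the model never sees neatly aligned ones, tens, and hundreds columns. The sketch below uses a made-up left-to-right 3-digit chunking rule, which is an assumption for illustration only and not the behavior of any specific tokenizer.

```python
def chunk_digits(number, size=3):
    """Split a digit string left to right into chunks of at most `size` digits.

    Hypothetical chunking rule for illustration; real tokenizers differ.
    """
    return [number[i:i + size] for i in range(0, len(number), size)]

a = "5510887582"
b = "22919720567438908647"

print(chunk_digits(a))  # ['551', '088', '758', '2']
print(chunk_digits(b))  # ['229', '197', '205', '674', '389', '086', '47']
```

Under this rule the last chunk of `a` holds only the ones digit, while the last chunk of `b` holds tens and ones, so the same place value lands in different positions within a token from one number to the next.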
Anonymous
7/18/2025, 10:48:54 AM No.105945185
>>105945163
The meltdown of a porn addict who is already feeling less and less each time he uses ai.
Replies: >>105945194 >>105945208
Anonymous
7/18/2025, 10:49:51 AM No.105945194
>>105945185
Are you projecting again?
Replies: >>105945230
Anonymous
7/18/2025, 10:52:32 AM No.105945208
>>105945185
i consider this response a concession
Anonymous
7/18/2025, 10:56:42 AM No.105945230
IMG_0438
md5: 1ca2454a0ef44ec3817ea8a48d4c81f3
>>105945194
Got any comebacks that don't come from an elementary school playground? lmao AI brainlets are something else
Replies: >>105945244
Anonymous
7/18/2025, 10:57:00 AM No.105945232
>>105945151
What about the people that are too retarded to understand the image in the OP and realize llms can do arithmetic better than a human now? This has nothing to do with math btw, it's true that they suck at math.
Replies: >>105945283
Anonymous
7/18/2025, 10:57:41 AM No.105945235
1694489825334770
md5: 4706731d00b418ba58ae5cd30ad1b1d0
>>105945105 (OP)
>>105945118
My dick didn't grow longer after installing mistral!
Fuck this, I'm moving to Thailand.
Replies: >>105950928
Anonymous
7/18/2025, 10:58:53 AM No.105945244
>>105945230
Most humans can't even do 7x4
6048337 x 6258
Maybe some mathematicians or tricksters in the 1800s but not even those people care nowadays.
A system that has a bunch of little useful programs accessible via natural language to everyone is the death of the midwit LARPer.
Replies: >>105950844
Anonymous
7/18/2025, 11:05:29 AM No.105945283
IMG_0569
md5: 61da0bf05a87ff1b98f83408237e4c76
>>105945232
Ahaha sorry, I sometimes forget this board is flooded with people who can't do 5x5. I can see why some of you might think the AI is very lifelike, but to actual human beings the times tables are no big deal.
Replies: >>105951461
Anonymous
7/18/2025, 11:14:01 AM No.105945330
>>105945133
>Alright now just supervise them on how to use every tool ever
Replies: >>105945372 >>105945856
Anonymous
7/18/2025, 11:14:41 AM No.105945334
>>105945105 (OP)
Because it's not "intelligence" and will never become such.
Anonymous
7/18/2025, 11:17:55 AM No.105945350
1692790751503861
md5: 66ec2ad6fa159114ac264cf381dba5e0
>>105945163
So like, what if you need to multiply 10 and 20 specifically?
Replies: >>105945391
Anonymous
7/18/2025, 11:21:34 AM No.105945372
1
md5: 816783e9c266d018c6ccc0b9bc963971
>>105945330
>
Anonymous
7/18/2025, 11:24:02 AM No.105945391
>>105945350
Then going by the OP it can do that 100% of the time for however many tests they did.
Anonymous
7/18/2025, 11:37:15 AM No.105945457
>>105945133
>give the computer a calculator
Replies: >>105945856 >>105949584
Anonymous
7/18/2025, 11:37:53 AM No.105945465
knife bird
md5: 9f228c58ce3b287402aa6186cf76a3ed
OK FAGGOT TELL ME THE ANSWER TO 12345678901234567890*100
OR THIS BIRD WILL FUCKING STAB YOU IN THE EYE!
Replies: >>105945538
Anonymous
7/18/2025, 11:49:13 AM No.105945538
>>105945465
If you multiply by 100 you can just add two zeroes at the end of the number, anon-kun!
Replies: >>105945572
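The shortcut is easy to check directly. A minimal sketch, treating the number as a string of digits and comparing against Python's built-in arbitrary-precision integers (the function name is made up for illustration):

```python
def times_power_of_ten(digits, zeros):
    """Multiply a non-negative decimal integer (given as a string) by 10**zeros."""
    if digits.strip("0") == "":
        return "0"  # 0 * 10**k is still 0, not "000"
    return digits + "0" * zeros

n = "12345678901234567890"
print(times_power_of_ten(n, 2))  # multiply by 100
print(times_power_of_ten(n, 3))  # multiply by 1000

# Cross-check against exact integer arithmetic.
assert int(times_power_of_ten(n, 2)) == int(n) * 100
```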
Anonymous
7/18/2025, 11:50:26 AM No.105945549
>this talking unicorn does a very poor job when asked to direct a feature film
Anonymous
7/18/2025, 11:53:17 AM No.105945569
>>105945118
It's a language model, not a math model
If you actually wrote it out like twenty three plus seventy you might have more luck
Replies: >>105945597
Anonymous
7/18/2025, 11:53:28 AM No.105945572
kisama crow
md5: 5930de83b785c6515e05a2417ef46500
>>105945538
OH YOU THINK YOU'RE FUCKING SMART HUH?
NOW DO 12345678901234567890*1000
Replies: >>105945586 >>105945609
Anonymous
7/18/2025, 11:55:43 AM No.105945586
1728181729646435
md5: 2025fc1fb78fd5158af3309c939d0bf2
>>105945572
I... I don't know.
Hordu appu! You can solve it by just adding three free zeroes at the end of the number!
Replies: >>105952571
Anonymous
7/18/2025, 11:57:02 AM No.105945597
>>105945569
>If you actually wrote it out like twenty three plus seventy might have more luck
It still doesn't 'know' how to do math, and it has tokens for numbers, so you don't need to spell them out for it. If anything that might reduce accuracy because there's probably way less data where math is done using written numbers instead of the symbols.
Anonymous
7/18/2025, 11:58:37 AM No.105945609
images
md5: e2d74698f9e10a6b01d7d4f5d1d731f1
>>105945572
12345678901234567890000
Anonymous
7/18/2025, 12:27:05 PM No.105945767
>>105945105 (OP)
who cares they write all the boilerplate code for me
Anonymous
7/18/2025, 12:50:10 PM No.105945856
>>105945133
>>105945151
>>105945330
>>105945457
That's what they are doing right now with <tool_call>. They are training them to use existing tools such as calculators so they don't have to store that information inside the model, freeing up a ton of space for other stuff.
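A minimal sketch of the idea: the model emits a structured request inside `<tool_call>` tags, and the host program runs the tool and feeds the result back. The JSON payload shape and the `multiply` tool name below are assumptions for illustration; actual formats vary by model and framework.

```python
import json
import re

# Hypothetical tool registry; the tool name and argument names are made up.
TOOLS = {
    "multiply": lambda args: args["a"] * args["b"],
}

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def run_tool_calls(model_output):
    """Find every <tool_call> block in the text and execute the named tool."""
    results = []
    for payload in TOOL_CALL_RE.findall(model_output):
        call = json.loads(payload)
        tool = TOOLS[call["name"]]
        results.append(tool(call["arguments"]))
    return results

# Simulated model output (not from a real model).
output = (
    "Let me compute that. "
    '<tool_call>{"name": "multiply", '
    '"arguments": {"a": 5510887582, "b": 22919720567438908647}}</tool_call>'
)
print(run_tool_calls(output))
```

The model only has to produce the right operands as text; the multiplication itself is done by ordinary exact integer arithmetic on the host.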
Anonymous
7/18/2025, 12:51:55 PM No.105945861
>>105945105 (OP)
(You) can?
Anonymous
7/18/2025, 6:00:19 PM No.105947755
>>105945105 (OP)
>>105945118
this is the dumbest possible use of a neural network, you're literally running a complex multi-dimensional math algorithm to simulate a retard and then asking it to do math
how about make it in pieces and have one piece determine that a maths problem is happening and parse that to fucking python and return the result
it's not that hard
Replies: >>105947863
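A rough sketch of that pipeline, assuming the "piece that detects a maths problem" is just a regular expression and the evaluation walks Python's `ast` instead of calling `eval`. The function names are hypothetical, and real routing would need to handle far messier input.

```python
import ast
import operator
import re

# Only plain integer arithmetic is allowed.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

def safe_eval(expr):
    """Evaluate +, -, * over integer literals without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

# Crude detector: digits, an operator (x is treated as *), more digits.
MATH_RE = re.compile(r"\d+(?:\s*[-+*x]\s*\d+)+")

def answer(prompt):
    match = MATH_RE.search(prompt)
    if match is None:
        return None  # not arithmetic: hand off to the language model instead
    return safe_eval(match.group(0).replace("x", "*"))

print(answer("what is 6048337 x 6258?"))
print(answer("write me some boilerplate"))
```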
Anonymous
7/18/2025, 6:01:20 PM No.105947765
file
md5: 43f7fd75034c8df3a6ec9875b3eaadee
>>105945105 (OP)
Anonymous
7/18/2025, 6:17:15 PM No.105947863
>>105947755
Or you could just use your brain, like a non-retard.
Anonymous
7/18/2025, 8:08:42 PM No.105948911
>>105945105 (OP)
This is such an easy problem to fix. Just have the LLM recognize a math calculation question and then have it delegate that bit to a calculator service. Why in the holy fuck hasn't this been done yet? LLMs can Google search shit and even Google has a calculator service automatically answer basic calculation queries
Replies: >>105948994
Anonymous
7/18/2025, 8:17:54 PM No.105948994
>>105948911
>Why in the holy fuck hasn't this been done yet?
It has. It doesn't stop people from trying to joggle pure models to improve.
Anonymous
7/18/2025, 9:18:55 PM No.105949518
multi-digit-multiplication-performance-by-oai-models-v0-uo5ze0hrm1je1
>>105945105 (OP)
How do you cope? o3-mini is not even close to first place on the benchmarks nowadays
Replies: >>105949531 >>105950552
Anonymous
7/18/2025, 9:21:03 PM No.105949531
>>105949518
The cope is the same.
>Why don't model weights build the correct algorithm and loop it.
Ignoring the fact that human brains don't work like this.
Replies: >>105949545
Anonymous
7/18/2025, 9:22:08 PM No.105949543
>>105945105 (OP)
Ok, now show the benchmark of regular humans multiplying two random 20-digit numbers in their head.
Anonymous
7/18/2025, 9:22:16 PM No.105949545
>>105949531
They do if you add consciousness and the desire to follow the algorithm.
For true intelligence you need consciousness.
Replies: >>105949611
Anonymous
7/18/2025, 9:27:12 PM No.105949584
>>105945457

I think LLMs have lost the plot if we are sincerely asking this question.
Anonymous
7/18/2025, 9:30:23 PM No.105949611
>>105949545
It does try to follow the algorithm, but the problem is it can't. It's not seeing the manual multiplication, it's just seeing something like "106,041,768\n+ 795,313,260\n+ 2,651,044,200\n+39,765,663,000". How many numbers can you multiply if you're only allowed to think with text?
Replies: >>105949680
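For reference, the partial sums quoted above are what schoolbook long multiplication produces for 13255221 x 3268: each line is one digit of 3268 times 13255221, shifted by its place value. A minimal sketch of that procedure (the function name is made up for illustration):

```python
def partial_products(a, b):
    """Schoolbook long multiplication: one shifted partial product per digit of b."""
    parts = []
    for place, digit in enumerate(reversed(str(b))):
        parts.append(a * int(digit) * 10 ** place)
    return parts

parts = partial_products(13255221, 3268)
for p in parts:
    print(f"{p:,}")
print(f"{sum(parts):,}")
```

Every step here is an exact integer operation with carries handled by the machine. A model writing out the same lines as text has to get every digit of every line right by prediction.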
Anonymous
7/18/2025, 9:36:41 PM No.105949680
>>105949611
It does not try to follow any algorithm you are familiar with. The text it produces may appear to if you prompt it to, but the actual functioning doesn't.
An LLM doesn't think with text; it takes text as input to a neural network. The text prediction algorithm it follows when asked to multiply is likely very complicated and specific to each input. Hence why it doesn't work in general, and why there is a discrepancy between adjacent boxes in the plot.
Replies: >>105950373
Anonymous
7/18/2025, 10:51:21 PM No.105950373
>>105949680
If they weren't "thinking" even in a more abstract sense, then how do you explain chain of "thought" models being so much smarter?
Replies: >>105950784
Anonymous
7/18/2025, 11:08:27 PM No.105950552
>>105949518
>Look now my calculator is only wrong 40% of the time instead of 90%
Still a shit calculator that no reasonable person should be using. Now think of all the other ways it can be wrong that are far harder to catch than by double checking with a calculator. Failed tool.
Anonymous
7/18/2025, 11:28:34 PM No.105950784
benchmark
md5: f7838af6a5ed7effa0cc07c9456ab3d8
>>105950373
>how do you explain chain of "thought" models being so much smarter?
Non-CoT models often outperform CoT models in "simple tasks".

This is why Zed and Claude Code default to Sonnet 4 instead of Sonnet 4 Thinking. Less token usage prevents context from exploding, and you can selectively enable CoT for complex sub-tasks within a prompt.
Anonymous
7/18/2025, 11:34:34 PM No.105950844
>>105945244
There is actually a system to add, subtract and multiply large numbers in your head. I read a book on it when I was younger and learned the addition but never finished the book. All the nerds at these maths things use the same system.
Anonymous
7/18/2025, 11:42:46 PM No.105950928
>>105945235
why are burgers so obsessed with cuckshit?
Replies: >>105951427
Anonymous
7/19/2025, 12:44:04 AM No.105951427
1740424146450
md5: 286dd07eb009e656dd177e71dba1a098
>>105950928
It's a coping mechanism to the environment men find themselves in.

Modern dating is basically cuckoldry.
You will make vows of lifelong love and fidelity at the altar to a woman that has opened her legs to several men before you. She has sucked penises of, gotten anally defiled by, swallowed semen from, and even likely rimmed assholes of, other men that put much less effort into fucking her than you did into marrying her. All that so that she will later think of these men fondly while you try to please her.
Replies: >>105952885
Anonymous
7/19/2025, 12:47:30 AM No.105951461
>>105945283
I know this is bait, but the OP shows how long the numbers are. The element in the fifth row and fifth column corresponds to something like 12345×98760, not 5×5
Replies: >>105951753
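A sketch of how such a grid might be produced, assuming each cell (r, c) is scored on random r-digit by c-digit products. The `multiply_fn` argument stands in for whatever system is being tested; the trial count and helper names are made up, and the chart's actual methodology is not given in the thread.

```python
import random

def random_operand(n_digits, rng):
    """Uniform random integer with exactly n_digits decimal digits."""
    low = 10 ** (n_digits - 1)
    return rng.randrange(low, 10 ** n_digits)

def cell_accuracy(multiply_fn, rows, cols, trials=50, seed=0):
    """Fraction of correct answers for rows-digit x cols-digit products."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        a = random_operand(rows, rng)
        b = random_operand(cols, rng)
        if multiply_fn(a, b) == a * b:
            correct += 1
    return correct / trials

# Sanity check with an exact multiplier: every cell should score 1.0.
print(cell_accuracy(lambda a, b: a * b, rows=5, cols=5))
```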
Anonymous
7/19/2025, 1:32:05 AM No.105951753
IMG_0719
md5: c0326320e3d584bf213398ff72d1ac65
>>105951461
As someone else explained, that chart is showing it fails this every single time:
1234567890x100

Computing is the thing COMPUTERS are supposed to be best at, so it's noteworthy that LLMs fail so badly. And it's no wonder if you know the absolute basics about LLMs and how they work. But AI coomerbros want to have it both ways.
Replies: >>105952160
Anonymous
7/19/2025, 2:37:23 AM No.105952160
>>105951753
(10 digits)x(3 digits) is above 90%, actually
Replies: >>105952548
Anonymous
7/19/2025, 2:46:42 AM No.105952212
>>105945163
>5510887582 x 22919720567438908647
1,263,272,981,903,155,902,903,879,654
chatgpt 4o
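The quoted answer can be checked without trusting any model. A 10-digit number times a 20-digit number must have 29 or 30 digits, and the quoted result has only 28, so it cannot be correct. A minimal sketch using Python's built-in arbitrary-precision integers:

```python
a = 5510887582
b = 22919720567438908647
claimed = int("1,263,272,981,903,155,902,903,879,654".replace(",", ""))

exact = a * b  # Python ints are arbitrary precision, so this is exact

print(f"{exact:,}")
print(len(str(exact)), "digits in the exact product")
print(len(str(claimed)), "digits in the quoted answer")
print("match" if claimed == exact else "mismatch")
```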
Anonymous
7/19/2025, 3:06:40 AM No.105952370
ai can code
so why doesnt ai just realize it should code a solution
Replies: >>105952501 >>105952550
Anonymous
7/19/2025, 3:23:38 AM No.105952501
>>105952370
Isn't giving AI the ability to code itself how the world ends?
Anonymous
7/19/2025, 3:25:58 AM No.105952521
you've been able to give complex math problems to wolfram alpha in plain english for years
just stick a llm on top of that
Anonymous
7/19/2025, 3:28:01 AM No.105952536
>>105945118
Yeah, this board is retarded as fuck. How does a tech board not understand what an LLM is
Anonymous
7/19/2025, 3:29:36 AM No.105952548
>>105952160
> reading is hard
maybe ask gpt to interpret the charts for you
Replies: >>105952839
Anonymous
7/19/2025, 3:30:03 AM No.105952550
>>105952370
chatgpt has been doing this under the hood with python sandboxes for quite some time now. im sure others do it too
Anonymous
7/19/2025, 3:33:10 AM No.105952571
1748451683919556
md5: 3d2518b84e875aa5913412bc80940364
>>105945586
Anonymous
7/19/2025, 4:20:58 AM No.105952839
>>105952548
It gets it right:
-96.2% of the times in row 10 column 3
-84.6% of the times in row 3 column 10

So yeah not always above 90%, fair enough, I only read the first value. My statement is still closer to the truth than "it fails this every single time", however. If I had to guess, I would also assume that multiplying by 100 has a way higher success rate than an arbitrary 3 digit number.
Anonymous
7/19/2025, 4:31:53 AM No.105952885
>>105951427
Sounds like you're just really into cuck shit
Anonymous
7/19/2025, 4:35:08 AM No.105952898
>>105945105 (OP)
give it a tool call. get with the times
Anonymous
7/19/2025, 5:14:58 AM No.105953200
>>105945118
Something is off.
These models can actually reason and even argue.
That should not happen with a mere text generator.