Thread 105945105

64 posts 34 images /g/

Anonymous 7/18/2025, 10:32:12 AM No.105945105 [Report] >>105945133 >>105945235 >>105945334 >>105945767 >>105945861 >>105947755 >>105947765 >>105948911 >>105949518 >>105949543 >>105952898

LLMs are a deadend

GXsps41W4AEAY0c.jpg md5: 4ddfeff6...

Trillions of dollars spent, trained on all the information that has ever existed, and they can't reliably multiply numbers together?

Anonymous 7/18/2025, 10:34:40 AM No.105945118 [Report] >>105945235 >>105945569 >>105947755 >>105952536 >>105953200

they predict the likelihood of a token appearing next
>oh heres a bunch of text where letter 4 came after 2 plus 2 so its given higher value than all other letters

they dont do math
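
To make "predict the likelihood of a token appearing next" concrete, here is a toy sketch. The candidate tokens and their scores are invented for illustration and do not come from any real model; the only real technique shown is softmax, which turns raw scores into probabilities.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for the token that follows "2 + 2 =" (made-up numbers).
toy_scores = {"4": 6.0, "5": 2.0, "22": 1.5, "fish": -3.0}

probs = softmax(toy_scores)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Nothing here does arithmetic: "4" comes out on top only because it was given the highest score.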

Anonymous 7/18/2025, 10:38:09 AM No.105945133 [Report] >>105945151 >>105945330 >>105945457 >>105945856

>>105945105 (OP)
give them a calculator like you would to a person and then try again
>NO NOT LIKE TH-AAACK

Anonymous 7/18/2025, 10:41:24 AM No.105945151 [Report] >>105945163 >>105945177 >>105945232 >>105945856

IMG_0834.jpg md5: 54100f95...

>>105945133
Humans don’t need a calculator for 10x20. Or at least, the people who understand the limits of LLMs don’t.

Anonymous 7/18/2025, 10:44:42 AM No.105945163 [Report] >>105945185 >>105945350 >>105952212

>>105945151
5510887582 x 22919720567438908647
hurry up little buddy

Anonymous 7/18/2025, 10:46:18 AM No.105945177 [Report]

>>105945151
>the people who understand the limits of LLMs don’t.
the people who understand the limits of LLMs know what tokenization is and how it affects math performance, kruger
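
A rough illustration of the tokenization point, assuming a made-up tokenizer that cuts digit strings into fixed chunks of three from the left. Real tokenizers differ from model to model, so `toy_tokenize` is purely hypothetical; it only shows that the pieces a model sees need not line up with place value.

```python
def toy_tokenize(digits, chunk=3):
    """Split a digit string into left-to-right chunks of up to `chunk` digits.

    Hypothetical stand-in for a real tokenizer, for illustration only.
    """
    return [digits[i:i + chunk] for i in range(0, len(digits), chunk)]

a = toy_tokenize("5510887582")
b = toy_tokenize("22919720567438908647")
print(a)
print(b)
```

With this scheme the last chunk of each number has a different length, so the ones digit sits in a different position inside its token each time.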

Anonymous 7/18/2025, 10:48:54 AM No.105945185 [Report] >>105945194 >>105945208

>>105945163
The meltdown of a porn addict who is already feeling less and less each time he uses ai.

Anonymous 7/18/2025, 10:49:51 AM No.105945194 [Report] >>105945230

>>105945185
Are you projecting again?

Anonymous 7/18/2025, 10:52:32 AM No.105945208 [Report]

>>105945185
i consider this response a concession

Anonymous 7/18/2025, 10:56:42 AM No.105945230 [Report] >>105945244

IMG_0438.gif md5: 1ca2454a...

>>105945194
Got any comebacks that don’t come from an elementary school playground? lmao AI brainlets are something else

Anonymous 7/18/2025, 10:57:00 AM No.105945232 [Report] >>105945283

>>105945151
What about the people that are too retarded to understand the image in the OP and realize llms can do arithmetic better than a human now? This has nothing to do with math btw, it's true that they suck at math.

Anonymous 7/18/2025, 10:57:41 AM No.105945235 [Report] >>105950928

1694489825334770.jpg md5: 4706731d...

>>105945105 (OP)
>>105945118
My dick didn't grew longer after installing mistral !
Fuck this, I'm moving to Thailand.

Anonymous 7/18/2025, 10:58:53 AM No.105945244 [Report] >>105950844

>>105945230
Most humans can't even do 7x4
6048337 x 6258
Maybe some mathematicians or tricksters in the 1800s but not even those people care nowadays.
A system that has a bunch of little useful programs accessible via natural language to everyone is the death of the midwit LARPer.

Anonymous 7/18/2025, 11:05:29 AM No.105945283 [Report] >>105951461

IMG_0569.jpg md5: 61da0bf0...

>>105945232
Ahaha sorry I sometimes forget this board is flooded with subhuman jeets who can’t do 5x5. I can see why some of you might think the AI is very lifelike, but to actual human beings the times tables are no big deal.

Anonymous 7/18/2025, 11:14:01 AM No.105945330 [Report] >>105945372 >>105945856

>>105945133
>Alright now just supervise them on how to use every tool ever

Anonymous 7/18/2025, 11:14:41 AM No.105945334 [Report]

>>105945105 (OP)
Because it's not "intelligence" and will never become such.

Anonymous 7/18/2025, 11:17:55 AM No.105945350 [Report] >>105945391

1692790751503861.jpg md5: 66ec2ad6...

>>105945163
So like, what if you need to multiply 10 and 20 specifically?

Anonymous 7/18/2025, 11:21:34 AM No.105945372 [Report]

1.png md5: 816783e9...

>>105945330
>

Anonymous 7/18/2025, 11:24:02 AM No.105945391 [Report]

>>105945350
Then going by the OP it can do that 100% of the time for however many tests they did.

Anonymous 7/18/2025, 11:37:15 AM No.105945457 [Report] >>105945856 >>105949584

>>105945133
>give the computer a calculator

Anonymous 7/18/2025, 11:37:53 AM No.105945465 [Report] >>105945538

knife bird.jpg md5: 9f228c58...

OK FAGGOT TELL ME THE ANSWER TO 12345678901234567890*100
OR THIS BIRD WILL FUCKING STAB YOU IN THE EYE!

Anonymous 7/18/2025, 11:49:13 AM No.105945538 [Report] >>105945572

>>105945465
If you multiply by 100 you can just add two zeroes at the end of the number, anon-kun!

Anonymous 7/18/2025, 11:50:26 AM No.105945549 [Report]

>this talking unicorn does a very poor job when asked to direct a feature film

Anonymous 7/18/2025, 11:53:17 AM No.105945569 [Report] >>105945597

>>105945118
It's a language model, not a math model
If you actually wrote it out like twenty three plus seventy might have more luck

Anonymous 7/18/2025, 11:53:28 AM No.105945572 [Report] >>105945586 >>105945609

kisama crow.jpg md5: 5930de83...

>>105945538
OH YOU THINK YOU'RE FUCKING SMART HUH?
NOW DO 12345678901234567890*1000

Anonymous 7/18/2025, 11:55:43 AM No.105945586 [Report] >>105952571

1728181729646435.jpg md5: 2025fc1f...

>>105945572
I... I don't know.
Hordu appu! You can solve it by just adding three free zeroes at the end of the number!

Anonymous 7/18/2025, 11:57:02 AM No.105945597 [Report]

>>105945569
>If you actually wrote it out like twenty three plus seventy might have more luck
It still doesn't 'know' how to do math, and it has tokens for numbers, so you don't need to spell them out for it. If anything that might reduce accuracy, because there's probably way less data where math is done using written-out numbers instead of the symbols.

Anonymous 7/18/2025, 11:58:37 AM No.105945609 [Report]

images.jpg md5: e2d74698...

>>105945572
12345678901234567890000
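
The "just add zeroes" trick from the posts above, as a small sketch. The helper name `times_power_of_ten` is made up here; it assumes a non-negative integer and checks the string trick against ordinary multiplication.

```python
def times_power_of_ten(n, zeros):
    """Multiply a non-negative integer by 10**zeros by appending zeros to its decimal string."""
    return int(str(n) + "0" * zeros)

n = 12345678901234567890
print(times_power_of_ten(n, 2))
print(times_power_of_ten(n, 3))

# The string trick agrees with real multiplication.
assert times_power_of_ten(n, 3) == n * 1000
```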

Anonymous 7/18/2025, 12:27:05 PM No.105945767 [Report]

>>105945105 (OP)
who cares they write all the boilerplate code for me

Anonymous 7/18/2025, 12:50:10 PM No.105945856 [Report]

>>105945133
>>105945151
>>105945330
>>105945457
That's what they are doing right now with <tool_call>. They are training them to use existing tools such as calculators so they don't have to store that information inside the model, freeing up a ton of space for other stuff.
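
A minimal sketch of the tool-call idea. The thread does not show the actual format behind <tool_call>, and it varies between models, so the JSON shape, the `TOOLS` registry and `run_tool_call` below are all assumptions made up for illustration.

```python
import json
import operator

# Hypothetical tool registry; names and message shape are invented for this sketch.
TOOLS = {
    "multiply": operator.mul,
    "add": operator.add,
}

def run_tool_call(message):
    """Execute a JSON request such as {"tool": "multiply", "args": [2, 3]}."""
    request = json.loads(message)
    func = TOOLS[request["tool"]]  # raises KeyError for an unknown tool
    return func(*request["args"])

# Pretend the model emitted this string instead of guessing the digits itself.
model_output = '{"tool": "multiply", "args": [6048337, 6258]}'
result = run_tool_call(model_output)
print(result)
```

The model only has to produce the request text; the exact arithmetic happens in ordinary code outside it.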

Anonymous 7/18/2025, 12:51:55 PM No.105945861 [Report]

>>105945105 (OP)
(You) can?

Anonymous 7/18/2025, 6:00:19 PM No.105947755 [Report] >>105947863

>>105945105 (OP)
>>105945118
this is the dumbest possible use of a neural network, you're literally running a complex multi-dimensional math algorithm to simulate a retard and then asking it to do math
how about make it in pieces and have one piece determine that a maths problem is happening and parse that to fucking python and return the result
it's not that hard
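
Something like the "one piece detects the maths and hands it to python" setup could look like this. It is only a sketch under assumptions: the regex, the `route` function and the choice to support just integers with +, - and * are mine, not anything a real product is known to do. It uses the standard `ast` module rather than `eval` so arbitrary code in a prompt is not executed.

```python
import ast
import operator
import re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def evaluate(node):
    """Evaluate an AST limited to integer literals and +, -, *."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")

# Digits joined by +, -, * or a lowercase x (hypothetical detection rule).
MATH = re.compile(r"\d+(?:\s*[-+*x]\s*\d+)+")

def route(prompt):
    """Return an exact result if the prompt contains simple arithmetic, else None."""
    match = MATH.search(prompt)
    if match is None:
        return None
    expr = match.group(0).replace("x", "*")
    # Note: numbers written with leading zeros would make ast.parse raise SyntaxError.
    return evaluate(ast.parse(expr, mode="eval"))

print(route("OK TELL ME THE ANSWER TO 12345678901234567890*100"))
print(route("who cares they write all the boilerplate code for me"))
```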

Anonymous 7/18/2025, 6:01:20 PM No.105947765 [Report]

file.png md5: 43f7fd75...

>>105945105 (OP)

Anonymous 7/18/2025, 6:17:15 PM No.105947863 [Report]

>>105947755
Or you could just use your brain, like a non-retard.

Anonymous 7/18/2025, 8:08:42 PM No.105948911 [Report] >>105948994

>>105945105 (OP)
This is such an easy problem to fix. Just have the LLM recognize a math calculation question and then have it delegate that bit to a calculator service. Why in the holy fuck hasn't this been done yet? LLMs can Google search shit and even Google has a calculator service automatically answer basic calculation queries

Anonymous 7/18/2025, 8:17:54 PM No.105948994 [Report]

>>105948911
>Why in the holy fuck hasn't this been done yet?
It has. It doesn't stop people from trying to joggle pure models to improve.

Anonymous 7/18/2025, 9:18:55 PM No.105949518 [Report] >>105949531 >>105950552

multi-digit-multiplication-performance-by-oai-models-v0-uo5ze0hrm1je1.png md5: 20fe208e...

>>105945105 (OP)
How do you cope? o3-mini is not even close to first place on the benchmarks nowadays

Anonymous 7/18/2025, 9:21:03 PM No.105949531 [Report] >>105949545

>>105949518
The cope is the same.
>Why don't model weights build the correct algorithm and loop it.
Ignoring the fact that human brains don't work like this.

Anonymous 7/18/2025, 9:22:08 PM No.105949543 [Report]

>>105945105 (OP)
Ok, now show the benchmark of regular humans multiplying two random 20-digit numbers in their head.

Anonymous 7/18/2025, 9:22:16 PM No.105949545 [Report] >>105949611

>>105949531
They do if you add consciousness and the desire to follow the algorithm.
For true intelligence you need consciousness.

Anonymous 7/18/2025, 9:27:12 PM No.105949584 [Report]

>>105945457

I think LLMs have lost the plot if we are sincerely asking this question.

Anonymous 7/18/2025, 9:30:23 PM No.105949611 [Report] >>105949680

>>105949545
It does try to follow the algorithm, but the problem is it can't. It's not seeing the manual multiplication, it's just seeing something like "106,041,768\n+ 795,313,260\n+ 2,651,044,200\n+39,765,663,000". How many numbers can you multiply if you're only allowed to think with text?
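
For reference, the partial sums quoted above are what schoolbook long multiplication writes down. They are consistent with 13255221 x 3268, which is my inference from the numbers and not something the post states. A short sketch that produces the same rows:

```python
def partial_products(a, b):
    """Return the shifted rows of schoolbook long multiplication of a by b."""
    rows = []
    # Walk the digits of b from least to most significant.
    for place, digit in enumerate(reversed(str(b))):
        rows.append(a * int(digit) * 10 ** place)
    return rows

rows = partial_products(13255221, 3268)
for row in rows:
    print(f"{row:,}")
print(f"{sum(rows):,}")

# Adding the rows gives the same result as multiplying directly.
assert sum(rows) == 13255221 * 3268
```

Code gets to keep every row as an exact integer; a model writing this out as text has to reproduce each digit and each carry token by token.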

Anonymous 7/18/2025, 9:36:41 PM No.105949680 [Report] >>105950373

>>105949611
It does not try to follow any algorithm you are familiar with. The text it produces may appear to if you prompt it to, but the actual functioning doesn't.
An LLM doesn't think with text; it takes text as input to a neural network. The text prediction algorithm it follows when asked to multiply is likely very complicated and specific to each input. Hence why it doesn't work in general, and hence the discrepancy between adjacent boxes in the plot.

Anonymous 7/18/2025, 10:51:21 PM No.105950373 [Report] >>105950784

>>105949680
If they weren't "thinking" even in a more abstract sense, then how do you explain chain of "thought" models being so much smarter?

Anonymous 7/18/2025, 11:08:27 PM No.105950552 [Report]

>>105949518
>Look now my calculator is only wrong 40% of the time instead of 90%
Still a shit calculator that no reasonable person should be using. Now think of all the other ways it can be wrong that are far harder to catch than by double checking with a calculator. Failed tool

Anonymous 7/18/2025, 11:28:34 PM No.105950784 [Report]

benchmark.png md5: f7838af6...

>>105950373
>how do you explain chain of "thought" models being so much smarter?
Non-CoT models often outperform CoT models in "simple tasks".

This is why Zed and Claude Code default to Sonnet 4 instead of Sonnet 4 Thinking. Less token usage prevents context from exploding, and you can selectively enable CoT for complex sub-tasks within a prompt.

Anonymous 7/18/2025, 11:34:34 PM No.105950844 [Report]

>>105945244
There is actually a system to add, subtract and multiply large numbers in your head. I read a book on it when I was younger and learned the addition but never finished the book. All the nerds at these maths things use the same system.

Anonymous 7/18/2025, 11:42:46 PM No.105950928 [Report] >>105951427

>>105945235
why are burgers so obsessed with cuckshit?

Anonymous 7/19/2025, 12:44:04 AM No.105951427 [Report] >>105952885

1740424146450.jpg md5: 286dd07e...

>>105950928
It's a coping mechanism to the environment men find themselves in.

Modern dating is basically cuckoldry.
You will make vows of lifelong love and fidelity at the altar to a woman that has opened her legs to several men before you. She has sucked penises of, gotten anally defiled by, swallowed semen from, and even likely rimmed assholes of, other men that put much less effort into fucking her than you did into marrying her. All that so that she will later think of these men fondly while you try to please her.

Anonymous 7/19/2025, 12:47:30 AM No.105951461 [Report] >>105951753

>>105945283
I know this is bait, but the OP shows how long the numbers are. The element in the fifth row and fifth column corresponds to something like 12345×98760, not 5×5

Anonymous 7/19/2025, 1:32:05 AM No.105951753 [Report] >>105952160

IMG_0719.png md5: c0326320...

>>105951461
As someone else explained, that chart is showing it fails this every single time:
1234567890x100

Computing is the thing COMPUTERS are supposed to be best at, so it’s noteworthy that LLMs fail so badly. And it’s no wonder if you know the absolute basics about LLMs and how they work. But AI coomerbros want to have it both ways.

Anonymous 7/19/2025, 2:37:23 AM No.105952160 [Report] >>105952548

>>105951753
(10 digits)x(3 digits) is above 90%, actually

Anonymous 7/19/2025, 2:46:42 AM No.105952212 [Report]

>>105945163
>5510887582 x 22919720567438908647
1,263,272,981,903,155,902,903,879,654
chatgpt 4o
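
A pasted answer like this is easy to check, since Python integers are arbitrary precision. A quick sanity check first: a 10-digit number times a 20-digit number must have 29 or 30 digits, and the number above has 28. The sketch below compares the pasted value against the exact product; the variable names are just for illustration.

```python
a = 5510887582
b = 22919720567438908647
claimed = int("1,263,272,981,903,155,902,903,879,654".replace(",", ""))

exact = a * b  # exact, no floating point involved
print(f"{exact:,}")
print("digits:", len(str(exact)), "vs claimed:", len(str(claimed)))
print("match:", exact == claimed)
```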

Anonymous 7/19/2025, 3:06:40 AM No.105952370 [Report] >>105952501 >>105952550

ai can code
so why doesnt ai just realize it should code a solution

Anonymous 7/19/2025, 3:23:38 AM No.105952501 [Report]

>>105952370
Isn't giving AI the ability to code itself how the world ends?

Anonymous 7/19/2025, 3:25:58 AM No.105952521 [Report]

you've been able to give complex math problems to wolfram alpha in plain english for years
just stick a llm on top of that

Anonymous 7/19/2025, 3:28:01 AM No.105952536 [Report]

>>105945118
Yeah, this board is retarded as fuck. How does a tech board not understand what an LLM is

Anonymous 7/19/2025, 3:29:36 AM No.105952548 [Report] >>105952839

>>105952160
> reading is hard
maybe ask gpt to interpret the charts for you

Anonymous 7/19/2025, 3:30:03 AM No.105952550 [Report]

>>105952370
chatgpt has been doing this under the hood with python sandboxes for quite some time now. im sure others do it too

Anonymous 7/19/2025, 3:33:10 AM No.105952571 [Report]

1748451683919556.jpg md5: 3d2518b8...

>>105945586

Anonymous 7/19/2025, 4:20:58 AM No.105952839 [Report]

>>105952548
It gets it right:
- 96.2% of the time in row 10, column 3
- 84.6% of the time in row 3, column 10

So yeah not always above 90%, fair enough, I only read the first value. My statement is still closer to the truth than "it fails this every single time", however. If I had to guess, I would also assume that multiplying by 100 has a way higher success rate than an arbitrary 3 digit number.

Anonymous 7/19/2025, 4:31:53 AM No.105952885 [Report]

>>105951427
Sounds like you're just really into cuck shit

Anonymous 7/19/2025, 4:35:08 AM No.105952898 [Report]

>>105945105 (OP)
give it a tool call. get with the times

Anonymous 7/19/2025, 5:14:58 AM No.105953200 [Report]

>>105945118
Something is off.
These models can actually reason and even argue.
That should not happen with a mere text generator.