Thread 106181759 - /g/ [Archived: 30 hours ago]

Anonymous
8/7/2025, 10:35:03 PM No.106181759
GxxExJiXkAAACJB
Why have AI models become so much worse at creative writing?
Replies: >>106181883 >>106181926 >>106182345 >>106184211 >>106184267
Anonymous
8/7/2025, 10:37:05 PM No.106181789
All that synthetic garbage they have been overfitting the models with.
Or they have taken some copyrighted material out of their black-box training data sets.
Replies: >>106182228
Anonymous
8/7/2025, 10:40:45 PM No.106181834
only like 1% of the population, tops, can actually analyze writing at all; the rest are incapable of telling good writing from bad

so it would be more accurate to say that AI has gotten much better at writing for people who like bad writing
Anonymous
8/7/2025, 10:44:47 PM No.106181883
>>106181759 (OP)
GPT-5 is the best, sorry if your taste is shit.
Anonymous
8/7/2025, 10:47:45 PM No.106181926
>>106181759 (OP)
I miss GPT 2 so much bros... That schizo mf at least had a soul
megaman.w00t
8/7/2025, 10:54:23 PM No.106182015
Screen Shot 2025-08-07 at 4.54.05 PM
AI models have always been shit at writing, just good enough to fool English professors. <;)
Anonymous
8/7/2025, 11:03:07 PM No.106182149
nobody cares, and you don't care either, you just want ERP, probably CSAM ERP.
Replies: >>106182215
Anonymous
8/7/2025, 11:03:29 PM No.106182157
GPT-4 talks about doing things in order to feel; GPT-5 talks about feeling intrinsically.
Replies: >>106182216
Anonymous
8/7/2025, 11:08:05 PM No.106182215
>>106182149
csam implies a “c” is getting “a”ed. nothing you can do with an llm is csam
Anonymous
8/7/2025, 11:08:07 PM No.106182216
>>106182157
that's a good way to put it anon
in other words we're fucked and AGI is going to take over soon
two more weeks
Anonymous
8/7/2025, 11:08:38 PM No.106182228
>>106181789
I was watching a webinar (my company forces us to do AI slop) and they were talking about using synthetic data to train because there isn't enough real data.

Isn't that a pretty big fucking problem? Won't it overfit to whatever algorithm they decide to train the models on?
Replies: >>106182250 >>106182255 >>106182458 >>106182534 >>106182781
Anonymous
8/7/2025, 11:09:59 PM No.106182250
>>106182228
>there isn't enough real data.
more like (((they))) don't want to pony up what's fair for it.
Anonymous
8/7/2025, 11:10:17 PM No.106182255
Untitled
>>106182228
the problem is AI is ultimately trained by philistines and mechanical-turk raters, so it tends towards literal fulfillment of prompts rather than tasteful interpretation
Replies: >>106182345 >>106182459
Anonymous
8/7/2025, 11:16:07 PM No.106182345
>>106181759 (OP)
>>106182255
Interpolation. As AI content gets fed back into training, AI starts to interpolate itself, and so all AI output will eventually converge to the same thing, no matter what the prompt.
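you can watch it happen in a toy version: fit a Gaussian to a corpus, sample a new corpus from the fit, refit, repeat. minimal numpy sketch, all numbers made up:

import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(0.0, 1.0, size=50)  # gen 0: "human" data, mean 0 / std 1

for gen in range(1, 501):
    mu, sigma = corpus.mean(), corpus.std()  # "train" a model on the current corpus
    corpus = rng.normal(mu, sigma, size=50)  # next corpus is pure model output
    if gen % 100 == 0:
        print(f"gen {gen:3d}: std {corpus.std():.4f}")

the std drifts towards zero: each generation samples from its own estimate, and whatever it never happens to sample is gone for good.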
Anonymous
8/7/2025, 11:24:14 PM No.106182458
>>106182228
they claim they have fixed model collapse, but they are fucking lying. As long as there are hallucinations, model collapse will happen, and you cannot remove hallucinations from this technology; they just finagle around it
Replies: >>106182473
Anonymous
8/7/2025, 11:24:24 PM No.106182459
>>106182255
every person hyping AI at my company is either a 50 year old with no tech knowledge or a saar nobody can understand
Anonymous
8/7/2025, 11:25:31 PM No.106182473
>>106182458
what i don't understand is that humans are trained on FAR less data and yet we are still smarter than AI. so is the whole weights/backpropagation/gradient descent/etc concept fundamentally flawed?
Replies: >>106182526 >>106182840 >>106182996
Anonymous
8/7/2025, 11:26:44 PM No.106182488
They started to feed it more recent material
Anonymous
8/7/2025, 11:27:27 PM No.106182495
They circumcised them
Anonymous
8/7/2025, 11:29:04 PM No.106182526
>>106182473
is that a serious question, anon? absolutely it is. it's an inherently mechanistic way of thinking; it's still a boolean machine in a sense, while human thought can be much more free-flowing and free-associative. the machine has no reason to think of potatoes when talking about steel forging, but you as a person are fully capable of doing so, of splitting your attention, of thinking of separate unrelated things spontaneously. inherently, it's an insanely difficult problem to solve.
Replies: >>106182607
Anonymous
8/7/2025, 11:29:49 PM No.106182534
>>106182228
>Won't it overfit to whatever algorithm they decide to train the models on?
Already has. The "algorithms" they are training on are just other models. When you give a model human-generated text, it will spit out something with an AI style. When you give it text generated by other models, it stacks AI style on AI style. You can already see a lot of models converging onto a single "AI slop" style when they used to have distinct styles. It's a band-aid solution that is already falling apart, see the toy run below.
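same effect with a two-line language model: train a bigram table on a corpus, generate from it, retrain on the output, repeat. toy sketch, corpus made up:

import collections
import random

random.seed(0)
corpus = ("the quick brown fox jumps over the lazy dog while the small "
          "grey cat sleeps under the old oak tree near the quiet river").split()

def train(words):
    # bigram "model": for each word, the list of observed next words
    model = collections.defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, n=60):
    word = random.choice(list(model))
    out = [word]
    for _ in range(n - 1):
        nxt = model.get(word)
        word = random.choice(nxt) if nxt else random.choice(list(model))
        out.append(word)
    return out

words = corpus
for gen in range(1, 6):
    words = generate(train(words))  # each generation trains on the last one's output
    print(f"gen {gen}: {len(set(words))} distinct words")

the distinct-word count can only shrink: any word the model doesn't happen to emit is absent from the next training set, and the diversity never comes back.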
Replies: >>106182607
Anonymous
8/7/2025, 11:34:53 PM No.106182607
>>106182526
well humans (and every other organism) are just machines, so you'd think that if you understood the underlying concepts of how humans actually learn, you could emulate it in software.

my point is they seem WAY off base, and we're probably looking at another AI winter while they retool and try to find something that actually approaches how we learn.

or, and i don't know if this is worse, this shit keeps advancing while still fundamentally flawed, but fast enough that people go "fuck it, that's good enough" and we wipe ourselves out anyway
>>106182534
i'm just thinking back to my undergrad when i was training a CNN and one category was 5% of the training set while the other was 95%. i could replicate/generate training data from that 5% to push the split towards 50/50, but it was never as good as a naturally 50/50 data set.
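what i mean concretely, a minimal numpy sketch with made-up shapes (naive duplication of the minority class):

import numpy as np

rng = np.random.default_rng(0)
# hypothetical imbalanced set: 95 majority images, 5 minority images
X_maj = rng.normal(size=(95, 32, 32))
X_min = rng.normal(size=(5, 32, 32))

# resample the minority class with replacement until the split is ~50/50
idx = rng.integers(0, len(X_min), size=len(X_maj))
X_bal = np.concatenate([X_maj, X_min[idx]])
y_bal = np.concatenate([np.zeros(len(X_maj)), np.ones(len(X_maj))])
print(X_bal.shape, y_bal.mean())  # (190, 32, 32) 0.5

the labels are balanced now, but there are still only 5 distinct minority images, so the network just memorizes them. synthetic pretraining data has the same problem at scale: no new information, just copies.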
Replies: >>106182823
Anonymous
8/7/2025, 11:49:19 PM No.106182781
>>106182228
it's a huge problem they hope will just go away if they keep doing it long enough
Anonymous
8/7/2025, 11:52:21 PM No.106182823
>>106182607
>so you would think that if you understand the underlying concepts of how humans actually learn you could emulate it with software
The thing is, our model of the human brain and its inner workings is incomplete, including how we learn things. Even worse, individuals work differently.
What is understood is the cause and the effect; what happens between those two is a mystery
Anonymous
8/7/2025, 11:54:06 PM No.106182840
>>106182473
>what i don't understand is that humans are trained on FAR less data and yet we are still smarter than AI. so is the whole weights/backpropagation/gradient descent/etc concept fundamentally flawed?
You need to realize that nobody knows to this day how the human brain works. Imagine a massive legacy java codebase that somehow works but people kill themselves trying to maintain it; the human brain is worse. It's an electro-chemical machine, and the chemical part isn't even modeled in LLMs. The neural-network part is extremely fucking simplistic too. They are missing entire pieces of the puzzle, even if they MAYBE got one of the base mechanics down, because really we don't even know if they did.

The fucking hubris of those silicon valley motherfuckers thinking this is it.
Anonymous
8/7/2025, 11:58:28 PM No.106182886
One of the elephants in the room that LLM researchers think will just go away is that these models love to return to the generic.
A model cares about its training data more than it cares about the context, so there is an intrinsic bias pulling it back towards the average of its weights.
This is actually what hallucinations are, and the same thing that caused the strawberry problem. The RL fix just shifted where the generic bias sits, which is why asking how many Rs are in strawberry works now while slight rewordings of the question still trip it up. Toy version below.
Feels like this problem is only getting worse as the models get bigger.
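treat the model's answer as a blend of its training prior and what the context actually implies; when the prior dominates, you get the generic token back no matter what the prompt says. everything here is made up, just to show the shape of the bias:

import numpy as np

# "how many r's in strawberry?" -- the correct answer is three
tokens = ["two", "three", "four"]
prior = np.array([0.70, 0.20, 0.10])    # what the training data usually says
context = np.array([0.05, 0.90, 0.05])  # what this specific prompt implies

for lam in (0.9, 0.5, 0.1):  # lam = how hard the model leans on its prior
    p = lam * prior + (1 - lam) * context
    print(f"lam={lam}: answer -> {tokens[p.argmax()]}  {np.round(p, 3)}")

at lam=0.9 the generic "two" wins even though the context says otherwise; RL tuning effectively lowers lam for the questions it was trained on and leaves it high everywhere else.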
Anonymous
8/8/2025, 12:10:15 AM No.106182996
>>106182473
>do humans solve problems using matrix multiplication?
No.
Anonymous
8/8/2025, 2:09:13 AM No.106184211
>>106181759 (OP)
as well as synthetic data, the RLHF process of basically whacking the AI over the head with a crowbar until it submits to humans and follows instructions also makes it produce much less diverse output (since that's kind of the point), see the sketch below
also you don't prompt GPT-2 like that
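the diversity loss falls straight out of the KL-penalized objective RLHF uses. toy sketch with random numbers standing in for a real model, using the known closed-form optimum pi_rl(x) ∝ pi_ref(x) * exp(reward(x)/beta):

import numpy as np

def entropy(p):
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
pi_ref = rng.dirichlet(np.ones(1000))  # pretrained dist over 1000 continuations
reward = rng.normal(size=1000)         # RLHF reward for each continuation

print(f"base entropy: {entropy(pi_ref):.2f}")
for beta in (5.0, 1.0, 0.2):  # beta = KL penalty weight; small beta = heavy-handed RLHF
    logits = np.log(pi_ref) + reward / beta  # optimum of E[reward] - beta*KL
    pi_rl = np.exp(logits - logits.max())
    pi_rl /= pi_rl.sum()
    print(f"beta={beta}: entropy {entropy(pi_rl):.2f}")

push harder on the reward (smaller beta) and the entropy drops: the tuned model concentrates on a handful of reward-approved continuations, which is exactly the diversity you lose.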
Anonymous
8/8/2025, 2:14:51 AM No.106184267
>>106181759 (OP)
Aggressive stamping out of anything the devs dislike, which flattens everything else along the way, all in an attempt to make a text-completion tool work as a calculator