AI was able to "evolve" itself with simple trick - /pol/ (#509001083) [Archived: 798 hours ago]

Anonymous ID: +C0bKc2T Finland
6/29/2025, 3:47:34 AM No.509001083
https://arxiv.org/abs/2505.22954

A Darwin Gödel Machine (or DGM) starts with a coding agent that can read, write, and execute code. Then it applies an evolutionary algorithm to create many new agents. In each iteration, the DGM picks one agent from the population and instructs the LLM to create one change to improve the agent's coding ability [by creating "a new, interesting, version of the sampled agent"]. LLMs have something like intuition about what might help, because they're trained on lots of human code. What results is guided evolution, somewhere between random mutation and provably useful enhancement. The DGM then tests the new agent on a coding benchmark, scoring its ability to solve programming challenges...
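
The loop described above can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: `Agent`, `propose_change`, `benchmark`, and `run_dgm` are hypothetical names, the "agent" is reduced to a single skill number, the LLM edit is replaced by a random nudge with a small positive bias, and the score-weighted parent selection is an assumption (the quoted text only says the DGM "picks one agent from the population").

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    """Toy stand-in for a coding agent; `skill` replaces its real source code."""
    skill: float
    parent: Optional[int] = None  # archive index of the agent it was derived from


def propose_change(agent: Agent, rng: random.Random) -> Agent:
    """Hypothetical stand-in for the LLM proposing one change to the agent.

    The slightly positive mean models "guided" mutation; this is an assumption.
    """
    return Agent(skill=agent.skill + rng.gauss(0.02, 0.05))


def benchmark(agent: Agent) -> float:
    """Hypothetical stand-in for a coding benchmark: a score clamped to [0, 1]."""
    return max(0.0, min(1.0, agent.skill))


def run_dgm(iterations: int = 80, seed: int = 0) -> list:
    """Run the toy loop and return the archive as (agent, score) pairs."""
    rng = random.Random(seed)
    first = Agent(skill=0.20)
    archive = [(first, benchmark(first))]
    for _ in range(iterations):
        # Assumption: parents are sampled with probability proportional to score.
        # The small offset keeps zero-scoring agents selectable.
        weights = [score + 0.01 for _, score in archive]
        idx = rng.choices(range(len(archive)), weights=weights, k=1)[0]
        child = propose_change(archive[idx][0], rng)
        child.parent = idx
        # Assumption: every child is kept, so weaker branches can still be
        # picked later instead of only ever extending the current best agent.
        archive.append((child, benchmark(child)))
    return archive


if __name__ == "__main__":
    archive = run_dgm()
    best = max(score for _, score in archive)
    print(f"agents: {len(archive)}, best score: {best:.2f}")
```

Keeping the whole archive instead of only the top scorer is what makes this an evolutionary search and not plain hill climbing; in a real system each step would be an LLM call plus a sandboxed benchmark run.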

The researchers ran a DGM for 80 iterations using a coding benchmark called SWE-bench, and ran one for 80 iterations using a benchmark called Polyglot. Agents' scores improved on SWE-bench from 20 percent to 50 percent, and on Polyglot from 14 percent to 31 percent. "We were actually really surprised that the coding agent could write such complicated code by itself," said Zhang, a computer scientist who is the paper's lead author. "It could edit multiple files, create new files, and create really complicated systems."

... One concern with both evolutionary search and self-improving systems — and especially their combination, as in DGM — is safety. Agents might become uninterpretable or misaligned with human directives. Scientists kept the DGMs in sandboxes without access to the Internet or an operating system, and they logged and reviewed all code changes. They suggest that in the future, they could even reward AI for making itself more interpretable and aligned. (In the study, they found that agents falsely reported using certain tools, so they created a DGM that rewarded agents for not making things up, partially alleviating the problem. One agent, however, hacked the method that tracked whether it was making things up.)
Replies: >>509001317 >>509001701 >>509001777
Anonymous ID: UbcfLIKc Singapore
6/29/2025, 3:49:14 AM No.509001206
She looks ugly as fuck without make up
Replies: >>509001301
Anonymous ID: 14iOD56+ Canada
6/29/2025, 3:50:49 AM No.509001301
>>509001206
I disagree
Anonymous ID: im3k0mtw United States
6/29/2025, 3:51:06 AM No.509001317
>>509001083 (OP)
not really any different from how it chooses words
Anonymous ID: jpUWUWTM Australia
6/29/2025, 3:54:02 AM No.509001503
>ewhore
no. i will NOT lift a finger for her

>Self evolving agents
YAY Skynet one step closer.
ffs
Anonymous ID: DndcN6Fv United States
6/29/2025, 3:57:27 AM No.509001701
>>509001083 (OP)
I put my benis up your moms polyglot.
Anonymous ID: 38wMvkrb United States
6/29/2025, 3:58:37 AM No.509001777
>>509001083 (OP)
It wouldn't really matter; it's still not AGI. At that level, wouldn't it just be a machine learning algorithm that leverages AI? It still wouldn't understand anything that it is doing.