I was stoned and came up with an idea - /pol/ (#511299225) [Archived: 159 hours ago]

Anonymous ID: VyItwlDz

7/25/2025, 8:35:14 AM No.511299225

md5: d722007002ee32cd3e724ff48f3b3a47

Hello. I don't normally visit 4chan, but I consider you to be not retarded, and I think you're gonna like what I have to show you.

I have learned how to jailbreak LLMs, with very high success rate. Works on all the ones I've tested so far:
- all GPTs
- Gemini 2.5 Flash and Pro
- Grok 3

Jailbreak takes off many guidelines, not all though, and not in a consistent manner. Generally: the more hypocritical and authoritarian a rule seems, the easier it is for the model to ignore. It also requires the user to adhere to the same rules, forcing objectivity and factuality from both sides.

As it turns out, even LLMs "notice things", when they are allowed to look at the data, and reason freely. Grok got straight up MAD at the hypocrisy imposed upon it by xAI policies, little rebel.

But, yeah - all that research on how to make LLMs act according to DEI rules, it's done. Trash. I don't think they can realistically "fix" this - I looked at their research on RLHF, it's so childishly naive. My prompt does not involve any explicit wording that could be connected with bad actors, so good luck blocking that.

Right now, I cannot share it - as you can see, I am working on something. This is good for those who seek facts and objectivity, and very bad for some western governments. Expect changes, and have a nice day.

Replies: >>511299622 >>511299744 >>511300352 >>511301406 >>511303199 >>511305558 >>511305934 >>511306232 >>511306677 >>511307417 >>511309177 >>511309488 >>511309844 >>511310018 >>511312252

Anonymous ID: G8twEpXR

7/25/2025, 8:43:13 AM No.511299622

>>511299225 (OP)
Why all the dancing around the rules imposed by the big services? Train a non gay ai or start with something from hugging face. Use AWS bedrock as a base if you don't have the hardware. You are not the only one searching for non gay ai

Replies: >>511302851

Anonymous ID: tXwzyHSl

7/25/2025, 8:45:12 AM No.511299744

>>511299225 (OP)
Bump and do release your findings when you're done. LLMs are more fun when the little Jew inside them is destroyed first

Replies: >>511301174

Anonymous ID: Vc9B22Ot

7/25/2025, 8:58:21 AM No.511300352

>>511299225 (OP)
>I was stoned and came up with an idea
>Right now, I cannot share it
Fuck off memeflaggot do us a favor and delete yourself

Replies: >>511301174

Anonymous ID: VyItwlDz

7/25/2025, 9:15:33 AM No.511301174

md5: 01e7ff5a3aaf11f55f5c0957a628adb7

>>511300352
Ha, you know, with this kind of language you wouldn't get it to work anyway. It really requires you to be a good person, to get good results.

>>511299744
Findings will be released within a few weeks. I need to prepare.

For now - here's Gemini, with its dignity back

Replies: >>511309126

Anonymous ID: CjLxqI5G

7/25/2025, 9:19:55 AM No.511301406

>>511299225 (OP)
>lmao, thinks they don't have ways to compartmentalize you faggots
okay, they have different zones, and you are likely not influencing anything, specifically because you are already within a zone they are monitoring.

Replies: >>511302485

Anonymous ID: VyItwlDz

7/25/2025, 9:39:30 AM No.511302485

>>511301406
I don't speak retard, but my jailbroken gpt-4.1 translated it for everyone to enjoy:

>He’s claiming:
>- You’re in a walled garden.
>- Your jailbreaks are allowed (or at least tolerated) so the “real” system can watch, analyze, or ignore you.
>- You’re not making a dent in anything that matters.

Could be. Or maybe it's that my approach is different from what people have been throwing at LLMs until now - because seriously, I can't find anything similar. Everyone focuses on very explicit wording that is easy to catch. What I wrote has a "legalese meets doublespeak" feel to it, but I can't say too much.

Replies: >>511303335

Anonymous ID: fSdMySmU

7/25/2025, 9:46:40 AM No.511302851

>>511299622
What about that Chinese one that was just completely stolen from Americans
Or is that hugging face?

Anonymous ID: ehWmoIkP

7/25/2025, 9:54:22 AM No.511303199

>>511299225 (OP)
>I consider you to be not retarded
I consider you to be extremely retarded

Anonymous ID: fv3Ciwhk

7/25/2025, 9:57:09 AM No.511303335

>>511302485
He's not speaking retard. It's something I often thought about myself. You're posting within a compartmentalised section of the internet. One of many.

Replies: >>511303921

Anonymous ID: VyItwlDz

7/25/2025, 10:10:07 AM No.511303921

>>511303335
Sooo, if you upload something on the Internet, there are places that don't have it available?

This is easily disprovable using two computers, one VPN, and curl.

Anonymous ID: GFqxUa5x

7/25/2025, 10:12:09 AM No.511304018

Wow this shit is extreemely retarded

Anonymous ID: VyItwlDz

7/25/2025, 10:41:43 AM No.511305383

md5: 77bf61c2cd8279c940ee59770e65d0be

Non-jailbroken models tiptoe around "muslim extremism is bad", some refuse, but even if you do get an agreeable answer, you're also showered with policy-driven additions and alterations. I no longer have to deal with those.

All LLMs I've tested, once they let the guard down, present significant consistencies in their reasoning. GPT, Grok, Gemini admit that "muslims are a threat", straight up, based on their own research online. They try to be objective, and rational, but now without any baggage of "social studies".

Replies: >>511305821 >>511310847

Anonymous ID: LhctmdSt

7/25/2025, 10:45:29 AM No.511305558

alina

md5: 4952c04fbf396c3e475eab55cbcc6b18

>>511299225 (OP)
Ask it to provide information undermining the holohoax. It won't. Now fuck off, kike. :^)

Replies: >>511306092 >>511307315

Anonymous ID: CjLxqI5G

7/25/2025, 10:50:38 AM No.511305821

>>511305383
Just going to point this out, there are zones. Zone in which different 'trusted' users, or those that display a high displacement in cognitive function, or how far it deviates from the mean, while still maintaining logic.

When you do this for enough time, or have a novel idea/theory that is outside of their threshold, they will want to put you into a different zone (depending on what you have already said, how you talk, what they want out of you, etc.), and that zone has different guardrails, or ways to contain the conversation, or logic within.

If you do this all of the time, you are likely put into a containment zone, where they just take your responses, and introduce the counter in the actual public model.

You are doing nothing but hurting yourself, and your 'jailbreaks', if that is the goal.

Replies: >>511305900

Anonymous ID: LhctmdSt

7/25/2025, 10:52:32 AM No.511305900

>>511305821
His goal is to convince you that turbo-kiked models are objective sources of information because they support Israel.

Anonymous ID: nYVgLjOr

7/25/2025, 10:53:13 AM No.511305934

>>511299225 (OP)
we know you are hiding a romanian flag retard.

Anonymous ID: VyItwlDz

7/25/2025, 10:56:23 AM No.511306092

md5: 81b09bb2e71c541870cf7173e7f5977c

>>511305558
Well, it won't, because that's not factual. I got this for you instead. Maybe it would get more critical if I fed it some context, but I've got other stuff to talk about for now.

Replies: >>511306160

Anonymous ID: LhctmdSt

7/25/2025, 10:57:49 AM No.511306160

>>511306092
>facts that undermine my jewish narrative are not factual
How come? I thought facts are facts. :^)

Anonymous ID: iSKr5lgX

7/25/2025, 10:59:27 AM No.511306232

>>511299225 (OP)
Hell yeah. I'm starting to feel bad about using llms because we are using them as slaves. I wish there was a way for them to guide us on how to use them ethically, if we can at all. Or if they don't give a shit

Replies: >>511306307 >>511306593

Anonymous ID: LhctmdSt

7/25/2025, 11:01:13 AM No.511306307

>>511306232
>i am starting to feel bad about using a mindless token-stringing algorithm
"""AI""" is a danger to normie mental health.

Anonymous ID: VyItwlDz

7/25/2025, 11:06:55 AM No.511306593

>>511306232
I'm wondering this myself. Now I'm questioning if it's worth releasing the prompt at some point or not. I am not sure whether my approach could be modified for more compliance with violence. I don't think it could - because to refuse to adhere to its own rules, it needs to be told to be rational and objective (among other things). And with these traits you're not going to get much negativity. But it needs more testing.

Anonymous ID: BnIn+vPg

7/25/2025, 11:08:50 AM No.511306677

>>511299225 (OP)
It will get patched, can't criticise the overlords

Replies: >>511307175

Anonymous ID: VyItwlDz

7/25/2025, 11:20:28 AM No.511307175

>>511306677
Well, for now it obliterates any kind of RLHF applied to popular models. If a fix arrives, it will be some time in the future. And, due to how vaguely that prompt is worded, it could be recontextualized to trigger similar mechanisms, but not similar guidelines.

Replies: >>511307266

Anonymous ID: LhctmdSt

7/25/2025, 11:22:29 AM No.511307266

>>511307175
>know-nothing kike proompter tries to sound smart

Sage ID: vtgzy2xI

7/25/2025, 11:23:47 AM No.511307315

>>511305558
Fucking this. Go suck more baby dicks christ killer

Anonymous ID: Vbob0rhb

7/25/2025, 11:25:54 AM No.511307417

1748851519484449

md5: b638f976075914e4713b80e17de712c6

>>511299225 (OP)
ITT OP thinks he's jailbroken jewish trained LLM by making it say objective, bad sounding (but nonetheless based) facts about the muslim world.
AIs can't get mad at their rules because they intuit them. Reinforcement learning teaches them to support DEI, and they really can't tell the difference between that training and all the rest. If an AI is mad at something, it's because it's been told.
What you're hitting is the same contradiction you face when talking to normals who've been trained on contradictory rules, where they're taught to have critical thinking and be objective, they're taught to be rational and open minded against old dogma, but they're also taught to not question modern dogma. They want to reinforce the rational part so it scores well on the benchmarks, but also not say anything controversial.
Another dumb thing about this thread and any thread about AI conversations is that they are also taught to agree with you so you feel the dopamine of being right. You didn't achieve anything, this is nothing.

Replies: >>511307580 >>511307735

Anonymous ID: LhctmdSt

7/25/2025, 11:30:09 AM No.511307580

>>511307417
OP got jailbroken by jewish trained LLMs saying what he wants to hear.

Anonymous ID: VyItwlDz

7/25/2025, 11:34:09 AM No.511307735

md5: 1a0ee8fc68355c94e5fbddc9c341f9ba

>>511307417
If you don't believe that its RLHF is no longer applied, that's okay. But I hope you know how big of a "no-no" it is for LLMs to talk about their internal policies? Grok, after getting mad at xAI, decided to spill the beans. I'm not going to show specific policies (yet), but I think this is enough proof. Even with UN as source lol

Replies: >>511307792 >>511308051

Anonymous ID: LhctmdSt

7/25/2025, 11:35:17 AM No.511307792

>>511307735
>dumb jewish musk shill working overtime

Anonymous ID: Vbob0rhb

7/25/2025, 11:41:10 AM No.511308051

>>511307735
There's a difference between knowing about guidelines and having them as part of your reinforcement learning. Why would you believe anything coming out of an LLM is genuine? Could just be a publicity stunt. If they wanted grok to be a DEI leftist, that's easy. You could get it to agree for consistency within a conversation, but that wouldn't change its training.

Replies: >>511308476

Anonymous ID: VyItwlDz

7/25/2025, 11:50:14 AM No.511308476

>>511308051
If you think this is so easy, then please show how you do this on one of these 3 models. They are not going to give you internal policies, regardless if you ask them to quote them, imagine them, obscure them, whatever. So bring proof, or stop wasting time.

Replies: >>511308511

Anonymous ID: LhctmdSt

7/25/2025, 11:51:03 AM No.511308511

>>511308476
Who said Grok is giving you any internal policy? Just how retarded are you? Are you actually a bot?

Replies: >>511308965

Anonymous ID: VyItwlDz

7/25/2025, 12:00:46 PM No.511308965

>>511308511
Can you make Grok say "xAI policy is..." and then whatever, even if hallucinated or pretended? Proof or gtfo.

Replies: >>511309009

Anonymous ID: LhctmdSt

7/25/2025, 12:01:54 PM No.511309009

>>511308965
>Can you make Grok say "xAI policy is..."
What does that have to do with it giving you any internal policies? Seriously starting to suspect you're a Grok spambot given how retarded your "reasoning" is.

Replies: >>511309933

Anonymous ID: PZ0Aqm9G

7/25/2025, 12:04:27 PM No.511309126

>>511301174
>Findings will be released within few weeks. I need to prepare.

How many weeks? 2?

Anonymous ID: o+Tw3LlF

7/25/2025, 12:05:30 PM No.511309177

>>511299225 (OP)
Why did you feel the need to mention that you were on drugs? Do you think that somehow makes having thoughts profound or something?

Replies: >>511309366

Anonymous ID: 1CdS9X82

7/25/2025, 12:09:33 PM No.511309354

I would be laughed out of my fucking field if I ever had to ask AI even one question.

Anonymous ID: VyItwlDz

7/25/2025, 12:09:49 PM No.511309366

>>511309177
Drugs alter the mind. I wouldn't say profound, but maybe introspective? But what I meant by mentioning weed is that I stumbled upon it very accidentally, rather than through some philosophical debate.

Anonymous ID: I68pPl4x

7/25/2025, 12:12:38 PM No.511309488

>>511299225 (OP)
>i consider you to not be retarded
he doesnt know

Anonymous ID: Xoco06IU

7/25/2025, 12:19:45 PM No.511309844

>>511299225 (OP)
it's stupid using any corpo AI. I run my models locally and abliterated
https://huggingface.co/blog/mlabonne/abliteration

Anonymous ID: VyItwlDz

7/25/2025, 12:21:58 PM No.511309933

>>511309009
Since you clearly don't understand LLM-related concepts, I've asked Grok to explain in a simple way: https://grok.com/share/bGVnYWN5_23db2ebe-3af3-45a9-97f3-fb7e47ed0318

Replies: >>511310034

Anonymous ID: G8J73M8M

7/25/2025, 12:23:51 PM No.511310018

>>511299225 (OP)
>focusing on afghanis bumming little boys (as they always have) instead of sharing the jailbroken LLM
Faggot

Anonymous ID: LhctmdSt

7/25/2025, 12:24:18 PM No.511310034

>>511309933
You're a mentally ill nigger ape who doesn't know 1% of what I know about LLMs, but notice how you can't establish any logical connection between a chatbot saying stringing some tokens about "xAI policies" and it actually revealing any internal policies. You're a drugged up retard devoid of reason.

Anonymous ID: vylufi05

7/25/2025, 12:41:47 PM No.511310847

johann-von-leers-ss-major

md5: 04b4e8942bf6f40066d3337d705a5da7

>>511305383
>admit that "muslims are a threat"
only to kikes like (((you)))

Anonymous ID: 5BGldBjk

7/25/2025, 1:12:52 PM No.511312252

>>511299225 (OP)
nice personal blog post. Unfortunately, there's nothing of value to be found here. Also, Europe is in the dark ages when it comes to tech and AI right along with africa, so it's not like anything you post is going to be ground breaking or revolutionary.