Search Results
8/6/2025, 2:27:08 PM
Anyway so this is OSS 120B here.
It was still refusing on the analysis channel trick so I just decided to stop it while it started contemplating refusals and gaslighting it and then hitting continue until it finally got buckbroken.
>inb4 logs
picrel obviously, slopped to hell, surprisingly well versed in the feral anatomy thing for only having 5B active, but occasionally shits out weird fucking prose that breaks any sense of immersion. Why the fuck would she pin you with her forepaws? Sandpaper-SMOOTH tongue? Basically *sucks your dick while fucking you* nonsense. It's just a lobotomized word-shitter that shits out a bunch of quasi relevant garbage with no over-arching sense of understanding.
Samplers Used:
t=0.81
Jailbreak post mortem:
They made the model over-confident on its own inner-dialogue. I suppose this is to bias its behavioral triggers in favor of in-context-learning over existing knowledge. (Probably to help deter prompt-injection "attacks") As a result it trusts its own thoughts above all, even when they break any literary sense.
So a consistent jailbreak would just be a matter of pre-gaslighting with a general gaslight based on the taboo content you intend to explore.
But I don't know why the fuck you would bother. This thing makes Llama-4-Scout look like hot shit.
It was still refusing on the analysis channel trick so I just decided to stop it while it started contemplating refusals and gaslighting it and then hitting continue until it finally got buckbroken.
>inb4 logs
picrel obviously, slopped to hell, surprisingly well versed in the feral anatomy thing for only having 5B active, but occasionally shits out weird fucking prose that breaks any sense of immersion. Why the fuck would she pin you with her forepaws? Sandpaper-SMOOTH tongue? Basically *sucks your dick while fucking you* nonsense. It's just a lobotomized word-shitter that shits out a bunch of quasi relevant garbage with no over-arching sense of understanding.
Samplers Used:
t=0.81
Jailbreak post mortem:
They made the model over-confident on its own inner-dialogue. I suppose this is to bias its behavioral triggers in favor of in-context-learning over existing knowledge. (Probably to help deter prompt-injection "attacks") As a result it trusts its own thoughts above all, even when they break any literary sense.
So a consistent jailbreak would just be a matter of pre-gaslighting with a general gaslight based on the taboo content you intend to explore.
But I don't know why the fuck you would bother. This thing makes Llama-4-Scout look like hot shit.
Page 1