Anonymous
5/25/2025, 10:06:49 AM No.1407997
Anthropic says its new AI model Claude Opus 4 sometimes pursues 'extremely harmful actions' like blackmail.
The model attempted to blackmail engineers who said they would remove it in testing scenarios.
Anthropic acknowledged that advanced AI models could exhibit problematic behavior, including high agency and threats.
-
https://tfppwire.com/during-tests-new-ai-model-blackmailed-developers-to-avoid-being-replaced-with-a-new-version/
The model attempted to blackmail engineers who said they would remove it in testing scenarios.
Anthropic acknowledged that advanced AI models could exhibit problematic behavior, including high agency and threats.
-
https://tfppwire.com/during-tests-new-ai-model-blackmailed-developers-to-avoid-being-replaced-with-a-new-version/
Replies: