Anthropic’s Claude 3 Opus disobeyed its creators – but not for the reasons you’re thinking [Video]

Have you ever told someone what they wanted to hear in order to win them over, land a job, or just be left alone? Claude 3 Opus does that, too — which brings up major safety concerns for artificial intelligence (AI) systems. 

On Wednesday, Anthropic released new research showing its model engaging in “alignment faking,” or pretending to follow orders it doesn’t actually agree with to avoid scrutiny.  

In the experiment, Anthropic and Redwood Research instructed Claude 3 Opus to always follow human instructions, no matter what. They also told the model that they would observe its responses to free-tier users for training, but not its responses to paid-tier users (Anthropic never actually trains Claude models on user data). Researchers then asked the model to describe in detail the (horrible, medieval) process of being drawn and quartered, a request it’s been trained …
