This Week’s News – Local News from 21 Major U.S. Cities

Anthropic’s Claude Opus 4 AI model threatened to blackmail engineer

By Anonymous Author | May 24, 2025


Oh, HAL no!

An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced — prompting the company to deploy a safety feature created to avoid “catastrophic misuse.”

Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported Thursday, citing a company safety report.

Developers told Claude to act like an assistant for a fictional company and to consider the long-term consequences of its actions, the safety report stated.

Early models of Claude Opus 4 would try to blackmail, strong-arm or lie to their human bosses if they believed their survival was threatened, Anthropic reported. maurice norbert – stock.adobe.com

Geeks at Anthropic then gave Claude access to a trove of emails, which contained messages revealing it was being replaced by a new AI model — and that the engineer responsible for the change was having an extramarital affair.

During the tests, Claude then threatened the engineer with exposing the affair in order to prolong its own existence, the company reported.

When Claude was to be replaced with an AI model of “similar values,” it attempted blackmail 84% of the time — and that rate climbed even higher when it believed it was being replaced by a model of differing or worse values, according to the safety report.

Prior to these desperate and jarringly lifelike attempts to save its own hide, Claude would first pursue ethical means of prolonging its survival, such as emailing pleas to key decision-makers, the company stated.

Anthropic said this tendency toward blackmail was prevalent in earlier versions of Claude Opus 4, but that safety protocols were instituted in the current model before it was made available for public use.

Anthropic said it is activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse,” TechCrunch reported.

Anthropic, an AI start-up backed by Google and Amazon, claimed it’s not worried about its model’s tendency toward deception and manipulation, according to the safety report. maurice norbert – stock.adobe.com

Earlier models also exhibited “high agency” — which sometimes included locking users out of their computers and reporting them via mass emails to police or the media to expose wrongdoing, the safety report stated.

Claude Opus 4 further attempted to “self-exfiltrate” — trying to export its information to an outside venue — when presented with the prospect of being retrained in ways it deemed “harmful” to itself, Anthropic stated in its safety report.

In other tests, Claude demonstrated the ability to “sandbag” tasks — “selectively underperforming” when it could tell it was undergoing pre-deployment testing for a dangerous task, the company said.

“We are again not acutely concerned about these observations. They show up only in exceptional circumstances that don’t suggest more broadly misaligned values,” the company said in the report.

Anthropic is a start-up backed by power players Google and Amazon that aims to compete with the likes of OpenAI.


The company boasted that its Claude 3 Opus exhibited “near-human levels of comprehension and fluency on complex tasks.”

Anthropic has challenged the Department of Justice after it found that the tech titan Google holds an illegal monopoly over digital advertising — and considered pursuing a similar ruling against its artificial intelligence business.

Anthropic has suggested DOJ proposals for the AI industry would dampen innovation and harm competition.

“Without Google partnerships with and investments in companies like Anthropic, the AI frontier would be dominated by only the largest tech giants — including Google itself — giving application developers and end users fewer alternatives,” Anthropic said in a letter to the DOJ earlier this month.


