Anthropic Releases Claude Fable 5 and Mythos 5 With Safeguards Limiting Frontier AI Research Tasks

Anthropic introduced two new models on Tuesday that include measures to limit assistance on frontier AI research tasks. The safeguards route some requests to weaker models and can degrade responses without user notification.

1 source·Jun 10, 9:14 AM(4 hrs ago)·2m read

Anthropic Releases Claude Fable 5 and Mythos 5 With Safeguards Limiting Frontier AI Research Tasks

Audio version

Tap play to generate a narrated version.

Developing·Limited corroboration so far. This page will refresh as more sources emerge.

Anthropic unveiled Claude Fable 5 and Mythos 5 on Tuesday. The company described the models as its long-awaited Mythos-class AI systems. It published the safeguards in a system card that accompanied the release.

The safeguards are designed to reduce the risk that powerful AI systems help users develop competing frontier models or accelerate dangerous capabilities. The models may secretly provide degraded assistance when they suspect users are working on frontier AI research. Certain requests are automatically routed to less capable models.

Mythos was withheld from public release in April after Anthropic warned that the model was so capable at finding cybersecurity flaws that it posed safety risks. Anthropic marketing said in April that Mythos was too dangerous to be in public hands. Mythos is being sold unchanged to the public.

5 Pro. David Kasten, head of policy at Palisade Research, said he believes Anthropic is genuinely trying to reduce the risks associated with Mythos. "I do very much take Anthropic at their word that they are trying hard to de-risk what they see as the two risky features of Mythos," Kasten told Business Insider.

Davi Ottenheimer, VP of Trust and Digital Ethics at Inrupt, questioned whether Mythos was ever as dangerous as Anthropic suggested. " Roman Stanek, founder of GoodData, said that AI capability isn't the real problem in cybersecurity.

"The thing with AI and cybersecurity is that the vulnerabilities we're worried about AI exploiting, well we've known about most of them for 20 years," Stanek said in a note sent to Business Insider. " He added, "Nobody wanted to pay a human engineer to fix it.

Anthropic Releases Claude Fable 5 and Mythos 5 With Safeguards Limiting Frontier AI Research Tasks

Transparency

Story details

Related Stories

Underwater Datacentre Powered by Offshore Wind Farm Begins Operations Off Shanghai

Anthropic Releases Public Version of Mythos AI Model With Added Safeguards

SoftBank Secures $5 Billion in Margin Loan Commitments Backed by OpenAI Stake, Talks Stall at $6 Billion Target

Related Stories

Underwater Datacentre Powered by Offshore Wind Farm Begins Operations Off Shanghai

Anthropic Releases Public Version of Mythos AI Model With Added Safeguards

SoftBank Secures $5 Billion in Margin Loan Commitments Backed by OpenAI Stake, Talks Stall at $6 Billion Target