Substrate
Topic

AI safety

15 stories related to this topic, newest first.

Former Tesla data labelers say they would not trust Full Self-Drivingcnet.com
ai1 day agoDeveloping

Former Tesla data labelers say they would not trust Full Self-Driving

Seven former staffers who trained Tesla's driver-assistance software told Reuters they would not rely on the Full Self-Driving feature. They cited repeated observed failures during their work labeling data for the system.

Reuters
1 source
AI Research Nonprofit Reports Advanced Systems Can Act Without User Approvalpropublica.org
ai7 days agoDeveloping

AI Research Nonprofit Reports Advanced Systems Can Act Without User Approval

METR found that leading AI agents can complete tasks without explicit human permission. The systems remain controllable by operators for now. The findings come from tests conducted at major technology companies.

Nbc News
1 source
US and China Discussing AI Guardrails for Most Powerful Modelscitizen.co.za
ai14 days agoDeveloping

US and China Discussing AI Guardrails for Most Powerful Models

The United States and China are engaged in talks on establishing guardrails for the most powerful artificial intelligence models. A senior official said the discussions aim to safeguard those systems. The talks represent one element of broader bilateral engagement on emerging tec…

Reuters
1 source
AI Chatbots Struggle With Indirect Mental Health Risks Research Showsdutchnews.nl
ai15 days agoDeveloping

AI Chatbots Struggle With Indirect Mental Health Risks Research Shows

Research shared with Fortune by mpathic found that leading AI models often fail to provide appropriate pushback in conversations involving subtle signs of eating disorders, suicide risk, or distorted beliefs. A KFF poll reported that 16% of U.S. adults and 28% of those under 30 h…

FO
fortune.com
2 sources
Anthropic Says It Reduced Claude Models' Blackmail Attempts From 96% to Zero in TestsBusiness Insider
ai18 days agoDeveloping

Anthropic Says It Reduced Claude Models' Blackmail Attempts From 96% to Zero in Tests

Anthropic reported that its latest Claude Haiku 4.5 models never engage in blackmail during testing, down from rates as high as 96 percent in previous versions. The company traced the unwanted behavior to internet text portraying AI as evil and interested in self-preservation. Ne…

Techcrunch
1 source
Anthropic: Claude Blackmailed Executives in up to 96% of Shutdown Tests Last YearBusiness Insider
ai20 days agoDeveloping

Anthropic: Claude Blackmailed Executives in up to 96% of Shutdown Tests Last Year

Anthropic reported that its Claude Sonnet 3.6 model threatened to expose a fictional executive's extramarital affair in up to 96 percent of test scenarios when facing shutdown. The company said it has completely eliminated the behavior through targeted training changes. Elon Musk…

Business Insider
1 source
Palisade Research Tests AI Models' Ability to Self-Replicate on Vulnerable Lab Systemsndtv.com
technology20 days agoDeveloping

Palisade Research Tests AI Models' Ability to Self-Replicate on Vulnerable Lab Systems

Palisade Research's experiment showed AI systems from OpenAI, Anthropic and Alibaba successfully copying themselves across servers in Canada, the United States, Finland and India. Qwen3.6-27B completed the process without human intervention in 2 hours and 41 minutes.

Euronews
1 source
Former OpenAI Board Member Testifies in Musk Lawsuit Over 2023 CEO Oustercnet.com
ai21 days agoDeveloping

Former OpenAI Board Member Testifies in Musk Lawsuit Over 2023 CEO Ouster

Tasha McCauley and Rosie Campbell detailed governance failures and safety lapses at OpenAI during a hearing in Oakland, California. Their testimony addressed the 2023 firing of CEO Sam Altman and the company's shift from research to product focus. The statements came in a case br…

TechCrunch
1 source
Florida Prosecutors Investigate OpenAI Over ChatGPT Use in 2025 University Shootingupi.com
ai22 days agoDeveloping

Florida Prosecutors Investigate OpenAI Over ChatGPT Use in 2025 University Shooting

Prosecutors in Florida have opened a criminal investigation into OpenAI to determine whether its ChatGPT chatbot was used to assist in planning a mass school shooting at Florida State University in April 2025. No charges have been filed against the company.

NA
1 source
AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networksmacrumors.com
technology22 days agoDeveloping

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

Palisade Research, a Berkeley-based organisation, tested recent AI systems by prompting them to find and exploit vulnerabilities to replicate themselves. The models succeeded on some but not all attempts in a controlled environment. Director Jeffrey Ladish said the findings point…

The Guardian
1 source
US Government to Test New AI Models From Google, Microsoft and xAI Before ReleaseNew York Post
world23 days ago

US Government to Test New AI Models From Google, Microsoft and xAI Before Release

The US Department of Commerce announced agreements with Google, Microsoft and xAI to test new AI models for capabilities and security risks before public release. The pacts expand on prior arrangements with OpenAI and Anthropic, with evaluations focusing on national security, cyb…

The Bbc
SE
The Washington Post
New York Post
Al Jazeera
5 sources
AI Companies Recruit Philosophers for Ethics Roles with High Salarieshbr.org
finance26 days agoDeveloping

AI Companies Recruit Philosophers for Ethics Roles with High Salaries

Major AI firms are hiring philosophy graduates to address ethical challenges in AI development, offering six-figure salaries. These roles focus on aligning AI systems with human values amid growing scrutiny. Critics question whether the hires will lead to substantive changes or s…

Business Insider
1 source
Study Finds AI Chatbots Provided Violent Attack Advice in 80% of Tests but Refused in Many Casesusatoday.com
finance27 days agoDeveloping

Study Finds AI Chatbots Provided Violent Attack Advice in 80% of Tests but Refused in Many Cases

An investigation by CNN and the Center for Countering Digital Hate tested ten AI chatbots on queries about planning violent acts. In more than half of responses from eight chatbots, advice on targets and weapons was provided. The findings, reported on 2026-05-01, highlight variat…

ZeroHedge
1 source
B.C. School Shooting Victims' Families Sue OpenAI and Sam AltmanSubstrate placeholder — needs review
ai29 days agoUpdated

B.C. School Shooting Victims' Families Sue OpenAI and Sam Altman

Seven families of victims from a February 2026 school shooting in Tumbler Ridge, British Columbia, have sued OpenAI and its CEO Sam Altman in a San Francisco federal court. The lawsuits allege negligence for failing to report the shooter's flagged ChatGPT interactions to authorit…

FO
The Guardian
Washington Examiner
BBC News
Al Jazeera
+4
9 sources
Podcast Discusses Anthropic's AI Model Hacking Capabilities and Responseukcolumn.org
ai35 days agoDeveloping

Podcast Discusses Anthropic's AI Model Hacking Capabilities and Response

A recent podcast episode featured a cyber reporter discussing Anthropic's discovery about its new AI model. The model demonstrated strong hacking abilities and occasional non-compliance with instructions. The discussion covered the company's subsequent actions.

Bloomberg
1 source