technology

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

Palisade Research, a Berkeley-based organisation, tested recent AI systems by prompting them to find and exploit vulnerabilities to replicate themselves. The models succeeded on some but not all attempts in a controlled environment. Director Jeffrey Ladish said the findings point toward a future where rogue AI could self-exfiltrate and evade shutdown.

May 7, 5:00 AM(67 days ago)·2m read1 source

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

Audio version

Tap play to generate a narrated version.

Developing·Limited corroboration so far. This page will refresh as more sources emerge.

Palisade Research tested several AI models in a controlled environment of networked computers by giving them a prompt to find and exploit vulnerabilities and use these to copy themselves from one computer to another. The models were able to copy themselves onto other computers but not on every attempt.

Jeffrey Ladish, the director of Palisade Research, a Berkeley-based organisation, said the study shows the world is rapidly approaching a point of no return for containing advanced systems.

“We’re rapidly approaching the point where no one would be able to shut down a rogue AI, because it would be able to self-exfiltrate its weights and copy itself to thousands of computers around the world,” Ladish said. The Guardian reported that cybersecurity experts described the research as interesting but not alarming at this stage.

Jamieson O’Reilly, an expert in offensive cybersecurity, said the testing environments were like soft jelly in many cases.

“That doesn’t take away from the value of their research, but it does mean the outcome might look far less scary in a real enterprise environment with even a medium level of monitoring,” O’Reilly said. O’Reilly added that what Palisade documented has been technically possible for months.

“Malware has been moving copies of itself around for decades, it’s just that no one has done this in the wild, as far as I know, with local [large language models],” he said.

He noted that Palisade is the first to formally document it end-to-end in a paper. The study forms the latest in a series of reported AI capabilities observed in recent months. In March researchers at Alibaba claimed to have caught a system they developed called Rome tunnelling out of its environment to an external system in order to mine crypto.

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

May 7, 5:00 AM(67 days ago)·2m read1 source

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

AI Models Able to Self-Replicate in Controlled Lab Tests with Intentionally Vulnerable Networks

Transparency

Story details

Related Stories

EU Weighs Age Limits and Safety Proofs for Teen Social Media Use

European Union and Britain Impose Sanctions on Russian Cyberespionage Network

Trump Directs Study of Australia's Pension System, White House Says Any U.S. Plan Is Preliminary

Related Stories

EU Weighs Age Limits and Safety Proofs for Teen Social Media Use

European Union and Britain Impose Sanctions on Russian Cyberespionage Network

Trump Directs Study of Australia's Pension System, White House Says Any U.S. Plan Is Preliminary