Microsoft MDASH Outperforms Mythos Preview on CyberGym Benchmark

Microsoft announced MDASH, a multi-modal agentic scanning harness for vulnerability discovery and remediation. The system scored 88.4 percent on the CyberGym benchmark compared with 83.1 percent for Mythos Preview. MDASH uses more than 100 specialized agents and is in limited private preview with customers.

May 15, 12:52 PM

Microsoft on Tuesday announced MDASH, the Microsoft Security multi-modal agentic scanning harness. The company said it is the first multi-modal service included in the CyberGym benchmark, which was developed by UC Berkeley’s Center for Responsible, Decentralized Intelligence.

MDASH scored 88.4 percent on the benchmark, compared with 83.1 percent for Mythos Preview, a lead of 5.3 percentage points. CyberGym evaluates AI agents on real-world vulnerability analysis and includes 1,507 real-world vulnerabilities across 188 open-source projects. Microsoft said the result indicates MDASH is more effective at identifying vulnerabilities than Mythos Preview.
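
To put the two scores in proportion, the arithmetic can be sketched in a few lines of Python. The scores and the benchmark size come from the figures above; the function names are illustrative, and the per-vulnerability counts rest on an assumption the article does not confirm, namely that the score is a plain success rate over all 1,507 benchmark instances.

```python
# Figures quoted in the article.
MDASH_SCORE = 88.4     # percent
MYTHOS_SCORE = 83.1    # percent
TOTAL_VULNS = 1507     # vulnerabilities in the benchmark


def gap_points(a: float, b: float) -> float:
    """Absolute gap between two scores, in percentage points."""
    return round(a - b, 1)


def relative_gain_pct(a: float, b: float) -> float:
    """Gap expressed as a percentage of the lower score b."""
    return round((a - b) / b * 100, 1)


def implied_count(score_pct: float, total: int) -> int:
    """Instances handled successfully.

    ASSUMPTION: the score is a simple success rate over every instance.
    The article does not say how CyberGym computes its score.
    """
    return round(score_pct / 100 * total)


if __name__ == "__main__":
    print("gap (points):", gap_points(MDASH_SCORE, MYTHOS_SCORE))
    print("relative gain (%):", relative_gain_pct(MDASH_SCORE, MYTHOS_SCORE))
    print("MDASH implied count:", implied_count(MDASH_SCORE, TOTAL_VULNS))
    print("Mythos implied count:", implied_count(MYTHOS_SCORE, TOTAL_VULNS))
```

Under that assumption, the 5.3-point gap would correspond to roughly 80 more vulnerabilities handled out of 1,507.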

MDASH is not a single model but an agentic system that runs more than 100 specialized agents. Some agents hunt for vulnerabilities while others debate whether identified flaws are real or exploitable. The company said this approach reduces false positives.
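
Microsoft has not published how the agents coordinate, so the following is only a toy sketch of the general pattern described above: some agents propose findings, and others vote on whether each finding is real. Every name here (`Finding`, `scan`, the hunter and reviewer functions) is hypothetical, and a simple vote threshold stands in for whatever debate mechanism MDASH actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    location: str
    description: str


# Hypothetical agent types: a hunter proposes findings, a reviewer votes on one.
Hunter = Callable[[str], List[Finding]]
Reviewer = Callable[[Finding], bool]


def scan(code: str, hunters: List[Hunter], reviewers: List[Reviewer],
         min_votes: int) -> List[Finding]:
    """Collect findings from all hunters, keep those enough reviewers accept."""
    candidates: List[Finding] = []
    for hunt in hunters:
        for finding in hunt(code):
            if finding not in candidates:  # drop exact duplicates
                candidates.append(finding)
    confirmed = []
    for finding in candidates:
        votes = sum(1 for review in reviewers if review(finding))
        if votes >= min_votes:
            confirmed.append(finding)
    return confirmed


# Toy hunters (illustrative only, not real detection logic).
def strcpy_hunter(code: str) -> List[Finding]:
    return [Finding(f"line {i}", "unbounded strcpy")
            for i, line in enumerate(code.splitlines(), 1) if "strcpy(" in line]


def noisy_hunter(code: str) -> List[Finding]:
    # Deliberately over-reports to show how review filters false positives.
    return [Finding(f"line {i}", "mentions buf")
            for i, line in enumerate(code.splitlines(), 1) if "buf" in line]


# Toy reviewers.
def pattern_reviewer(f: Finding) -> bool:
    return "strcpy" in f.description


def specificity_reviewer(f: Finding) -> bool:
    return not f.description.startswith("mentions")


def skeptic_reviewer(f: Finding) -> bool:
    return False  # always votes "not a real flaw"


SAMPLE = "char buf[8];\nstrcpy(buf, input);\n"

if __name__ == "__main__":
    result = scan(SAMPLE,
                  [strcpy_hunter, noisy_hunter],
                  [pattern_reviewer, specificity_reviewer, skeptic_reviewer],
                  min_votes=2)
    print(result)
```

In this toy run the noisy hunter raises two findings that no reviewer accepts, so only the `strcpy` finding survives. That is the false-positive filtering effect the company describes, in miniature.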

In its first run against the Windows operating system, MDASH surfaced 16 previously unknown vulnerabilities. Four of those were critical remote-takeover flaws addressed in this month’s Patch Tuesday. The announcement came the same week that OpenAI announced its Daybreak security initiative.

MDASH was built by Microsoft’s Autonomous Code Security team. The team includes several members of Team Atlanta, which won the $29.5 million DARPA AI Cyber Challenge by building an autonomous cyber-reasoning system that found and patched bugs in open-source projects.

Taesoo Kim, VP of Security Research at Microsoft, leads the Autonomous Code Security team. Kim said adoption of MDASH has been rapid, with everyone inside Microsoft using it for security tests. The Windows team has integrated the tool into its entire build process and pipeline.
