Topic
benchmark-testing
1 story related to this topic, newest first.
forbes.com · technology · 14 days ago · Developing
Microsoft MDASH Outperforms Mythos Preview on CyberGym Benchmark
Microsoft announced MDASH, a multi-modal agentic scanning harness for vulnerability discovery and remediation. The system scored 88.4 percent on the CyberGym benchmark compared with 83.1 percent for Mythos Preview. MDASH uses more than 100 specialized agents and is in limited pri…