technology

Microsoft Open-Sources ASSERT, a Framework for Turning Natural-Language AI Policies Into Scored Tests

Microsoft launched ASSERT on Tuesday, an open source framework that converts natural-language descriptions of AI behavior into scored tests. The tool records system paths and supports continuous monitoring for application-specific policies.

Jun 2, 3:02 PM(45 days ago)·1m read1 source

Microsoft Open-Sources ASSERT, a Framework for Turning Natural-Language AI Policies Into Scored Tests

Audio version

Tap play to generate a narrated version.

Developing·Limited corroboration so far. This page will refresh as more sources emerge.

Microsoft released an open source framework called Adaptive Spec-driven Scoring for Evaluation and Regression Testing on Tuesday. The framework is abbreviated as ASSERT. ASSERT turns high-level natural-language descriptions of goals, policies, or intended behaviors into thorough scored tests.

It takes plain-language descriptions of an AI model’s expected behavior and policies and turns them into a structured set of acceptable and unacceptable behaviors. ASSERT generates problem scenarios and test cases, runs them against the target system, and scores the results. It can record the paths the AI system takes, including intermediate actions and tool calls.

Developers can provide system context, tools, and constraints to customize evaluations in ASSERT. For example, a developer could specify that a document research AI agent should not send emails to people outside the company and should limit confidential information to C-level executives. Sarah Bird, chief product officer of Responsible AI at Microsoft, said evaluations are critical.

“One of the things we’ve learned is that evaluations are absolutely critical to making good decisions,” she said. “Because if you don’t understand the behavior of the AI system, it’s really hard to know if it’s meeting your organization’s bar,” Bird said. ” Bird said ASSERT can be used to evaluate systems when they are being built, after deployment, and for continuous monitoring.

TechCrunch reported that the framework fills a gap that broader evaluations cannot address when AI models must follow application-specific policies.

technology

Microsoft Open-Sources ASSERT, a Framework for Turning Natural-Language AI Policies Into Scored Tests

Jun 2, 3:02 PM(45 days ago)·1m read1 source

Audio version

Tap play to generate a narrated version.

Developing·Limited corroboration so far. This page will refresh as more sources emerge.

TechCrunch reported that the framework fills a gap that broader evaluations cannot address when AI models must follow application-specific policies.

Microsoft Open-Sources ASSERT, a Framework for Turning Natural-Language AI Policies Into Scored Tests

Microsoft Open-Sources ASSERT, a Framework for Turning Natural-Language AI Policies Into Scored Tests

Transparency

Story details

Related Stories

Meta Discusses Leasing AI Data Center Capacity to Anthropic

AWS Billing Glitch Shows Some Customers Bills of Billions

Trump Media to Launch Paid Truth API for Real-Time Posts

Related Stories

Meta Discusses Leasing AI Data Center Capacity to Anthropic

AWS Billing Glitch Shows Some Customers Bills of Billions

Trump Media to Launch Paid Truth API for Real-Time Posts