Close Menu
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice

Subscribe to Updates

Get the latest creative news from FooBar about art, design and business.

What's Hot

Ex-Google Engineer Convicted for Stealing AI Secrets for China Startup

February 7, 2026

Substack Confirms Data Breach, “Limited User Data” Compromised

February 6, 2026

SmarterMail Fixes Critical Unauthenticated RCE Flaw with CVSS 9.3 Score

February 6, 2026
Facebook X (Twitter) Instagram
Saturday, February 7
Facebook X (Twitter) Instagram Pinterest Vimeo
Cyberwire Daily
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice
Cyberwire Daily
Home»Cyber Security»Open Source “b3” Benchmark to Boost LLM Security for Agents
Cyber Security

Open Source “b3” Benchmark to Boost LLM Security for Agents

Team-CWDBy Team-CWDOctober 30, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
Share
Facebook Twitter LinkedIn Pinterest Email


The UK AI Security Institute (AISI) has partnered with the commercial security sector on a new open source framework designed to help large language model (LLM) developers improve security posture.

The backbone breaker benchmark (b3) is a new evaluation tool created by the AISI, Check Point and Check Point subsidiary Lakera. It’s designed to help developers and model providers improve the resilience of the “backbone” LLMs which power AI agents.

“AI agents operate as a chain of stateless LLM calls – each step performing reasoning, producing output, or invoking tools,” Lakera explained in a blog post announcing the release.

“Instead of evaluating these full agent workflows end-to-end, b3 zooms in on the individual steps where the backbone LLM actually fails: the specific moments when a prompt, file, or web input triggers a malicious output. These are the pressure points attackers exploit – not the agent architecture itself, but the vulnerable LLM calls within it.”

To help developers and model providers uncover these vulnerabilities before their adversaries do, b3 uses a new technique called “threat snapshots.” These micro tests are powered by crowdsourced adversarial data from Lakera’s “Gandalf: Agent Breaker” initiative.

Specifically, b3 combines 10 representative agent “threat snapshots” with a high-quality dataset of 19,433 Gandalf adversarial attacks. Developers can then use it to see how vulnerable their model is to attacks such as system prompt exfiltration, phishing link insertion, malicious code injection, denial-of-service and unauthorized tool calls.

Read more on AI agent security: AI Chatbots Highly Vulnerable to Jailbreaks, UK Researchers Find

The b3 benchmark “makes LLM security measurable, reproducible, and comparable across models and application categories,” according to Lakera.

“B3 lets us finally see which ‘backbones’ are most resilient in a given application, and what separates strong models from those that fail under pressure,” it said.

“Along the way, the results revealed two striking patterns: models that reason step by step tend to be more secure, and open-weight models are closing the gap with closed systems faster than expected.”

A Baseline For Improving LLM Security

Mateo Rojas-Carulla, co-founder and chief scientist at Lakera, argued that today’s AI agents are only as secure as the LLMs they’re powered by.

“Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows,” he added.

“By making this benchmark open to the world, we hope to equip developers and model providers with a realistic way to measure, and improve, their security posture.”

Andrew Bolster, senior research & development manager (data science) at Black Duck, gave a cautious welcome to the new open source benchmark.

“This type of research is a great baseline for agentic integrators to understand the threat model around these systems,” he argued.

“But for true-scale security with AI in the mix, security leaders need to leverage both these novel prompt manipulation/benchmarking techniques, as well as battle-tested application security testing and model attestation regimes.”



Source

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleSecuring AI to Benefit from AI
Next Article PolarEdge Targets Cisco, ASUS, QNAP, Synology Routers in Expanding Botnet Campaign
Team-CWD
  • Website

Related Posts

Cyber Security

Why AI’s Rise Makes Protecting Personal Data More Critical Than Ever

February 6, 2026
Cyber Security

New Hacking Campaign Exploits Microsoft Windows WinRAR Vulnerability

February 5, 2026
Cyber Security

Two Critical Flaws Found in n8n AI Workflow Automation Platform

February 4, 2026
Add A Comment
Leave A Reply Cancel Reply

Latest News

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202522 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views

U.S. Treasury Sanctions DPRK IT-Worker Scheme, Exposing $600K Crypto Transfers and $1M+ Profits

September 5, 20256 Views

Ukrainian Ransomware Fugitive Added to Europe’s Most Wanted

September 11, 20255 Views
Stay In Touch
  • Facebook
  • YouTube
  • TikTok
  • WhatsApp
  • Twitter
  • Instagram
Most Popular

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202522 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views
Our Picks

2025’s most common passwords were as predictable as ever

January 21, 2026

The hidden risks of browser extensions – and how to avoid them

September 13, 2025

Why LinkedIn is a hunting ground for threat actors – and how to protect yourself

January 16, 2026

Subscribe to Updates

Get the latest news from cyberwiredaily.com

Facebook X (Twitter) Instagram Pinterest
  • Home
  • Contact
  • Privacy Policy
  • Terms of Use
  • California Consumer Privacy Act (CCPA)
© 2026 All rights reserved.

Type above and press Enter to search. Press Esc to cancel.