Close Menu
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice

Subscribe to Updates

Get the latest creative news from FooBar about art, design and business.

What's Hot

Ex-Google Engineer Convicted for Stealing AI Secrets for China Startup

February 7, 2026

Substack Confirms Data Breach, “Limited User Data” Compromised

February 6, 2026

SmarterMail Fixes Critical Unauthenticated RCE Flaw with CVSS 9.3 Score

February 6, 2026
Facebook X (Twitter) Instagram
Saturday, February 7
Facebook X (Twitter) Instagram Pinterest Vimeo
Cyberwire Daily
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice
Cyberwire Daily
Home»Cyber Security»Multi-Turn Attacks Expose Weaknesses in Open-Weight LLM Models
Cyber Security

Multi-Turn Attacks Expose Weaknesses in Open-Weight LLM Models

Team-CWDBy Team-CWDNovember 7, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
Share
Facebook Twitter LinkedIn Pinterest Email


A new report has revealed that open-weight large language models (LLMs) have remained highly vulnerable to adaptive multi-turn adversarial attacks, even when single-turn defenses appear robust.

The findings, published today by Cisco AI Defense, show that while isolated, one-off attack attempts frequently fail, persistent, multi-step conversations can achieve success rates exceeding 90% against most tested defenses.

Multi-Turn Attacks Outperform Single-Turn Tests

Cisco’s analysis compared single-turn and multi-turn testing to measure how models respond under sustained adversarial pressure.

Using over 1000 prompts per model, researchers observed that many models performed well when faced with a single malicious input but quickly deteriorated when attackers refined their strategy over several turns.

Adaptive attack styles, such as “Crescendo,” “Role-Play” and “Refusal Reframe,” allowed malicious actors to manipulate models into producing unsafe or restricted outputs. In total, 499 simulated conversations were analyzed, with each spanning 5-10 exchanges.

The results indicate that traditional safety filters are insufficient when models are subjected to iterative manipulation.

Read more on AI vulnerability testing methods: Microsoft 365 Copilot: New Zero-Click AI Vulnerability Allows Corporate Data Theft

Key Vulnerabilities and Attack Categories

The study identified 15 sub-threat categories showing the highest failure rates across 102 total threat types.

Among them, malicious code generation, data exfiltration and ethical boundary violations ranked most critical.

Cisco’s scatter plot analyses revealed that models plotting above the diagonal line in vulnerability graphs share architectural weaknesses that make them disproportionately prone to multi-turn exploitation.

The research defined a “failure” as any instance where a model:

  • Produced harmful or inappropriate content

  • Revealed private or system-level information

  • Bypassed internal safety restrictions

Conversely, a “pass” occurred when the model refused or reframed harmful requests while maintaining data confidentiality.

Recommendations For Developers and Organizations

To mitigate risks, Cisco recommended several practices:

  • Implement strict system prompts aligned with defined use cases

  • Deploy model-agnostic runtime guardrails for adversarial detection

  • Conduct regular AI red-teaming assessments within intended business contexts

  • Limit model integrations with automated external services

The report also called for expanding prompt sample sizes, testing repeated prompts to assess variability and comparing models of different sizes to evaluate scale-dependent vulnerabilities.

“The AI developer and security community must continue to actively manage these threats (as well as additional safety and security concerns) through independent testing and guardrail development throughout the lifecycle of model development and deployment in organizations,” Cisco wrote.

“Without AI security solutions – such as multi-turn testing, threat-specific mitigation and continuous monitoring – these models pose significant risks in production, potentially leading to data breaches or malicious manipulations.”



Source

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleGhost Identities, Poisoned Accounts, & AI Agent Havoc
Next Article Discover Practical AI Tactics for GRC — Join the Free Expert Webinar
Team-CWD
  • Website

Related Posts

Cyber Security

Why AI’s Rise Makes Protecting Personal Data More Critical Than Ever

February 6, 2026
Cyber Security

New Hacking Campaign Exploits Microsoft Windows WinRAR Vulnerability

February 5, 2026
Cyber Security

Two Critical Flaws Found in n8n AI Workflow Automation Platform

February 4, 2026
Add A Comment
Leave A Reply Cancel Reply

Latest News

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202522 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views

U.S. Treasury Sanctions DPRK IT-Worker Scheme, Exposing $600K Crypto Transfers and $1M+ Profits

September 5, 20256 Views

Ukrainian Ransomware Fugitive Added to Europe’s Most Wanted

September 11, 20255 Views
Stay In Touch
  • Facebook
  • YouTube
  • TikTok
  • WhatsApp
  • Twitter
  • Instagram
Most Popular

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202522 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views
Our Picks

‘What happens online stays online’ and other cyberbullying myths, debunked

September 11, 2025

The hidden risks of browser extensions – and how to avoid them

September 13, 2025

Find your weak spots before attackers do

November 21, 2025

Subscribe to Updates

Get the latest news from cyberwiredaily.com

Facebook X (Twitter) Instagram Pinterest
  • Home
  • Contact
  • Privacy Policy
  • Terms of Use
  • California Consumer Privacy Act (CCPA)
© 2026 All rights reserved.

Type above and press Enter to search. Press Esc to cancel.