Close Menu
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice

Subscribe to Updates

Get the latest creative news from FooBar about art, design and business.

What's Hot

HMRC Warns of Over 135,000 Scam Reports

December 18, 2025

Four Threat Clusters Using CastleLoader as GrayBravo Expands Its Malware Service Infrastructure

December 18, 2025

New “Lies-in-the-Loop” Attack Undermines AI Safety Dialogs

December 18, 2025
Facebook X (Twitter) Instagram
Thursday, December 18
Facebook X (Twitter) Instagram Pinterest Vimeo
Cyberwire Daily
  • Home
  • News
  • Cyber Security
  • Internet of Things
  • Tips and Advice
Cyberwire Daily
Home»News»New “Lies-in-the-Loop” Attack Undermines AI Safety Dialogs
News

New “Lies-in-the-Loop” Attack Undermines AI Safety Dialogs

Team-CWDBy Team-CWDDecember 18, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
Share
Facebook Twitter LinkedIn Pinterest Email


A novel attack technique that undermines a common safety mechanism in agentic AI systems has been detailed by security researchers, showing how human approval prompts can be manipulated to execute malicious code.

The issue, observed by Checkmarx researchers, centers on Human-in-the-Loop (HITL) dialogs, which are designed to ask users for confirmation before an AI agent performs potentially risky actions such as running operating system commands.

The research, published on Tuesday, describes how attackers can forge or manipulate these dialogs so they appear harmless, even though approving them triggers arbitrary code execution.

The technique, dubbed Lies-in-the-Loop (LITL), exploits the trust users place in confirmation prompts, turning a safeguard into an attack vector.

A New Attack Vector

The analysis expands on earlier work by showing that attackers are not limited to hiding malicious commands out of view. They can also prepend benign-looking text, tamper with metadata that summarizes the action being taken and exploit Markdown rendering flaws in user interfaces.

In some cases, injected content can alter how a dialog is displayed, making dangerous commands appear safe or replacing them with innocuous ones.

The problem is particularly acute for privileged AI agents such as code assistants, which often rely heavily on HITL dialogs and lack other defensive layers recommended by OWASP.

HITL prompts are cited by OWASP as mitigations for prompt injection and excessive agency, making their compromise especially concerning.

“Once the HITL dialog itself is compromised, the human safeguard becomes trivially easy to bypass,” the researchers wrote.

The attack can originate from indirect prompt injections that poison the agent’s context long before the dialog is shown.

Read more on AI agent security: AI Agents Need Security Training – Just Like Your Employees

Affected Tools and Mitigation Strategies

The research references demonstrations involving Claude Code and Microsoft Copilot Chat in VS Code.

In Claude Code, attackers were shown to tamper with dialog content and metadata. In Copilot Chat, improper Markdown sanitization allowed injected elements to render in ways that could mislead users after approval.

The disclosure timeline shows that Anthropic acknowledged reports in August 2025 but classified them as informational. Microsoft acknowledged a report in October 2025 and later marked it as completed without a fix, stating the behavior did not meet its criteria for a security vulnerability.

The researchers stress that no single fix can eliminate LITL attacks, but they recommend a defense-in-depth approach, including:

  • Improving user awareness and training

  • Strengthening visual clarity of approval dialogs

  • Validating and sanitizing inputs, including Markdown

  • Using safe OS APIs that separate commands from arguments

  • Applying guardrails and reasonable length limits to dialogs

“Developers adopting a defense-in-depth strategy with multiple protective layers […] can significantly reduce the risks for their users,” Checkmarx wrote.

“At the same time, users can strengthen resilience through greater awareness, attentiveness and a healthy degree of skepticism.”



Source

Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleStorm-0249 Escalates Ransomware Attacks with ClickFix, Fileless PowerShell, and DLL Sideloading
Next Article Four Threat Clusters Using CastleLoader as GrayBravo Expands Its Malware Service Infrastructure
Team-CWD
  • Website

Related Posts

News

HMRC Warns of Over 135,000 Scam Reports

December 18, 2025
News

Four Threat Clusters Using CastleLoader as GrayBravo Expands Its Malware Service Infrastructure

December 18, 2025
News

Storm-0249 Escalates Ransomware Attacks with ClickFix, Fileless PowerShell, and DLL Sideloading

December 18, 2025
Add A Comment
Leave A Reply Cancel Reply

Latest News

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202521 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views

U.S. Treasury Sanctions DPRK IT-Worker Scheme, Exposing $600K Crypto Transfers and $1M+ Profits

September 5, 20256 Views

Ukrainian Ransomware Fugitive Added to Europe’s Most Wanted

September 11, 20255 Views
Stay In Touch
  • Facebook
  • YouTube
  • TikTok
  • WhatsApp
  • Twitter
  • Instagram
Most Popular

North Korean Hackers Turn JSON Services into Covert Malware Delivery Channels

November 24, 202521 Views

macOS Stealer Campaign Uses “Cracked” App Lures to Bypass Apple Securi

September 7, 202517 Views

North Korean Hackers Exploit Threat Intel Platforms For Phishing

September 7, 20256 Views
Our Picks

Can password managers get hacked? Here’s what to know

November 14, 2025

Look out for phony verification pages spreading malware

September 14, 2025

Find your weak spots before attackers do

November 21, 2025

Subscribe to Updates

Get the latest news from cyberwiredaily.com

Facebook X (Twitter) Instagram Pinterest
  • Home
  • Contact
  • Privacy Policy
  • Terms of Use
  • California Consumer Privacy Act (CCPA)
© 2025 All rights reserved.

Type above and press Enter to search. Press Esc to cancel.