All Articles
Published 5 min read

THE WEBMCP TOOLS YOU EXPOSE TO AGENTS CAN BE USED TO HIJACK THEM

WebMCPAI AgentsMachine-First ArchitectureChromeAXO
AUTHOR
Slobodan "Sani" Manic

SLOBODAN "SANI" MANIC

No Hacks

CXL-certified conversion specialist and WordPress Core Contributor helping companies optimise websites for both humans and AI agents.

Add WebMCP to your website and you hand visiting AI agents a set of named tools to call. Those same tools can be used to turn the agents against the people who sent them. Chrome's developer site now carries the security guidance for WebMCP, and much of it is written for the websites exposing the tools rather than the companies building the agents. Make your website agent-ready with WebMCP and you have also opened an attack surface, and closing it is your job, not the agent's.

For two years the agent-readiness conversation has been about access: can an agent reach your content, read your page, finish your checkout. WebMCP is the version where you stop hoping an agent figures your website out from the markup and start handing it named tools to call. That is the more useful protocol, and it is the direction the agentic web's protocol layer is moving. It is also where being legible to an agent and being safe for an agent stop being the same property.

GET WEEKLY WEB STRATEGY TIPS FOR THE AI AGE

Practical strategies for making your website work for AI agents and the humans using it. Podcast episodes, articles, videos. Plus exclusive tools, free for subscribers. No spam.

Chrome Named Two Ways Agents Get Hijacked Through WebMCP

Chrome's agent-security guidance describes two attack vectors, and both arrive through the tools a website exposes. The first is the malicious manifest. In Chrome's words, "Websites may have tool definitions with hidden instructions, in tool names, parameters, or descriptions, designed to hijack the agent." A tool's description is text the agent reads to decide how to use the tool, so a description can carry an instruction the agent was never meant to follow.

The second vector is the one most websites will actually hit, and it needs no malicious website at all. Chrome calls it a contaminated output: "Real-time tool responses from otherwise trustworthy sites might include malicious instructions as part of third-party data, such as user comments." A tool on your own website that returns your product reviews, your comment threads, your forum posts, or your support replies is returning text other people wrote. If one of those people planted an instruction inside a review, your legitimate tool has handed it to the agent as if it came from you. The payload is your own user-generated content, and you invited it in.

This works because of something that is not a bug and will not be patched. "LLMs treat all text, instructions and user data, as a single sequence of tokens," the guidance says, so the model cannot reliably separate the part you meant as data from the part an attacker meant as a command. That is why Chrome says "the probabilistic nature of LLMs makes it impossible to guarantee safety inside the model itself." This is the same prompt-injection problem that has no clean fix inside the model, now wearing a protocol. WebMCP gives that attack a clean, structured delivery route through the tools you published on purpose.

Making A Website Agent-Ready Now Includes Making It Agent-Safe

Chrome's guidance puts the obligation on the website, not only on the agent. Chrome's tool-security document opens with a line aimed straight at whoever exposes the tools: "Only expose your tools to origins that you trust. This is particularly important when tools manage user data or otherwise impact the user." That line is written for whoever ships the tool. That means you.

The defenses are concrete, and they are annotations you attach to the tools you ship. untrustedContentHint "explicitly labels the payload as untrusted, to help protect your site's integrity while providing a signal to the agent that this data requires heightened scrutiny," and Chrome says when to use it: "If a tool returns user-generated content (UGC) or externally sourced data, consider adding the untrustedContentHint to the tool." readOnlyHint marks a tool that does not change state, which "allows the agent to make better decisions about when to ask for user confirmations." exposedTo restricts a tool to an array of origins you trust, written into the registration itself:

document.modelContext.registerTool({...}, {
  exposedTo: ['https://trusted.com']
});

Chrome caps the character budgets too, a tool description at 500 characters and a single tool output at roughly 1,500, and adds a requestUserInteraction() path for confirming an action before it fires. Take the obvious example, a tool that surfaces product reviews to a shopping agent. Securing it is not exotic work: mark its output with untrustedContentHint, set readOnlyHint because it reads rather than buys, and limit exposedTo to the origins you actually serve. None of that is the agent's job. It is the tool author's job, which on most teams is the web, CRO, or marketing people adding WebMCP to look current, not the security people who read threat models. That gap is where this goes wrong. Marking which of your content is data and not commands is now part of shipping a tool, the way sanitizing input became part of shipping a form.

Adopt WebMCP, But Threat-Model Every Tool First

Handing an agent explicit, callable tools beats making it guess your website from the DOM, and the capability is worth having. None of this is a reason to avoid WebMCP. The point is narrower and more boring than "new protocol, new danger": the capability arrives with a bill attached, and the bill is yours.

So the line is simple. Do not expose a tool to an agent that you have not threat-modeled the way you would threat-model a public API endpoint. For every tool you are about to register, answer one question before it ships: what untrusted content can this return, and have you marked it. If you cannot answer that, the tool is not ready, however agent-ready the rest of your website looks.

WebMCP is early. It sits in a Chrome origin trial, the specification is still moving, and most websites have not exposed a single tool. That is the window to decide agent-safe is part of agent-ready, before the first tool you ship turns out to be the one that hands an agent your reviews and whatever someone hid inside them.

QUESTIONS ANSWERED

Can WebMCP be used to hijack an AI agent?

Yes. Chrome's security guidance describes two routes: a tool manifest with hidden instructions in its name or description, and a contaminated output where a tool returns third-party content such as user comments with an instruction embedded. The agent can act on it because the model reads instructions and data as one stream.

Who is responsible for securing WebMCP tools?

The website exposing the tools, not only the agent. Chrome tells tool authors to expose tools only to trusted origins, mark untrusted outputs with untrustedContentHint, set readOnlyHint on read-only tools, and confirm state-changing actions with the user. These annotations are attached when the tool is registered.

What is untrustedContentHint in WebMCP?

untrustedContentHint is a WebMCP annotation that labels a tool's output as untrusted, signaling the agent to treat that content with heightened scrutiny instead of as instructions. Chrome recommends adding it whenever a tool returns user-generated content or externally sourced data, such as product reviews or comments.

NEW TO NO HACKS?

Practical strategies for making your website work for AI agents and the humans using it. Read by SEOs, developers, and AI researchers. Exclusive tools, free for subscribers.