
YOUR NEXT AI VISITOR WILL KNOW WHO SENT IT

AXO · Agentic Web · MCP · Gemini · Blended Retrieval
AUTHOR
Slobodan "Sani" Manic

No Hacks

CXL-certified conversion specialist and WordPress Core Contributor helping companies optimise websites for both humans and AI agents.

The agent visiting your website knows the person who sent it.

That is the shift underneath Google's Gemini Deep Research Max, launched on April 21, 2026 as a public preview on the paid Gemini API tier. Deep Research Max itself is a narrow rollout. The pattern it ships is a preview of what the agentic web becomes when the other major vendors follow, which they typically do within a quarter or two on capabilities like this. When a blended-retrieval agent runs, it arrives with private context: the user's financial data, their file stores, their connected professional data streams, all fused into the query before the agent reaches any page.

For web professionals, this is the next chapter of the agentic-web story. The claim that agents are a new primary visitor class has held for months. As of this week, the claim evolves. Agents are a new primary visitor class with private context. The reasoning that decides whether your page answers a query runs on a larger input set than your page. The weight the agent gives your content depends on whether it adds anything the private sources did not already provide. This is the blended-retrieval moment in the agentic web story, and it lands on the supply side of how agents fetch, not on the user-facing product layer.

The old AI-search optimization posture (write content that matches the keyword query) was weakening before this. It weakens further now. The new posture is structural predictability: clean entity relationships, canonical identity, live data, rendering independence. Structure matters to the agent functionally. When the agent arrives with context, the content it picks is the content its model can fuse cleanly with everything else it already has.

Blended retrieval previews the agentic web's next layer

Google's Gemini Deep Research Max, in public preview on the paid API tier from April 21, can pull from four input classes in a single reasoning loop: the public web, file uploads, connected file stores, and arbitrary remote MCP servers. From Google's own announcement, the agent "searches the web, arbitrary remote MCPs, file uploads and connected file stores, or any subset of them."

The two new classes (file stores and remote MCPs) share one property: they are private by default. The agent reads them only with user consent. Once connected, a financial data provider or an enterprise CRM exposes its data to Gemini through the Model Context Protocol, Anthropic's open standard with over 97 million installs as of March 2026. Google's agent retrieves from those private sources with the same reliability with which it reads the open web, inside the same reasoning pass.
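Google has not published a stable request shape for Deep Research Max, so the sketch below is purely illustrative: every field name is an assumption, not a documented Gemini API parameter. It only shows how the four source classes could sit side by side in a single request.

```python
import json

# Hypothetical blended-retrieval request. Every field name here is an
# assumption for illustration; none is a documented Gemini API parameter.
request = {
    "model": "gemini-deep-research-max",  # assumed model identifier
    "query": "Summarise Q1 revenue drivers against market coverage",
    "sources": {
        "web": {"enabled": True},                        # public web
        "file_uploads": ["q1-report.pdf"],               # user uploads
        "file_stores": ["drive://finance-team"],         # connected stores
        "mcp_servers": ["https://crm.example.com/mcp"],  # remote MCP
    },
}

print(json.dumps(request, indent=2))
```

The point of the shape, not the names: all four source classes are peers inside one request, which is what "a single reasoning loop" means in practice.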

This is the structural move everyone watching the agentic web has been waiting for a major vendor to ship: public web and private context, fused by the agent, inside a single query. Gemini is the first.

The pattern is also not here for most operators yet. Deep Research Max is a public preview behind a paid API, not a feature in the consumer Gemini app. Most websites will not be read by a blended-retrieval agent this quarter. What Google announced on April 21 is the direction, not the arrival. Treat it as a leading indicator: if this architecture scales, and major vendors generally copy each other within a quarter or two on capabilities like this, the operator work gets real before the traffic does.

Signal share collapses when the agent has better alternatives

In a blended-retrieval query, every connected source competes for signal share: the open web, the user's file stores, and any private MCP servers. The weight any single source gets is proportional to how cleanly the agent can extract and fuse its signal with everything else the agent is holding.

For public websites, this shifts the competitive terrain in two ways.

First, machine-first websites win more citation share. A page with clean structured data, unambiguous entity relationships, and rendering that does not hide content behind JavaScript is easy for the agent to merge with the user's private context. The fused answer references the machine-first page because that page contributed usable, mergeable material.

Second, poorly structured websites lose signal share they used to get for free. In a web-only era, even a messy page could surface in a citation because there was no better public-web alternative. In the blended-retrieval era, the alternative may be the user's uploaded documents or a connected MCP with cleaner data. The messy content page loses the citation share it used to split with clean sources.

This is a different competition than classical SEO. Classical SEO ranked pages against each other. Blended retrieval ranks pages against the user's own context. You cannot see the competing sources. You can only make sure that when the agent reaches your public page, the page contributes something extractable and unambiguous.
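The proportional-weight idea above can be made concrete with a toy model. Everything here is invented for illustration: the 0-to-1 "extraction score" and the specific numbers are not anything Google has published; they only show why adding clean private sources dilutes a messy page's share.

```python
def signal_share(extraction_scores):
    """Normalise per-source extraction scores into signal shares.

    Toy model: each connected source gets weight proportional to how
    cleanly its signal can be extracted and fused (a made-up 0-1 score).
    """
    total = sum(extraction_scores.values())
    return {src: score / total for src, score in extraction_scores.items()}

# Web-only era: a messy page competes only against other public pages.
web_only = signal_share({"messy_page": 0.3, "other_public_page": 0.3})

# Blended era: the same page also competes with the user's own context.
blended = signal_share({
    "messy_page": 0.3,
    "other_public_page": 0.3,
    "uploaded_docs": 0.9,  # clean private data
    "private_mcp": 0.8,
})

print(web_only["messy_page"])  # half the signal in a web-only query
print(blended["messy_page"])   # a much smaller slice once private sources join
```

The messy page's own score never changed; its share collapsed because cleaner sources entered the denominator. That is the mechanism the section describes.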

Structured Product and Offer schema gets cited more often than unstructured descriptions when the user's private context touches anything related. Canonical identity, clean entity relationships, and rendering independence all become higher-leverage when the agent is fusing signal across sources. The Adobe Q1 2026 AI traffic inversion was the demand-side proof that structured commerce wins in AI search; blended retrieval is the supply-side mechanism driving the same effect into the rest of the web.
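For concreteness, this is the kind of Product and Offer markup the paragraph above refers to, built as a Python dict and serialised to JSON-LD. The `@type` and property names follow schema.org; the product details themselves are invented.

```python
import json

# Minimal schema.org Product + Offer, serialised as JSON-LD.
# Product details are invented; property names follow schema.org.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Standing Desk",
    "sku": "DESK-001",
    "brand": {"@type": "Brand", "name": "ExampleCo"},
    "offers": {
        "@type": "Offer",
        "price": "499.00",
        "priceCurrency": "EUR",
        "availability": "https://schema.org/InStock",
    },
}

# Embed the output in a <script type="application/ld+json"> tag in the page head.
print(json.dumps(product, indent=2))
```

An agent fusing sources does not have to infer the price or stock status from prose; each fact arrives as an unambiguous key-value pair it can merge directly with the user's private context.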

The honest counter-read: some queries route around your website entirely

Not every blended-retrieval query will end up citing a public website. Some queries will be answerable entirely from the user's connected sources. A financial analyst running Deep Research Max over an internal MCP server plus uploaded quarterly reports may never need the public web for that answer. That query's traffic never reaches a public site; the answer is satisfied inside the private-context boundary.

This is a real but bounded subset: most queries still blend public and private sources, because most analytical questions touch both.

Blended retrieval does not mean every website gets less traffic. It means the agent is choosier about what it uses. The bar rises for the sources the agent picks. Deep Research Max is a preview of what the agentic web is about to demand. Machine-first websites will pick up share when that scale arrives. Unstructured content will continue to lose it. Google showed us the pattern on April 21, but the scale that follows is where the real work for web professionals starts, and there is time to do that work before the traffic catches up.

QUESTIONS ANSWERED

What is blended retrieval in the context of AI agents?

Blended retrieval is an agent-architecture pattern where a single query pulls from multiple source classes simultaneously: the public web, the user's file uploads, connected file stores (like Google Drive), and arbitrary remote Model Context Protocol servers exposing private enterprise data. The agent reasons across all of them in one loop rather than querying each independently. Google's Gemini Deep Research Max is the first major production deployment of this pattern, launched on April 21, 2026.

How does blended retrieval change AI search optimization?

Blended retrieval means the agent arrives at a public website with context from the user's private data sources. The weight the agent gives your page depends on whether your content contributes something mergeable with the private signal already in play. Machine-first websites with clean structured data, unambiguous entity relationships, and rendering independence gain leverage. Poorly structured websites lose signal share they used to get by default in a web-only retrieval system.

What is MCP, and why does it matter for the agentic web?

Model Context Protocol (MCP) is Anthropic's open standard for connecting AI agents to external data sources, published in late 2024 and with over 97 million installs as of March 2026. MCP matters for the agentic web because it standardizes how agents access private and proprietary data alongside public-web retrieval. Google, Anthropic, and a growing list of enterprise data providers have adopted it, which is what makes blended retrieval possible at scale.
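MCP messages are JSON-RPC 2.0. The sketch below shows the shape of a `tools/call` request an agent sends to a connected server; the method name comes from the MCP specification, while the tool (`lookup_account`) and its arguments are invented for illustration.

```python
import json

# An MCP tool invocation is a JSON-RPC 2.0 request. The method name
# "tools/call" comes from the MCP specification; the tool itself
# ("lookup_account") and its arguments are invented for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_account",  # hypothetical CRM-side tool
        "arguments": {"account_id": "ACME-42"},
    },
}

print(json.dumps(request))
```

Because every MCP server speaks this same envelope, a blended-retrieval agent can treat a CRM, a file store, and a financial data feed as interchangeable private sources, which is what makes the pattern scale.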

Does every agent query now cite a public website?

No. Some blended-retrieval queries will be answerable entirely from the user's connected MCP servers and file stores without reaching the public web. Most analytical queries still touch both public and private sources, because most decision-relevant data still lives on public websites. The subset of fully private queries is bounded by how much of the user's topic lives inside their own connected sources.
