
How to Win in Agentic Search: The Complete Agentic Search Optimization Checklist

Author: Limor Barenholtz, Director of SEO & AI Search at Similarweb

Reviewer: Roi Kaufman 21 Min. June 1, 2026

AI agent traffic grew 7,851% year-over-year in 2025, according to the HUMAN Security 2026 State of AI Traffic and Cyberthreat Benchmark Report, which analyzed more than one quadrillion interactions. As of May 2026, fewer than 5% of websites are optimized to receive it, according to Digidop’s May 2026 domain audit. That is not a content gap. It is a technical one.

Most brands now investing in agentic search aren’t starting from scratch. They’ve done the AI search optimization work: they show up in AI-generated answers, they get cited in ChatGPT and Perplexity. That’s visibility. But a citation is not a handoff. The moment an agent follows that link and actually tries to do something, find pricing, book a demo, or run a comparison, it hits a wall that GEO and AEO were never designed to address. That wall is the domain of agentic search optimization.

What happens next lives at the technical layer.

The agents are already evaluating your site. On most sites, they fail to complete the task and switch to a competitor whose HTML provides them with what they need.

In this article, I’ll walk through the CARE framework: the four sequential layers every website must clear to win in agentic search. Crawlability, Accessibility, Readability, Executability. I’ll also cover how to measure agentic search performance and give you a 21-point audit checklist you can run today.

What agentic search actually is (and why winning it requires more than GEO)

Agentic search is the paradigm in which autonomous AI agents conduct searches, evaluate websites, and complete tasks on behalf of users, without requiring the user to browse manually. The user provides intent. The agent does the work: planning a research process, running it across the live web, making judgment calls, and (increasingly) acting on the result.

Before I get into the framework, it is worth being precise about what agentic search optimization actually requires, because in my experience, most practitioners treat it as an extension of generative engine optimization and solve only two-thirds of the problem. GEO earns you a citation in the answer. Agentic search optimization determines what happens when the agent follows that citation.

The table below maps all four optimization disciplines in one place:

Dimension	SEO	AEO	GEO	Agentic Search Optimization
Primary audience	Search engine crawlers	Answer engines (AI Overviews, PAA)	Generative AI (ChatGPT, Perplexity, Gemini)	Autonomous AI agents
Optimization target	Rankings in SERPs	Featured in AI-generated answers	Cited in LLM-generated responses	Usable by agents for task completion
Success metric	Organic traffic, rank position	Zero-click visibility, answer box presence	Citation frequency, brand mention rate	Agent task completion, session depth
Primary signals	Backlinks, crawlability, authority	Structured answers, FAQ schema, BLUF content	Semantic coverage, authority signals, FAN-out coverage	robots.txt permissions, llms.txt, WebMCP tool contracts, Schema.org Actions
Failure mode	Algorithm penalty, deindexing	Extractable answer exists elsewhere	Competitor cited instead	Agent cannot parse or complete a task on your site

One clarification worth making explicit: all agentic search is AI search, but not all AI search is agentic.

AI search is the broader category that includes any AI-shaped discovery, from Google AI Overviews to ChatGPT responses. Agentic search is the subset where the AI researches, decides, and may act. The table above describes where agentic search optimization fits within the discipline. It does not imply that these are four equivalent peer categories.

The practical implication: a site can be fully SEO-optimized, AEO-structured, and GEO-cited, yet still fail every agentic search check. A user who delegates research to an AI agent does not get a citation if the agent cannot complete the task. They get a recommendation for the competitor whose site the agent could actually use.

Understanding what AI agents actually do when they visit a site clarifies why the optimization requirements differ. Agents read the accessibility tree: the same semantic representation a screen reader uses. Headings, links, buttons, forms, ARIA roles. They do not see your hero image. They do not hover over your navigation.

Google’s official guidance on building agent-friendly websites, published by Google engineers (April 2026), confirms that agents operate across three input modalities: accessibility tree, raw HTML, and screenshots, with screenshots as the fallback rather than the default because they are computationally expensive.

The practical implication, as the same guidance states, “Everything we suggest to make a site agent-ready also makes sites better for humans.”

As of May 2026, even the top 100 websites average only a 55% agent readiness score, and 99% fail basic content negotiation. The bar is low. The window to clear it before competitors do is still open.

The CARE framework: four layers every site must clear to win in agentic search

The CARE framework is a four-layer technical checklist for agentic search optimization: Crawlability, Accessibility, Readability, and Executability. Each layer addresses a distinct prerequisite for making a website usable by autonomous AI agents.

I built the CARE framework to organize the technical layer of agentic search optimization into four sequential prerequisites. Each layer must be cleared before the next one adds value. A Layer 3 fix applied to a site that has not cleared Layer 0 achieves nothing, and I have seen enough audits to confirm that teams consistently skip this sequencing. Work through them in order.

CARE covers what your website needs to do to be usable by agents. It is the technical foundation for the first, second, and fourth layers of the FACT framework (Find, Analyze, Corroborate, Trigger): the complete agentic search optimization model.

The one FACT layer CARE does not address is Corroborate: whether independent third-party sources (review platforms, comparison content, community discussions) support your positioning. That layer is a brand and content strategy problem, not a technical one, and it is covered in detail in the agentic search guide.

Worth noting: Google’s Chrome Lighthouse agentic browsing audit, published May 2026, independently organizes its checks across the same four technical areas: accessibility (llms.txt), executability (three WebMCP checks), readability (accessibility for agents), and layout stability.

The CARE layer prioritization and Google’s own audit structure independently arrive at the same order.

C: Crawlability – Can AI agents reach your site at all?
A: Accessibility – Can agents find what matters on your site?
R: Readability – Can agents parse and extract your content?
E: Executability – Can agents complete a task on your site?

Layer 0: Crawlability

Crawlability is the prerequisite for everything else, and it is also the layer where I find the most preventable mistakes. Most crawlability failures in 2026 fall into one of two categories, and they require separate fixes.

Category 1: An explicit DISALLOW rule in your robots.txt

If GPTBot, PerplexityBot, ClaudeBot, or Google-Extended appear under a Disallow directive in your robots.txt (whether targeted or caught by a wildcard User-agent: * block), those bots will not crawl your site. GPTBot and OAI-SearchBot comply with robots.txt directives per OpenAI’s official documentation.

The fix is to remove or override the DISALLOW rule for these specific agents.

Note: if a bot is not named in your robots.txt and no wildcard User-agent: * rule blocks the path, it is permitted by default. You do not need to add an explicit Allow: / entry. You only need to remove an existing block.
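One way to run this check is with the robots.txt parser in Python’s standard library. The sketch below is illustrative only: the two sample robots.txt files, the example.com URL, and the helper name blocked_ai_bots are assumptions, and the parser’s matching rules may differ in edge cases from how each vendor’s crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The four bot identities named in this section.
AI_BOTS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def blocked_ai_bots(robots_txt: str, url: str) -> list[str]:
    """Return the AI bots that this robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical file: GPTBot is blocked by name, everyone else only from /admin/.
TARGETED_BLOCK = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Hypothetical file: a blanket wildcard rule that also catches every AI bot.
WILDCARD_BLOCK = """\
User-agent: *
Disallow: /
"""

print(blocked_ai_bots(TARGETED_BLOCK, "https://example.com/pricing"))
print(blocked_ai_bots(WILDCARD_BLOCK, "https://example.com/pricing"))
```

Running it against your own file only requires swapping the sample text for the contents of your live robots.txt and the URL for a page you care about.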

Category 2: Cloudflare network-level blocking

Network-level blocking operates completely independently of robots.txt.

In July 2024, Cloudflare introduced a one-click AI bot blocking feature available to all customers. More than 1 million customers have enabled it, according to Cloudflare’s own update published in January 2026. Cloudflare serves 22.4% of all websites globally as of May 2026 (W3Techs, May 2026).

An agent can read Allow: / in your robots.txt and still be hard-blocked at the CDN edge before the request reaches your server. These are two separate control layers requiring two separate checks. As Similarweb’s technical GEO guide covers in detail, clearing both is the baseline for Layer 0 Crawlability.

One important nuance on user-triggered agents: OpenAI’s ChatGPT-User, the agent that executes real-time browsing when a user asks ChatGPT to visit a page, is no longer subject to robots.txt compliance. OpenAI’s December 2025 documentation update removed robots.txt compliance language specifically for ChatGPT-User, classifying it as a technical extension of a human user rather than an autonomous crawler.

GPTBot and OAI-SearchBot still respect robots.txt. ChatGPT-User does not.

This means user-initiated agentic sessions are already arriving at your site regardless of your crawlability configuration. This makes Layers 2 and 3 (readability and executability) meaningful for every site, not just those that have cleared Layer 0.

The bot distribution data tells you exactly where to focus.

OpenAI’s family of bots accounts for approximately 69% of all observed AI-driven traffic by volume. Anthropic identities account for roughly 11%. Meta-ExternalAgent contributes an additional 16% (HUMAN Security, 2026).

Bot family	Share of AI-driven traffic
OpenAI (GPTBot, OAI-SearchBot, ChatGPT-User)	~69%
Meta-ExternalAgent	~16%
Anthropic (ClaudeBot)	~11%
Other	~4%

An April 2026 robots.txt analysis found GPTBot is the most-blocked AI crawler of any type, appearing in more DISALLOW rules than any other AI bot. Crawlability policy decisions about a small number of bot identities have outsized effects on your total agentic search exposure.

Check	What to verify	Fix if failing
robots.txt: GPTBot	Search for GPTBot in your robots.txt	Remove any Disallow rule. No explicit Allow needed if not named
robots.txt: PerplexityBot	Same	Same
robots.txt: ClaudeBot	Same	Same
robots.txt: Google-Extended	Same	Same (check separately if site opted out of AI training)
robots.txt: wildcard catch	Check if User-agent: * carries a blanket Disallow that covers AI bots	Add a dedicated User-agent group with Allow: / for each bot. The most specific matching group overrides the wildcard, regardless of order
Cloudflare network blocking	Cloudflare dashboard: Security > Bots > Bot Fight Mode and “AI Crawl Control”	Disable AI bot blocking or set to Allow for specific bot identities
Server-side rendering	Key content exists in HTML source, not only in the JS bundle	Ensure pricing, specs, and feature content render server-side

One practical note: many sites unintentionally block GPTBot via WAF rules or rate limits, returning 429 Too Many Requests responses even when robots.txt permits access. Verify against OpenAI’s published IP ranges and whitelist them at the firewall level if needed.
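A rough way to spot edge-level blocking is to request a page while presenting a bot-style User-Agent and look at the status code. The sketch below is a first-pass probe under stated assumptions: the function names are hypothetical, the User-Agent value you pass is a simplified stand-in rather than any vendor’s exact string, and a CDN that verifies source IP ranges may treat your test request differently from the real crawler.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Translate an HTTP status code into a crawlability verdict."""
    if 200 <= status < 300:
        return "reachable"
    if status in (401, 403):
        return "blocked"
    if status == 429:
        return "rate-limited"
    return "inconclusive"


def probe(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Fetch url with the given User-Agent and classify the response."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions that carry the status code.
        return classify_status(err.code)
    except urllib.error.URLError:
        return "network-error"


# Example call (not executed here because it needs network access):
# probe("https://example.com/pricing", "GPTBot")
```

If the same URL is reachable with a browser User-Agent but blocked or rate-limited with a bot-style one, the rule is sitting at the WAF or CDN rather than in robots.txt.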

Layer 1: Accessibility

Once agents can crawl your site, the next question is whether they can navigate it efficiently. A human gets a nav bar, a search function, and a visual hierarchy. An agent gets your HTML. For a 200-page enterprise site, that is an unreasonably large surface area to parse when the agent is looking for one specific capability.

Accessibility is about creating shortcuts that tell agents exactly where to look.

Agent accessibility shortcut 1: llms.txt

llms.txt is a plain-text file at your site root that lists 5 to 20 URLs you consider your most authoritative and agent-relevant pages. Google’s Chrome Lighthouse agentic browsing documentation (May 2026) states the rationale directly: without this file, agents may spend more time crawling the site to understand its high-level structure and primary content.

It is a navigation shortcut that reduces the token cost of agent orientation on your site. Similarweb’s llms.txt guide covers implementation in detail.

Three things to get right before you implement:

Google has confirmed that llms.txt has no effect on Google Search or AI Overviews rankings.
The Lighthouse audit marks a missing llms.txt as Not Applicable rather than a failure, meaning absence is not penalized, but a broken file that returns a server error is flagged. Having a misconfigured llms.txt is actively worse than not having one.
The strongest documented use cases are documentation-heavy and API-driven sites where AI coding tools (Cursor, GitHub Copilot) retrieve it in real time. For most B2B marketing sites, it is a low-effort navigation improvement worth implementing rather than a traffic lever.
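A minimal sketch of generating and checking the file, assuming the Markdown layout proposed at llmstxt.org as I understand it (a title, a short summary, then a list of links). The function names, the "Key pages" heading, and the example pages are hypothetical. The 5-to-20 limit and the status mapping follow the guidance in this section.

```python
def build_llms_txt(site_name: str, summary: str,
                   pages: list[tuple[str, str, str]]) -> str:
    """Render an llms.txt body from (title, url, description) tuples."""
    if not 5 <= len(pages) <= 20:
        raise ValueError("expected between 5 and 20 pages")
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Key pages", ""]
    lines += [f"- [{title}]({url}): {desc}" for title, url, desc in pages]
    return "\n".join(lines) + "\n"


def llms_txt_audit_status(http_status: int) -> str:
    """Map the HTTP status of /llms.txt to the audit outcome described above."""
    if http_status == 200:
        return "pass"
    if http_status == 404:
        return "not applicable"  # a missing file is not penalized
    if http_status >= 500:
        return "fail"  # a broken file is flagged
    return "review"


# Hypothetical pages for an imaginary site.
EXAMPLE_PAGES = [
    ("Pricing", "https://example.com/pricing", "Plans and prices"),
    ("Product specs", "https://example.com/specs", "Technical specifications"),
    ("API docs", "https://example.com/docs/api", "Endpoints and authentication"),
    ("Integrations", "https://example.com/integrations", "Supported tools"),
    ("FAQ", "https://example.com/faq", "Common questions"),
]

print(build_llms_txt("Example Co", "Analytics for mid-market teams.", EXAMPLE_PAGES))
```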

Agent accessibility shortcut 2: XML sitemap coverage

A current, accurate sitemap remains an agent’s fallback navigation tool when llms.txt isn’t present or doesn’t cover a page. Pages missing from your sitemap or excluded via noindex are typically invisible to agents. Review coverage, particularly for high-value product and pricing pages.

Agent accessibility shortcut 3: ai-agent.json and agent readiness headers

The ai-agent.json file is a machine-readable manifest published at /.well-known/ai-agent.json that declares your site’s agent-relevant capabilities: what content you publish, what tools you expose, and how agents should interact with your site.

It is part of a broader set of emerging agent readiness signals that also includes content negotiation headers: when a server responds to an Accept: text/markdown request with clean Markdown instead of HTML, it reduces agent token consumption by up to 80%. As of May 2026, only 3.9% of sites support this, and advanced agent capability declarations remain in the single digits globally, according to Cloudflare’s analysis.

But these are exactly the signals Cloudflare’s isitagentready.com scanner and AgentGrade check for, and early adoption puts you ahead of a standard that is actively being formalized. The implementation cost is low: publishing the manifest as a single file and configuring content negotiation are server configuration changes.
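The content negotiation decision itself is small. Here is a minimal sketch of the server-side logic, assuming the raw Accept header is available as a string. The function names are hypothetical and the parsing is deliberately simplified: it reads only the q parameter and falls back to HTML whenever Markdown is not explicitly requested.

```python
def parse_accept(header: str) -> dict[str, float]:
    """Parse an Accept header into {media_type: q_value}."""
    preferences: dict[str, float] = {}
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        preferences[media_type] = quality
    return preferences


def choose_representation(accept_header: str) -> str:
    """Return the content type to serve: Markdown only when it is asked for."""
    prefs = parse_accept(accept_header)
    markdown_q = prefs.get("text/markdown", 0.0)
    html_q = max(prefs.get("text/html", 0.0),
                 prefs.get("text/*", 0.0),
                 prefs.get("*/*", 0.0))
    if markdown_q > 0 and markdown_q >= html_q:
        return "text/markdown"
    return "text/html"


print(choose_representation("text/markdown, text/html;q=0.9"))
print(choose_representation("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Defaulting to HTML keeps browsers and older clients unaffected, which is the main design constraint when adding this to an existing site.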

Layer 2: Readability

Readability is where most sites lose the agent, even after passing Layers 0 and 1. Crawlability and accessibility get the agent to the right page. Readability determines whether the agent can extract what it came for.

Semantic HTML and ARIA roles

Google’s web.dev guidance (April 2026) specifies the exact HTML practices that improve agent performance: use semantic elements such as <button> and <a> rather than styled <div> elements, keep layouts stable across pages, link <label> tags to inputs via the for attribute, and set cursor: pointer on clickable elements.

The accessibility tree is your agent’s primary navigation interface. If a screen reader struggles with your page, an agent will too. Both parse the same underlying structure.
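Two of these practices can be checked mechanically. The sketch below is a small linter built on Python’s standard HTML parser. It is an assumption-laden heuristic rather than a full accessibility audit: the class name and sample markup are hypothetical, it treats any div or span with an onclick attribute as a non-semantic control, and it does not recognize inputs that are labeled by being nested inside a label element.

```python
from html.parser import HTMLParser

# Input types that do not need a separate label element.
UNLABELED_OK = {"hidden", "submit", "button", "reset"}


class AgentMarkupLinter(HTMLParser):
    """Collect clickable non-semantic elements and form fields without labels."""

    def __init__(self) -> None:
        super().__init__()
        self.clickable_non_semantic: list[str] = []
        self.field_ids: list = []
        self.label_targets: set = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("div", "span") and "onclick" in attr:
            self.clickable_non_semantic.append(tag)
        elif tag == "label" and attr.get("for"):
            self.label_targets.add(attr["for"])
        elif tag in ("input", "select", "textarea"):
            if attr.get("type") not in UNLABELED_OK:
                self.field_ids.append(attr.get("id"))

    def unlabeled_fields(self) -> list:
        """Fields with no id, or whose id no label points at via for."""
        return [field_id for field_id in self.field_ids
                if field_id is None or field_id not in self.label_targets]


# Hypothetical page fragment.
SAMPLE_HTML = """
<div onclick="buy()">Buy</div>
<button type="submit">Start trial</button>
<label for="email">Email</label><input id="email" type="email">
<input id="phone" type="tel" placeholder="Phone">
"""

linter = AgentMarkupLinter()
linter.feed(SAMPLE_HTML)
print(linter.clickable_non_semantic)
print(linter.unlabeled_fields())
```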

Schema.org structured data

Structured markup provides explicit machine-readable metadata that agents can extract without inferring it from prose.

Google’s Search team confirmed at the April 2025 SMX event that structured data gives a direct advantage for AI search results. Microsoft’s Bing team has confirmed that schema markup helps its LLMs understand content for Copilot, consistent with Bing’s structured data guidelines.

The most relevant schema types for agentic search performance in 2026:

Schema type	What it signals to agents	Priority	Evidence basis
Product	Price, availability, specifications, reviews	High for e-commerce and SaaS	Google and Microsoft confirmed
HowTo	Step-by-step processes with outcomes	High for instructional content	Google confirmed
Organization	Contact info, service area, legitimacy signals	Essential for all sites	Google confirmed
Action	What actions your site supports: BookAction, BuyAction, ReserveAction	Critical for transactional sites	Emerging standard
Article	Authorship, date, headline for editorial content	High for blog and content sites	Google confirmed

Note: FAQPage structured data no longer generates rich results in Google Search as of May 7, 2026, per Google Search Central documentation. It remains valid markup for non-Google-agent consumption, but should no longer be cited as a rich-results tactic.

Content ungated and parseable

Agents cannot read PDFs, images of text, or content locked behind a lead-generation form. If your technical specifications, integration documentation, or pricing structure are in a PDF or require form submission to access, agents skip you in favor of competitors who surface this information as HTML.

Gating evaluation content in 2026 does not protect your pipeline. It removes you from it.

BLUF paragraph structure

Bottom Line Up Front is the writing pattern that makes each section independently extractable. The first 30 to 60 words of each H2 should fully answer the section’s question without requiring any surrounding context.

This is the same structure that earns GEO citations, which is why readability improvements compound in both agentic search performance and AI search visibility.

Machine-readable pricing

Agents assigned a procurement task need clear, structured pricing data. A pricing page that requires a human to scroll, compare, and mentally compute the right plan fails this test.

Structured pricing markup and plaintext tables with explicitly labeled plan tiers, features, and costs are the baseline.
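As an illustration of structured pricing markup, the sketch below builds Schema.org Product JSON-LD with one Offer per plan tier. The plan names, prices, and the helper name pricing_jsonld are invented for the example, and modeling each tier as an Offer is one reasonable choice, not the only valid one. The output would typically be embedded in a script tag of type application/ld+json on the pricing page.

```python
import json


def pricing_jsonld(product_name: str, currency: str, tiers: list[dict]) -> str:
    """Build Product JSON-LD with one Offer per named plan tier."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
        "offers": [
            {
                "@type": "Offer",
                "name": tier["name"],
                "price": f'{tier["monthly_price"]:.2f}',
                "priceCurrency": currency,
                "description": tier["features"],
            }
            for tier in tiers
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical plan tiers.
EXAMPLE_TIERS = [
    {"name": "Starter", "monthly_price": 49, "features": "1 seat, core reports"},
    {"name": "Team", "monthly_price": 199, "features": "10 seats, API access"},
]

print(pricing_jsonld("Example Analytics", "USD", EXAMPLE_TIERS))
```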

Layer 3: Executability

A May 2026 audit of 10 mid-market e-commerce sites tested with a shopping agent found the same three failure patterns repeating across sites: cookie banners blocking the viewport before content could be read, add-to-cart buttons requiring a login the agent could not complete, and inventory data in HTML that did not match actual stock levels. In each case, the agent recommended the brand, the user confirmed the task, and the transaction failed at execution. Layer 3 is where that failure happens.

Executability is where the agentic search paradigm shift completes. Layers 0 through 2 get an agent into your site and enable it to understand it. Layer 3 determines whether the agent can act on behalf of the user who sent it.

WebMCP tool contracts

WebMCP (Web Model Context Protocol) is a proposed open web standard, jointly developed by Google and Microsoft, that lets websites expose structured tools to AI agents via navigator.modelContext.

Rather than an agent guessing how to use your site by analyzing screenshots or parsing the DOM, a WebMCP-enabled site declares its capabilities explicitly: what actions are available, what parameters they accept, and what they return. Google published official WebMCP documentation on May 18, 2026, and opened a public origin trial in Chrome 149 at Google I/O 2026. The stated design goals are “higher accuracy for agentic task completion, lower hallucination rates through explicit JSON Schemas.” Similarweb’s WebMCP guide covers the full SEO and GEO implications.

Two practical notes on current limitations: WebMCP requires an active browser tab and does not yet support headless-agent execution. Implementation on complex interfaces requires refactoring existing JavaScript. It is entering a production trial, not yet a stable standard.

Sites that implement and enroll in the Chrome 149 origin trial now will have real task-completion data before competitors understand the question is being asked.

Structured data and WebMCP are related but not the same thing: Action schema tells agents what your page claims to support. WebMCP gives them the callable tool to actually do it.
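To make the structured data half of that distinction concrete, here is a sketch of Action markup expressed as JSON-LD, using Schema.org’s potentialAction and EntryPoint vocabulary. The organization, URLs, action name, and the helper name action_jsonld are hypothetical, and attaching the action to an Organization is an assumption. On a product page, the Product entity may be the better host.

```python
import json


def action_jsonld(org_name: str, org_url: str, action_type: str,
                  action_name: str, target_url: str) -> str:
    """Declare one supported action and the URL where it starts."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": org_name,
        "url": org_url,
        "potentialAction": {
            "@type": action_type,
            "name": action_name,
            "target": {
                "@type": "EntryPoint",
                "urlTemplate": target_url,
                "httpMethod": "GET",
            },
        },
    }
    return json.dumps(data, indent=2)


print(action_jsonld(
    "Example Analytics",
    "https://example.com",
    "ReserveAction",
    "Book a demo",
    "https://example.com/demo",
))
```

This markup only declares the capability. The agent still has to operate the page at the target URL, which is the part WebMCP is meant to cover.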

Clear CTA labeling in HTML

Even without full WebMCP implementation, primary CTAs need meaningful HTML semantics. A button labeled “Start” with no ARIA context tells an agent nothing about what starting means. A button with aria-label="Start your free trial of [Product]" and type="submit" gives an agent enough to work with.

This is a one-sprint fix with measurable impact on agent task completion.

OAuth 2.0 support (RFC 9728)

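For agents acting on behalf of users who need authenticated access, RFC 9728 (OAuth 2.0 Protected Resource Metadata) defines a standard way for a protected resource to advertise which authorization servers guard it, so an agent can discover where to start the OAuth flow and request permission to act on an account. Cloudflare Access announced full support for this flow at Agents Week 2026.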

Implementing it is primarily a backend authentication concern, but it is the mechanism that enables agents to handle protected resources on a user’s behalf without breaking the session.

x402 and agentic payment protocols

For transactional sites, the x402 protocol revives the HTTP 402 Payment Required status code to enable agent-native payments: an agent requests a resource, the server responds with a 402 and a machine-readable payment specification, the agent pays and retries. Stripe’s Agentic Commerce Suite and the Universal Commerce Protocol operate on similar principles.

This is essential for e-commerce and marketplace sites targeting the agentic commerce channel. It is less urgent for most B2B SaaS sites in 2026, but worth understanding before the infrastructure reaches you.
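The request, 402, pay, retry loop described above can be sketched as a toy in-memory simulation. Everything here is hypothetical: the field names, the token format, the price, and the serve and agent_fetch functions are invented to show the control flow and do not reproduce the real x402 wire format or any payment provider’s API.

```python
from dataclasses import dataclass
from typing import Optional

PRICE_CENTS = 25  # hypothetical price of the resource


@dataclass
class Response:
    status: int
    body: dict


def serve(path: str, payment_token: Optional[str] = None) -> Response:
    """Toy server: demand payment first, then release the resource."""
    if payment_token is None:
        # Machine-readable payment specification returned with the 402.
        return Response(402, {"amount_cents": PRICE_CENTS, "currency": "USD"})
    if payment_token == f"paid:{PRICE_CENTS}":
        return Response(200, {"resource": path})
    return Response(402, {"error": "payment not accepted"})


def agent_fetch(path: str, budget_cents: int) -> Response:
    """Toy agent: request, read the 402 terms, pay if within budget, retry."""
    first = serve(path)
    if first.status != 402:
        return first
    amount = first.body["amount_cents"]
    if amount > budget_cents:
        return first  # over budget: surface the 402 to the caller
    token = f"paid:{amount}"  # stand-in for a real payment proof
    return serve(path, payment_token=token)


print(agent_fetch("/report", budget_cents=100).status)
print(agent_fetch("/report", budget_cents=10).status)
```

The budget check is the part worth noticing: an agent acting for a user needs an explicit spending limit before it retries with payment.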

The business case: winning in agentic search is a revenue question

Winning in agentic search is not a technical nicety. It is access to a channel that is already outperforming every other traffic source on conversion, and the data to prove it is Similarweb’s own.

As of January 2026, ChatGPT referral traffic converts ecommerce visitors at 7.1%, higher than direct traffic (6.7%) and nearly double organic search (4.1%), according to Similarweb’s 2026 Holiday Planning report, which analyzed conversion rates across US ecommerce sites.

The reason is intent compression.

By the time a user clicks through from an AI tool, discovery, comparison, and validation have already occurred within the conversation. The referral is not the start of the journey. It is the final step before purchase.

That conversion advantage reflects a structural shift in how consumers research. According to the same Similarweb report, around one in five US consumers used AI tools for product research during the 2025 holiday season, already a third of the scale of search engine usage for gifting research, and growing fast.

When asked how they used AI, the top responses were comparing products (59.5%), finding the best price (57.9%), and getting gift ideas (49.2%). These are not passive discovery behaviors. They are high-intent evaluation tasks that end in purchase decisions.

Similarweb’s data from January 2026 shows AI tools are rated most useful at the product discovery stage by 35% of consumers, compared to 13.6% for traditional search. At the evaluation stage, where consumers narrow choices and assess value, AI leads search 32.9% to 15%. Even at the final stage of finding where to buy and the best price, AI tools (24.3%) narrowly lead search engines (22.1%).

The funnel is being compressed from both ends simultaneously.

Looking ahead, 37.1% of US consumers expect to use AI more for holiday shopping in 2026, versus only 8.7% who expect to use it less, according to Similarweb data. That trajectory means the channel is still in early adoption. The brands whose sites are agent-ready now are capturing disproportionate share before the majority of their competitors have asked the question.

The holiday season data frames the scale of what is already at stake. AI and agents influenced 20% of global online orders during the 2025 holiday season, fueling approximately $262 billion in sales, according to Salesforce’s post-holiday analysis, which draws on activity data from over 1.5 billion global shoppers across Salesforce’s commerce and service platforms. This was not an experiment. It was a new channel running at infrastructure scale.

Where agents currently spend time on sites points to where agentic search optimization pays off first. According to HUMAN Security’s 2026 report, 77% of agentic AI activity occurs on product and search pages. Only 2.3% currently hit checkout pages, which signals that purchase completion is the frontier rather than the floor.

The brands whose checkout paths are agent-navigable today are positioned to capture the transaction volume that competitors are still debating whether to build for.

The B2B objection worth addressing directly: “Our site is SaaS. We don’t need agentic commerce features” is a 2024 frame. We’re in mid-2026, and the buyer journey for B2B software increasingly runs through AI-assisted research. An agent tasked with “find the top three analytics platforms for a mid-market e-commerce team and summarize their key differences” visits your site as part of that evaluation.

If your site returns incomplete, unstructured, or inaccessible content, you do not make the comparison. The agent moves to the competitor whose site answered the machine’s questions. The lead never reaches your sales team.

The commercial framing: agentic search optimization is an investment in channel capacity before the channel is congested.

How to measure your agentic search performance

Measuring agentic search performance is the part most teams get wrong first. The mistake is treating AI referral traffic as the headline metric when it is actually the smallest and most lagging signal. There are three distinct signal types, each requiring separate tools and interpretation.

Signal type 1: AI crawler activity

These are the indexing bots: GPTBot, ClaudeBot, PerplexityBot, and Google-Extended. They crawl your site to index content for their models. Monitoring them tells you which AI platforms are indexing your site, how frequently, and which pages they prioritize.

Check your server logs or CDN analytics for these user agents.
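A minimal sketch of that log check, assuming access logs in the common combined format. The sample log lines and their User-Agent strings are simplified, made-up examples, and the function name is hypothetical. Google-Extended is left out of the matcher on the assumption that it is a robots.txt control token and does not send its own User-Agent string. Verify that against Google’s crawler documentation for your setup.

```python
import re
from collections import Counter

# Bot tokens named in this article that appear in request User-Agent strings.
AI_CRAWLER_TOKENS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
                     "ClaudeBot", "PerplexityBot", "Meta-ExternalAgent"]

# Pulls the request path out of a combined-format log line.
REQUEST_PATH = re.compile(r'"(?:GET|HEAD|POST) (\S+)')


def crawler_hits_by_page(log_lines) -> Counter:
    """Count hits per (bot, path) pair across the given log lines."""
    hits: Counter = Counter()
    for line in log_lines:
        lowered = line.lower()
        bot = next((t for t in AI_CRAWLER_TOKENS if t.lower() in lowered), None)
        match = REQUEST_PATH.search(line)
        if bot and match:
            hits[(bot, match.group(1))] += 1
    return hits


# Hypothetical log lines with simplified User-Agent values.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/May/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [01/May/2026:10:00:04 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [01/May/2026:10:01:00 +0000] "GET /docs/api HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.9 - - [01/May/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 7000 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/125.0"',
]

for (bot, path), count in crawler_hits_by_page(SAMPLE_LOG).most_common():
    print(bot, path, count)
```

Grouping by page as well as by bot answers the second half of the question: not just who is crawling, but what they prioritize.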

Signal type 2: AI referral traffic

These are sessions where a human received an AI-generated recommendation and clicked through to your site. ChatGPT, Perplexity, and Gemini appear as referral sources in modern analytics platforms.

Similarweb AI Traffic Analytics shows which AI chatbots send traffic to your specific pages, connecting your llms.txt and readability work to real-world referral behavior.

Signal type 3: Autonomous agent sessions

These are sessions where an AI agent browses your site on behalf of a user, often without a human ever seeing the pages it visits. They look like human sessions in traditional analytics but exhibit distinctive patterns: fast, sequential page visits, programmatic form interactions, and navigation that follows the document structure rather than the visual layout.

Traditional analytics was not built to distinguish these. Specialized tooling explicitly detects and segments them.

The three metrics to track from day one:

Metric	What it tells you	Source
AI crawler coverage by bot type	Which platforms are indexing your site. Which pages they hit	Server logs, CDN analytics
AI referral traffic share and landing pages	Sessions from ChatGPT, Perplexity, Gemini	Similarweb AI Search Intelligence
AI citation visibility	How often your brand appears in LLM responses for tracked queries	Similarweb AI Search Intelligence

According to Similarweb’s 2026 AI Brand Visibility Index, while AI platform visits continue to grow, referral clicks from those platforms have plateaued since mid-2025. That decoupling is the clearest signal that brand mention share, not referral traffic volume, is the leading indicator to track. For full platform-level growth data, see Similarweb’s Generative AI Statistics.

Similarweb keyword data from April 2026 shows “best crm software” generating 2,030 monthly US searches with a 69% zero-click rate and an active AI Overview, while “crm for small business” generates 605 monthly US searches with a 67% zero-click rate, also with an active AI Overview. A CRM brand tracking these two terms would primarily optimize for citation capture on both and would need a third term with a low zero-click rate to measure actual click-through traffic.

The same three-signal logic applies to any category: identify which of your tracked queries carry AI Overviews (citation plays), which do not (click-traffic plays), and measure each separately.
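The arithmetic behind that split is worth making explicit. The sketch below estimates how many monthly searches still end in a click, using the two keyword figures quoted above. It assumes that every search which is not zero-click produces exactly one click somewhere on the results page, so the result is an upper bound on traffic shared across all ranking sites, not a forecast for one brand. The function name is hypothetical.

```python
def estimated_clicks(monthly_searches: int, zero_click_rate: float) -> int:
    """Searches per month that end in a click on any result."""
    return round(monthly_searches * (1 - zero_click_rate))


# Figures quoted in this section (Similarweb keyword data, April 2026).
keywords = {
    "best crm software": (2030, 0.69),
    "crm for small business": (605, 0.67),
}

for term, (searches, zero_click) in keywords.items():
    print(term, estimated_clicks(searches, zero_click))
```

On these numbers, roughly 629 and 200 searches a month still produce a click, which is why both terms work better as citation plays than as click-traffic plays.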

Agentic search optimization audit: the 21-point checklist

When I audit a site for agentic search readiness, I use this checklist. It covers all four CARE layers. Each check is binary: pass or fail. A score below 13/21 means meaningful agentic search traffic is currently bypassing your site.

You can also copy the full checklist as a Google Sheet, with checkboxes, live score tracking, and direct links to every resource, and run it against your own site: Copy the CARE Checklist

Layer 0: Crawlability (6 checks)

#	Check	Pass criteria
1	GPTBot: no DISALLOW rule	Not named under any Disallow in robots.txt, not caught by wildcard
2	PerplexityBot: no DISALLOW rule	Same
3	ClaudeBot: no DISALLOW rule	Same
4	Google-Extended: no DISALLOW rule	Same (check separately if site opted out of AI training)
5	Cloudflare AI bot blocking disabled	Bot Fight Mode and AI Crawl Control reviewed in dashboard
6	Server-side rendering confirmed	Pricing, specs, features present in HTML source, not JS-only

Layer 1: Accessibility (4 checks)

#	Check	Pass criteria
7	llms.txt exists and returns 200	File present at domain root, no server error (404 = N/A, 500 = active failure)
8	llms.txt URLs are high-value pages	Pricing, specs, FAQ, API docs (not homepage only)
9	XML sitemap is current and accurate	All key pages present. No noindex pages included
10	Sitemap submitted to major search platforms	Submitted via Search Console and Bing Webmaster Tools

Layer 2: Readability (7 checks)

#	Check	Pass criteria
11	H1-H3 heading hierarchy is correct	No skipped levels. No headings used purely for visual styling
12	Buttons have meaningful labels	All buttons have visible text or explicit aria-label
13	Form fields have explicit labels	No placeholder-only labeling
14	Schema.org Organization markup present	Name, URL, description, contact info all present
15	Schema.org Product, HowTo, or Article markup present	Applies to product/pricing pages and instructional or editorial content
16	Key content is not PDF-gated	Technical specs, pricing, integrations available as HTML
17	Pricing page has explicit, parseable structure	Named tiers, features, and prices as readable text

Layer 3: Executability (4 checks)

#	Check	Pass criteria
18	Primary CTAs have semantic button markup	type="submit" or role="button" plus descriptive aria-label
19	Schema.org Action markup on primary CTAs	BookAction, BuyAction, or ReserveAction on demo and trial CTAs
20	WebMCP status assessed	At minimum, the team is aware of the standard and implementation is on the roadmap
21	Authentication supports OAuth 2.0 (RFC 9728)	RFC 9728-compatible flow exists for protected account access
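The scoring can be automated once each check has a pass or fail result. This is a minimal sketch that mirrors the checklist’s layer sizes and the 13-point threshold stated above. The function names and the verdict wording are hypothetical, and the idea of reporting the first layer with a failing check comes from the sequencing rule of the CARE framework rather than from any official tool.

```python
from typing import Optional

# Number of checks per layer, in CARE order, as listed in the checklist.
LAYER_SIZES = {"Crawlability": 6, "Accessibility": 4,
               "Readability": 7, "Executability": 4}
PASS_THRESHOLD = 13  # a score below this is the warning level named above


def care_score(results: dict[str, list[bool]]) -> tuple[int, str]:
    """Total the binary checks and return (score, verdict)."""
    for layer, size in LAYER_SIZES.items():
        if len(results.get(layer, [])) != size:
            raise ValueError(f"{layer} needs exactly {size} results")
    score = sum(sum(results[layer]) for layer in LAYER_SIZES)
    verdict = "below threshold" if score < PASS_THRESHOLD else "threshold met"
    return score, verdict


def first_layer_to_fix(results: dict[str, list[bool]]) -> Optional[str]:
    """Return the earliest layer with a failing check, or None if all pass."""
    for layer in LAYER_SIZES:
        if not all(results[layer]):
            return layer
    return None


# Hypothetical audit result.
example = {
    "Crawlability": [True] * 6,
    "Accessibility": [True] * 4,
    "Readability": [True, True, False, False, False, False, False],
    "Executability": [False] * 4,
}

print(care_score(example))
print(first_layer_to_fix(example))
```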

Free tools to run before touching a single file:

Cloudflare’s agent readiness scanner at isitagentready.com, launched in April 2026
AgentGrade site assessment, which tracks five-level readiness scoring
Chrome Lighthouse agentic browsing audits: covers llms.txt presence, three WebMCP checks, accessibility for agents, and layout stability, each mapped directly to a layer of the CARE framework

For context on what these scores look like in practice: in an April 2026 analysis of 62 Italian domains, not one scored above 70 out of 100, with an average of 38 (Claudio Novaglio). The consistent failure points were endpoint discoverability, machine-readable authentication, and content negotiation.

In my own audits, I see the same three breaking points. All three are configuration and markup fixes that any dev can implement in a single sprint, with no architectural changes or engineering overhaul required. That gap is the distance between where most sites currently are and where agents need them to be.

The agents are already here. Is your site built to receive them?

The transition from human visitors to delegated AI sessions is the most structurally significant change to web traffic since mobile. I have been tracking this shift since the first llms.txt implementations in early 2025, and the pace of change in the first half of 2026 has outrun most teams’ planning cycles.

The infrastructure is being defined right now, while many SEO teams are still treating “AI optimization” as a content project.

GEO got you cited. Winning in agentic search requires the next layer: making your site actually usable by the agents your prospects are already deploying. The gap between citation and usability is where business is lost before the human buyer even knows they were considering you.

The CARE framework gives you a sequenced path through that gap.

Layer 0 (Crawlability): Check your robots.txt for Disallow rules that block AI bots, and review your Cloudflare AI bot-blocking settings. These are two separate problems requiring two separate fixes, and clearing both takes less than an hour.
Layer 1 (Accessibility): Publish your llms.txt correctly (a broken file is worse than no file) and verify sitemap coverage.
Layer 2 (Readability): Run your schema audit and surface any content that lives in PDFs or behind forms.
Layer 3 (Executability): Start the WebMCP conversation with your dev team before competitors realize it is a positioning move rather than just a technical one.
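
For the Layer 0 robots.txt check, a minimal sketch using Python's standard urllib.robotparser is shown below. The user-agent tokens and the sample robots.txt are illustrative assumptions: confirm the current token for each crawler in its vendor's documentation. This only covers robots.txt rules. Cloudflare bot-blocking happens at the network edge and has to be reviewed separately in the Cloudflare dashboard.

```python
from urllib.robotparser import RobotFileParser

# Example AI crawler tokens (assumption: verify against each vendor's docs).
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_agents(robots_txt, url, agents=AI_AGENTS):
    """Return the agents that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Made-up robots.txt: GPTBot is blocked site-wide, everyone else only from /admin/.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_agents(SAMPLE_ROBOTS, "https://example.com/pricing"))
```

To run this against a live site, the robots.txt text could be fetched first (for example with urllib.request) and passed in as a string, then checked against the handful of URLs that matter most, such as pricing and demo pages.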

The brands that treat agentic search optimization as a marketing priority, rather than a dev backlog item, are the ones capturing the channel before it gets competitive. The agents are already here. The only question is whether your site is built to receive them.

To see which agents are already visiting your site, which pages they hit, and how your AI referral share compares to competitors in your category, Similarweb AI Search Intelligence gives you the visibility layer that most teams are currently operating without.

FAQ

What is agentic search optimization?

Agentic search optimization, sometimes described as SEO for AI agents, is the practice of making a website technically accessible and usable by autonomous AI agents operating on behalf of human users. It is distinct from GEO (citation optimization) and AEO (answer engine optimization) because its success metric is agent task completion, not citation frequency or SERP visibility. Where GEO asks, “Will an AI mention us?”, agentic search optimization asks, “Can an AI actually use our site to complete a task for the user who sent it?”

How is agentic search optimization different from GEO and AEO?

GEO optimizes for being cited in AI-generated answers: the goal is to appear in the response a user reads. AEO optimizes for being selected as a direct answer source in AI Overviews and featured snippets. Agentic search optimization addresses what happens after citation: whether the agent that follows a recommendation can navigate your site, extract what it needs, and complete a task on the user’s behalf. A site can earn consistent GEO citations and still fail agentic search entirely if it blocks crawlers, hides content behind forms, or lacks semantic structure that agents can act on.

What is delegated AI traffic?

Delegated AI traffic is web traffic generated by autonomous AI agents operating on a user’s behalf rather than by the user browsing directly. When a user asks an AI assistant to “find the best CRM for a 50-person sales team and book a demo with the top option,” the agent visits websites, extracts information, and attempts to complete the task without the user having to navigate manually. This traffic type grew 7,851% year-over-year in 2025, according to HUMAN Security’s 2026 State of AI Traffic report, and now represents the fastest-growing segment of automated web traffic.

What actually breaks when an AI agent visits an unprepared site?

Real-world agent testing reveals consistent failure patterns. A May 2026 audit of 10 mid-market e-commerce sites using a Comet-based shopping agent found the same three breaking points repeatedly: cookie banners blocking the viewport before the agent could read page content, add-to-cart buttons requiring a login that the agent could not complete, and inventory data in the HTML that did not match actual stock levels. The agent recommended the brand, the user confirmed the purchase, and the transaction failed at execution. The same pattern applies to B2B sites: agents that cannot find pricing, parse gated specs, or locate a demo booking path move to the next option without notifying the user.

How long before agentic search optimization shows results?

Layer 0 and Layer 1 fixes (Crawlability: robots.txt and Cloudflare; Accessibility: llms.txt) take effect the next time AI crawlers visit your site, typically within days to weeks. Layer 2 (Readability) schema improvements affect citation frequency, with measurable impact appearing within 4 to 8 weeks. Layer 3 (Executability) improvements affect agent task completion rates. The window that matters most is not the measurement timeline but the competitive one. As of May 2026, fewer than 5% of sites have implemented these changes. That gap closes faster than SEO authority gaps did.

What does an AI agent actually do when it visits your site?

An AI agent visiting your site reads the accessibility tree rather than the visual layout, following the same semantic structure a screen reader uses. Google’s official web.dev guidance confirms agents operate across three modalities: accessibility tree, raw HTML, and screenshots, with screenshots as the fallback because they are computationally expensive. The agent extracts headings, links, buttons, form labels, and ARIA roles. It does not see your hero image, does not read CSS, and does not hover over navigation. A UC Berkeley and University of Michigan study (arXiv:2602.09310), presented at CHI 2026, found that agent task success dropped from 80% under normal conditions to 42% when keyboard-only navigation was simulated, directly linking accessibility quality to agent performance.
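
To get a feel for what is left once a page is reduced to those semantic elements, here is a rough approximation in standard-library Python. It is not a real accessibility tree, which browsers compute with far more rules. It simply lists headings, links, buttons, labels, and elements with an explicit role, preferring aria-label over inner text as the name. The tag list and the sample markup are assumptions for illustration, and the sketch expects reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Elements whose text is collected as a name (an assumed, simplified set).
NAMED_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "a", "button", "label"}


class SemanticOutline(HTMLParser):
    """Builds a flat [kind, name] list in document order."""

    def __init__(self):
        super().__init__()
        self.items = []  # [kind, name] entries
        self._open = []  # (tag, entry, text_parts, aria_label) for open named elements
        self._skip = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in NAMED_TAGS:
            entry = [a.get("role") or tag, ""]
            self.items.append(entry)  # reserve the slot to keep document order
            self._open.append((tag, entry, [], a.get("aria-label", "").strip()))
        elif a.get("role"):
            # Other elements with an explicit role: record role and aria-label only.
            self.items.append([a["role"], a.get("aria-label", "").strip()])

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = max(0, self._skip - 1)
        elif self._open and self._open[-1][0] == tag:
            _, entry, parts, aria_label = self._open.pop()
            entry[1] = aria_label or " ".join("".join(parts).split())

    def handle_data(self, data):
        if self._skip:
            return
        for _, _, parts, _ in self._open:
            parts.append(data)


def semantic_outline(html):
    parser = SemanticOutline()
    parser.feed(html)
    parser.close()
    return [tuple(entry) for entry in parser.items]


# Made-up page fragment: the hero div and its CSS contribute nothing to the outline.
SAMPLE_PAGE = """
<style>.hero { background: url(hero.jpg); }</style>
<h1>Acme CRM <a href="/pricing">Pricing</a></h1>
<div class="hero"><img src="hero.jpg"></div>
<label for="email">Work email</label>
<button type="submit" aria-label="Book a product demo">Go</button>
<div role="navigation" aria-label="Main menu"></div>
"""

for kind, name in semantic_outline(SAMPLE_PAGE):
    print(kind, "|", name)
```

If the outline of a key page comes back nearly empty, or full of unnamed buttons and links, that is a reasonable signal that an agent reading the page structure has little to work with.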

Does JavaScript rendering affect agentic search performance?

Yes, significantly. Agents that use accessibility-tree parsing cannot read content that is only present in the JavaScript bundle and not in the HTML source. Pricing tables, feature lists, product specifications, and CTAs rendered client-side are invisible to these agents, regardless of robots.txt permissions. The fix is server-side rendering for all evaluation-critical content. Run the diagnostic by disabling JavaScript in Chrome DevTools and loading your five most important pages. Any content that disappears is currently invisible to a significant segment of AI agents.
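
The same diagnostic can be approximated in a script. The sketch below, using only Python's standard library, takes the raw HTML source of a page (before any JavaScript runs) and reports which evaluation-critical phrases are missing from its text. The function name, the phrases, and the sample markup are hypothetical. Real pages would be fetched first, for example with urllib.request.

```python
from html.parser import HTMLParser


class SourceText(HTMLParser):
    """Collects text present in the HTML source, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def missing_without_js(html_source, required_phrases):
    """Return the phrases that do not appear in the server-delivered HTML text."""
    parser = SourceText()
    parser.feed(html_source)
    parser.close()
    text = " ".join(" ".join(parser.parts).split()).lower()
    return [phrase for phrase in required_phrases if phrase.lower() not in text]


# Made-up page: the price only exists inside a script that renders client-side.
SAMPLE_SOURCE = """
<h1>Pricing</h1>
<div id="plans"></div>
<script>renderPlans([{name: "Team", price: "$49 per seat"}])</script>
<a href="/demo">Book a demo</a>
"""

print(missing_without_js(SAMPLE_SOURCE, ["Pricing", "$49 per seat", "Book a demo"]))
```

Any phrase this returns exists only in the JavaScript bundle, which makes it a candidate for server-side rendering.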

Do I need a developer to optimize for agentic search?

Layer 0 (Crawlability) requires checking robots.txt and the Cloudflare dashboard: both are configuration tasks, not code changes. Layer 1 (Accessibility) means adding a static llms.txt file and verifying sitemap coverage: content-level work. Layer 2 (Readability) requires schema markup and HTML review, which involves developer time but not architectural changes. Layer 3 (Executability) ranges from ARIA label additions (one sprint) to a full WebMCP implementation (including JavaScript development). The pragmatic path: do Layers 0 and 1 this week, plan Layer 2 for the next sprint, and scope Layer 3 for the following quarter.

by Limor Barenholtz

Director of SEO & AI Search at Similarweb

Limor brings 20 years of expertise in SEO and AI Search. She thrives on solving complex problems, creating scalable strategies, and building amazing dashboards.