This is Part 2 in a five-part series on optimizing websites for the agentic web. Part 1 covered the evolution from SEO to AAIO and why the shift matters. This article gets practical: how AI systems actually select content, and what you can do about it.
AI Doesn’t Rank Pages. It Selects Fragments.
Traditional search ranks whole pages. AI search does something fundamentally different.
Microsoft’s Krishna Madhavan, principal product manager on the Bing team, described the shift in October 2025: AI assistants “break content down, a process called parsing, into smaller, structured pieces that can be evaluated for authority and relevance. Those pieces are then assembled into answers, often drawing from multiple sources to create a single, coherent response.”
This is the core insight. AI doesn’t pick the best page and show it. It picks the best fragments from many pages and weaves them together. Your page might rank No. 1 on Google and still not get cited in an AI response if its content isn’t structured in fragments that AI can extract.
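To make the idea of fragments concrete, here is a minimal Python sketch that splits a page into heading-delimited pieces. It is an illustration only: the `FragmentSplitter` class, the `split_fragments` helper, and the sample page are hypothetical, and production AI systems do not publish their parsing logic.

```python
from html.parser import HTMLParser


class FragmentSplitter(HTMLParser):
    """Split an HTML document into [heading, body] fragments at each H2/H3."""

    HEADINGS = {"h2", "h3"}

    def __init__(self):
        super().__init__()
        self.fragments = []        # one [heading_text, body_text] pair per section
        self._in_heading = False   # True while inside an <h2>/<h3> element

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._in_heading = True
            self.fragments.append(["", ""])  # a heading starts a new fragment

    def handle_endtag(self, tag):
        if tag in self.HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        if not self.fragments:
            return  # text before the first heading belongs to no fragment
        index = 0 if self._in_heading else 1
        self.fragments[-1][index] += data


def split_fragments(html):
    """Return a list of (heading, body) tuples with whitespace collapsed."""
    parser = FragmentSplitter()
    parser.feed(html)
    parser.close()
    return [(" ".join(h.split()), " ".join(b.split())) for h, b in parser.fragments]


# Hypothetical sample page: one self-contained section, one that is not.
page = """
<h2>What temperature should I bake bread at?</h2>
<p>Bake most sandwich loaves at 375°F.</p>
<h2>Overview</h2>
<p>As mentioned above, it depends.</p>
"""

for heading, body in split_fragments(page):
    print(f"{heading!r} -> {body!r}")
```

Read in isolation, the first fragment answers a question by itself, while the second has a vague heading and leans on text that is no longer attached to it.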
The numbers show the shift is real. According to the Conductor AEO/GEO Benchmarks Report (January 2026; 13,770 domains, 17 million AI responses), AI traffic now accounts for 1.08% of all website sessions, growing roughly 1% month over month. Microsoft reported that AI referrals to top websites spiked 357% year-over-year in June 2025, reaching 1.13 billion visits. Small numbers today, compounding fast.
One in four Google searches now triggers an AI Overview. In healthcare, it’s nearly one in two. The surface area is growing, and the content that fills those answers has to come from somewhere. The question is whether it comes from you.
The Research: What Actually Gets Cited
The academic research on what makes content citable in AI responses has matured quickly. The foundational paper, “GEO: Generative Engine Optimization” (Princeton, IIT Delhi, Georgia Tech, published at KDD 2024), tested nine optimization methods and found that GEO techniques could boost visibility by up to 40% in AI responses. The most effective single method was citing credible sources, which produced a 115.1% visibility increase for websites that weren’t already ranking in the top positions.
A counterintuitive finding: Writing in an authoritative or persuasive tone did not improve AI visibility. AI systems don’t respond to rhetorical style. They respond to verifiable information.
Since then, 2025 brought a wave of follow-up research that tested these ideas on real production AI engines rather than simulated ones.
The University of Toronto study (September 2025) was the first large-scale analysis across ChatGPT, Perplexity, Gemini, and Claude. Their most striking finding: AI search overwhelmingly favors earned media. In consumer electronics, AI cited third-party authoritative sources 92.1% of the time, compared to Google’s 54.1%. Automotive showed a similar pattern at 81.9% versus 45.1%. In other words, it’s not just how you write content, but whose domain it appears on. Press coverage, product reviews on independent websites, and mentions in industry publications carry far more weight in AI responses than your own website.
Carnegie Mellon’s AutoGEO study (October 2025) used automated methods to discover what generative engines actually prefer. The results showed up to 50.99% improvement over the best baseline, with common preferences emerging across engines: comprehensive topic coverage, factual accuracy with citations, clear logical structure with headings and lists, and direct answers to queries.
The GEO-16 framework (September 2025) analyzed 1,702 real citations from Brave, Google AI Overviews, and Perplexity. It identified 16 on-page quality factors that predict citation likelihood. The top three: metadata and freshness, semantic HTML, and structured data. Technical on-page factors matter as much as the quality of the writing itself.
And a reality check from Columbia and MIT’s ecommerce study (November 2025): of 15 common content rewriting heuristics, 10 produced negligible or negative results. The optimization strategies that did work converged toward truthfulness, user intent alignment, and competitive differentiation. Not tricks. Substance.
The overall pattern across all of this research: AI systems reward clarity, factual accuracy, and structure. They don’t reward marketing language, persuasion tactics, or keyword density.
Content Structure That Earns Citations
Based on the research and official guidance from Microsoft and Google, here’s what structurally makes content citable.
Heading hierarchy matters more than ever. Use descriptive H2 and H3 headings that each cover one specific idea. Microsoft lists strong headings as “signals that help AI know where a complete idea starts and ends.” Vague headings like “Learn More” or “Overview” give AI nothing to work with. A heading like “How AI parses content differently than search engines” tells the system exactly what the section contains.
Q&A format is native to AI. Write questions as headings with direct answers below them. Microsoft notes that “assistants can often lift these pairs word for word into AI-generated responses.” If your content answers the question someone asks an AI, and it’s structured as a clear question-and-answer pair, you’ve made the AI’s job easy.
Make content snippable. Bulleted and numbered lists, comparison tables, and step-by-step instructions give AI clean, extractable fragments. A paragraph buried in a wall of text is harder for AI to isolate than the same information presented as a three-item list.
Front-load the answer. Start sections with the key information, then provide context. If someone asks, “What temperature should I bake bread at?” and your content opens with a two-paragraph history of bread making before mentioning 375°F, you’ll lose the citation to a competitor who leads with the answer.
Keep sections self-contained. Each section should make sense on its own, without requiring the reader to have read the previous section. AI extracts fragments. If your fragment only makes sense in the context of the whole page, it won’t be selected.
An important technical note from Microsoft: “Don’t hide important answers in tabs or expandable menus: AI systems may not render hidden content, so key details can be skipped.” FAQ answers collapsed inside an expandable menu, product specs hidden behind tabs, content that requires interaction to reveal: it can all be invisible to AI. If information is important, it needs to be in the visible HTML.
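One way to audit a page for this is to scan the markup for text that only appears after interaction. The sketch below is a rough heuristic, not a renderer: `HiddenTextFinder` is a hypothetical helper that flags text inside collapsed `<details>` elements, elements with the `hidden` attribute, and inline `display: none` styles. It assumes well-formed markup with explicit closing tags and ignores external stylesheets and JavaScript-driven tabs.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def hide_mode(tag, attrs):
    """Classify an element: 'all' (fully hidden), 'collapsed' (closed <details>), or ''."""
    attrs = dict(attrs)
    style = (attrs.get("style") or "").replace(" ", "").lower()
    if "hidden" in attrs or "display:none" in style:
        return "all"
    if tag == "details" and "open" not in attrs:
        return "collapsed"
    return ""


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self._stack = []        # (tag, hide_mode) for every open element
        self.hidden_text = []   # text a non-interacting reader would not see

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append((tag, hide_mode(tag, attrs)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def _is_hidden(self):
        in_summary = False
        for tag, mode in reversed(self._stack):   # innermost element first
            if mode == "all":
                return True
            if tag == "summary":
                in_summary = True
            elif mode == "collapsed":
                if not in_summary:
                    return True
                in_summary = False  # a <summary> stays visible in its own closed <details>
        return False

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and self._is_hidden():
            self.hidden_text.append(text)


# Hypothetical sample page.
page = """
<h2>Shipping</h2>
<p>Orders ship within two days.</p>
<details><summary>Returns</summary><p>Returns are free for 30 days.</p></details>
<div style="display: none"><p>Specs: 2 kg, 40 cm.</p></div>
"""

finder = HiddenTextFinder()
finder.feed(page)
finder.close()
print(finder.hidden_text)
```

Anything the scan reports is a candidate for moving into the always-visible part of the page.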
Authority Signals For AI
E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) isn’t just a Google concept anymore. It’s what AI systems look for across the board, even if they don’t use the term.
Microsoft’s October 2025 guidance describes the baseline: success starts with content that’s “fresh, authoritative, structured, and semantically clear.” On the clarity side, they’re specific: “avoid vague language. Terms like innovative or eco mean little without specifics. Instead, anchor claims in measurable facts.” Saying something is “next-gen” or “cutting-edge” without context leaves AI unsure how to classify it.
The research backs this up. The original GEO paper found that writing in a persuasive or authoritative tone did not improve AI visibility. Facts and cited sources did. Marketing language doesn’t impress algorithms.
This connects to the University of Toronto’s finding about earned media dominance. AI systems trust third-party validation more than self-promotion. In consumer electronics, AI cited third-party authoritative sources 92.1% of the time compared to Google’s 54.1%. The implication: getting your expertise published on industry websites, earning press coverage, and building a presence on authoritative platforms matters more for AI visibility than perfecting the copy on your own site.
Freshness is a signal, not a bonus. Stale content rarely gets cited. Krishna Madhavan said at Pubcon Cyber Week: “Stale or missing content will constrain the amount of retrieval we can do and push agents toward alternative sources.”
Schema Markup: From Text To Data
Microsoft’s October 2025 post devotes an entire section to schema. They describe it as code that “turns plain text into structured data that machines can interpret with confidence.” Schema can label your content as a product, review, FAQ, or event, giving AI systems explicit context instead of forcing them to guess. Krishna Madhavan reinforced this at Pubcon: “Schemas are super helpful. They help the system discern exactly what your information is without us having to guess.”
The GEO-16 framework confirms this from the academic side. Structured data was one of the top three factors predicting AI citation likelihood, alongside metadata/freshness and semantic HTML.
The schema types that matter most for AI visibility:
- FAQPage for question-and-answer content (directly maps to how AI formats responses).
- HowTo for step-by-step instructions.
- Product with Offer, AggregateRating, and Review for ecommerce.
- Article/BlogPosting for content with clear authorship and dates.
- Organization for business identity.
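As an illustration of the first item, the following Python sketch builds FAQPage markup as JSON-LD from question-and-answer pairs. The property names (`mainEntity`, `acceptedAnswer`, `text`) are the schema.org vocabulary for this type; the `faq_jsonld` helper and the sample question are hypothetical.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical sample content.
markup = faq_jsonld([
    ("What temperature should I bake bread at?",
     "Bake most sandwich loaves at 375°F."),
])

# Escape "<" so answer text can never close the surrounding <script> element.
body = json.dumps(markup, indent=2, ensure_ascii=False).replace("<", "\\u003c")
script_tag = '<script type="application/ld+json">\n' + body + "\n</script>"
print(script_tag)
```

The markup should repeat answers that are already visible on the page rather than stand in for them.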
Pair structured data with IndexNow for freshness. As the Bing Webmaster Blog put it: “IndexNow tells search engines that something has changed, while structured data tells them what has changed. Together, they improve both speed and accuracy in indexing.”
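For reference, an IndexNow submission is a small JSON POST. The sketch below only builds the request and does not send it. The host, key, and URL are placeholders, and it assumes the key file sits at the site root as `<key>.txt`, which is the protocol’s default location; `build_indexnow_request` is a hypothetical helper.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Prepare (but do not send) an IndexNow bulk submission."""
    payload = {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",  # assumes the key file is at the root
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_indexnow_request(
    "www.example.com",
    "hypothetical-key-123",
    ["https://www.example.com/bread-guide"],
)
print(request.get_method(), request.full_url)
# To actually submit: urllib.request.urlopen(request)
```

A natural place to call this is the publish or update step of a CMS, so changed URLs are submitted at the moment their structured data changes.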
Crawler Permissions: Who Gets In
AI search engines use distinct crawlers, and most let you control training and search access separately. Here’s who to allow.
| Bot | Platform | Purpose | Robots.txt Token |
|---|---|---|---|
| OAI-SearchBot | ChatGPT | Search index | OAI-SearchBot |
| GPTBot | OpenAI | Model training | GPTBot |
| ChatGPT-User | ChatGPT | On-demand browsing | ChatGPT-User |
| Bingbot | Microsoft Copilot | Search + AI | Bingbot |
| Googlebot | Google AI Overviews | Search + AI | Googlebot |
| Google-Extended | Google | Gemini training | Google-Extended |
| PerplexityBot | Perplexity | Search + index | PerplexityBot |
| Perplexity-User | Perplexity | On-demand browsing | Perplexity-User |
| ClaudeBot | Anthropic | Training + retrieval | ClaudeBot |
A sensible robots.txt configuration might allow search crawlers while blocking training:
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
OpenAI provides the cleanest bot separation. You can allow OAI-SearchBot (so your content appears in ChatGPT search) while blocking GPTBot (so it’s not used for model training). Google’s controls are less granular: blocking Google-Extended prevents Gemini training but has no effect on AI Overviews, which use Googlebot.
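Before deploying a configuration like this, it can help to check what it permits. Here is a minimal sketch using Python’s standard `urllib.robotparser`. The rules mirror the configuration above and the URL is a placeholder. Python’s matching is only an approximation of how any given crawler interprets robots.txt, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def crawler_access(robots_txt, agents, url="https://www.example.com/"):
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


access = crawler_access(
    ROBOTS_TXT,
    ["OAI-SearchBot", "ChatGPT-User", "GPTBot", "Google-Extended", "Googlebot"],
)
for agent, allowed in access.items():
    print(f"{agent:16} {'allowed' if allowed else 'blocked'}")
```

Googlebot has no rule of its own here, so it falls through to the default and is allowed, which matches the point that blocking Google-Extended leaves AI Overviews untouched.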
OpenAI also offers the most specific technical recommendation of any AI search provider. For their Atlas browser (which uses a standard Chrome user agent, not a bot identifier), they recommend following WAI-ARIA best practices: “Add descriptive roles, labels, and states to interactive elements like buttons, menus, and forms. This helps ChatGPT recognize what each element does and interact with your site more accurately.” Accessibility and AI agent compatibility are the same work.
A caveat on Perplexity: while their documentation states they respect robots.txt, Cloudflare documented in August 2025 that Perplexity uses undeclared crawlers with rotating IPs and spoofed browser user agents to bypass no-crawl directives. This is a contested claim, but it’s worth knowing.
For revenue, Perplexity is the only platform currently offering publisher compensation. Their Comet Plus program provides an 80/20 revenue split (publishers keep 80%) across direct visits, search citations, and agent actions.
Google Vs. Microsoft: Two Philosophies
The contrast between Google and Microsoft on AEO is striking enough to be its own story.
Google says: just do good SEO. Their official documentation is deliberately minimalist: “There are no additional requirements to appear in AI Overviews or AI Mode, nor other special optimizations necessary.” They add that you “don’t need to create new machine readable files, AI text files, or markup to appear in these features.”
Google recommends helpful, reliable, people-first content demonstrating E-E-A-T. Standard structured data. Good page experience. Technical fundamentals. Nothing AI-specific.
Microsoft says: here’s the playbook. Their October 2025 blog post and January 2026 guide provide detailed, actionable guidance. Specific heading structures. Schema recommendations. Content formatting rules. Concrete examples (an AEO product description vs. a GEO product description). Warnings about content hidden in tabs and expandable menus. A framework for thinking about crawled data, product feeds, and live website data as three distinct layers.
What explains the difference? Partly market position. Google dominates search and has less incentive to help publishers optimize for AI features that may reduce clicks to their websites. Microsoft, with Bing’s roughly 8% market share, benefits from giving publishers reasons to optimize specifically for its ecosystem.
But there’s a practical takeaway: Microsoft’s guidance isn’t Bing-specific. The principles of structured content, clear headings, snippable formats, schema markup, and expert authority are universal. Following Microsoft’s playbook improves your content for every AI system, including Google’s. Google just won’t tell you that.
Measuring AI Visibility
This is the hard part. Traditional SEO has Google Search Console. AI visibility is still fragmented.
Ahrefs analyzed 1.9 million citations from 1 million AI Overviews and found that 76% of citations come from pages already ranking in Google’s top 10. The median ranking for the most-cited URLs was position 2. Traditional ranking still matters for AI citation, but being No. 1 is “a coin flip at best” for getting cited.
The traffic impact is significant. Ahrefs found that AI Overviews correlate with 58% lower click-through rates for the No. 1 position. Seer Interactive reported a 61% organic CTR drop for queries with AI Overviews. But being cited within the AI Overview gives 35% more organic clicks compared to not being cited. Citation is the new ranking.
For monitoring, the tool landscape is emerging:
| Tool | What It Tracks | Starting Price |
|---|---|---|
| Profound | Citations across ChatGPT, Perplexity, Copilot, Google AIOs | From $99/mo |
| Peec.ai | Brand mentions across ChatGPT, Gemini, Claude, Perplexity | From ~$95/mo |
| Advanced Web Ranking | AIO presence tracking in Google | Included in plans |
| Bing Webmaster Tools | AI Performance Report for Copilot | Free |
Bing Webmaster Tools is the easiest place to start. It’s free, and the new AI Performance Report shows how your content performs in Copilot citations. For ChatGPT specifically, track utm_source=chatgpt.com in your analytics. OpenAI automatically appends this to referral URLs.
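If your analytics tool doesn’t segment this out of the box, the check is simple to run on exported landing-page URLs. A minimal sketch, assuming a list of full landing URLs is available; `is_chatgpt_referral` and the sample URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def is_chatgpt_referral(landing_url):
    """True if the landing URL carries utm_source=chatgpt.com."""
    query = parse_qs(urlsplit(landing_url).query)
    return "chatgpt.com" in query.get("utm_source", [])


# Hypothetical landing URLs, e.g. exported from server logs or analytics.
visits = [
    "https://www.example.com/bread-guide?utm_source=chatgpt.com",
    "https://www.example.com/bread-guide?utm_source=newsletter&utm_medium=email",
    "https://www.example.com/pricing",
]

chatgpt_visits = [url for url in visits if is_chatgpt_referral(url)]
share = len(chatgpt_visits) / len(visits)
print(f"{len(chatgpt_visits)} of {len(visits)} landing URLs came from ChatGPT ({share:.0%})")
```

Grouping the matching URLs by path then shows which pages ChatGPT sends visitors to most often.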
Conductor’s January 2026 report found that 87.4% of AI referral traffic comes from ChatGPT. That’s one platform dominating the space, which makes tracking it particularly important.
Key Takeaways
- AI selects fragments, not pages. Structure your content in self-contained, extractable sections with descriptive headings that signal where each idea starts and ends.
- Clarity beats persuasion. Factual accuracy, cited sources, and direct answers outperform authoritative tone and marketing language. The research consistently shows this.
- Earned media dominates brand content in AI citations. Press coverage, third-party reviews, and authoritative mentions on other websites carry more weight than your own pages. Build presence beyond your domain.
- Schema markup is a force multiplier. FAQPage, HowTo, Product, and Article schemas make your content machine-readable. Pair with IndexNow for freshness.
- Follow Microsoft’s playbook, even for Google. Google says “just do good SEO.” Microsoft provides specific, actionable guidance that improves content for every AI system, Google’s included.
- Separate training from search in your robots.txt. Allow search crawlers (OAI-SearchBot, Bingbot, PerplexityBot) while blocking training crawlers (GPTBot, Google-Extended) if that’s your preference. You have more control than you might think.
- Monitor AI visibility now. Use Bing Webmaster Tools (free), track utm_source=chatgpt.com in analytics, and consider dedicated tools as the measurement space matures.
Traditional SEO asked: “How do I rank?” AEO asks: “How do I become the fragment that gets selected?” The answer isn’t a single trick. It’s clear structure, verifiable expertise, and content that AI can confidently extract and cite.
Up next in Part 3: the protocols powering the agentic web, including MCP, A2A, NLWeb, and AGENTS.md, and how they fit together.
This was originally published on No Hacks.