Sources · Bibliography

The research behind what we build.

Every tactic XALA applies comes from a paper, official documentation, or a public benchmark. Here are the sources, with direct links to the originals. If we say it on the site, it's listed below.

01 Concepts and metrics · Academic + industry

GEO: Generative Engine Optimization

Aggarwal et al. · Princeton + IIT Delhi · 2023 · arXiv:2311.09735

Measured nine optimization tactics for generative engines. Three showed measurable lift in LLM visibility: citing sources (+40%), adding statistics, and quoting experts. The paper reports up to +115% combined visibility in some configurations.

How XALA applies it: the three tactics are the foundation of how we build content. They appear on the home page, the Princeton section, and every FAQ.

Read the paper →

Zero-click search study — Similarweb / SparkToro

Similarweb / SparkToro · July 2025

58.5% of Google searches in the US end without a single click. For queries that trigger an AI Overview, reported zero-click rates climb higher still, with figures between 69% and 83% depending on the measurement.

How XALA applies it: justifies the shift from SEO (where the click was the conversion) to GEO (where the mention inside the answer can be the conversion). Used on the home, FAQ, and journal.

Read the study →

E-E-A-T: Experience, Expertise, Authoritativeness, Trustworthiness

Google Search Central · Quality Rater Guidelines

The framework Google uses to assess quality. Generative models (including AI Overviews and Gemini) inherit it: sources with strong E-E-A-T get cited more.

How XALA applies it: every page has identifiable author, update date, external citations, and demonstrable mechanisms — no unbacked claims.

Quality Rater Guidelines (PDF) →
02 AI bots and crawlers · Official platform docs

OpenAI Crawlers — GPTBot, ChatGPT-User, OAI-SearchBot

platform.openai.com/docs/bots

Three distinct bots: GPTBot (training data collection), ChatGPT-User (user-initiated live browsing inside responses), and OAI-SearchBot (ChatGPT search results). Each one respects robots.txt independently.

How XALA applies it: the AI Readiness Check validates all three separately. Blocking one doesn't imply blocking the others.

Official documentation →
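
Because each OpenAI bot reads robots.txt independently, a site can grant or deny them one by one. A sketch of such a policy (illustrative only, not a recommendation):

```
# Allow training, live browsing, and search crawling as three separate decisions.
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /
```

A site that wants ChatGPT citations but no training data collection would flip the GPTBot group to `Disallow: /` and leave the other two untouched.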

Anthropic ClaudeBot

Anthropic Privacy Center

ClaudeBot is Anthropic's main crawler. It identifies itself with the contact address claudebot@anthropic.com in its user-agent string and respects robots.txt. Blocking it means opting out of future Claude training.

How XALA applies it: the AI Readiness Check verifies ClaudeBot has explicit access to your site.

View documentation →

Perplexity Crawlers — PerplexityBot, Perplexity-User

docs.perplexity.ai

Two crawlers: PerplexityBot indexes sites for Perplexity's search index, while Perplexity-User fetches pages in real time when someone runs a search. PerplexityBot respects robots.txt; Perplexity-User, acting on a direct user request, generally does not.

How XALA applies it: we validate both in the AI Readiness Check. Perplexity is the fastest LLM to start citing new content (30-60 days).

Official documentation →

Google-Extended — Gemini and AI Overviews

Google Search Central

Google-Extended controls whether Google may use your site for Gemini training and for grounding Gemini answers. It is independent of Googlebot, the traditional Search crawler, and has no effect on Search ranking.

How XALA applies it: blocking Googlebot affects your SEO; blocking Google-Extended affects Gemini training and grounding. AI Overviews remain a Search feature served via Googlebot, so these are separate decisions.

Official crawler list →
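
The two decisions map onto two separate user-agent groups in robots.txt. A sketch (illustrative policy, not a recommendation):

```
# Keep classic Search indexing intact...
User-agent: Googlebot
Allow: /

# ...while opting out of Gemini training and grounding.
User-agent: Google-Extended
Disallow: /
```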
03 Standards and files · Open standards

Schema.org — Structured data

Schema.org Community Group (hosted by the W3C) · Founded by Google, Microsoft, Yahoo, Yandex

The standard vocabulary for marking up data on your site so machines (search engines and LLMs) understand it. Relevant types for our verticals: MedicalClinic, RealEstateAgent, AggregateRating, FAQPage.

How XALA applies it: every site we build ships JSON-LD with the correct types for its vertical. The AI Readiness Check verifies presence and type.

schema.org →
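
As a sketch, the JSON-LD for a clinic site might look like the following; the clinic name, URL, and rating figures are placeholder values, not real data:

```json
{
  "@context": "https://schema.org",
  "@type": "MedicalClinic",
  "name": "Example Clinic",
  "url": "https://www.example.com/",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "120"
  }
}
```

The block ships inside a `<script type="application/ld+json">` tag, typically in the page head.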

llms.txt — Emerging standard

Howard / Answer.AI · llmstxt.org

Proposal by Jeremy Howard (Answer.AI) that lets sites tell LLMs which content to prioritize. Adopted by Anthropic, Mintlify, Cloudflare, and others.

How XALA applies it: every site we build ships llms.txt at the root. The AI Readiness Check verifies its existence.

llmstxt.org →
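
Per the llmstxt.org proposal, the file is plain Markdown: an H1 with the site name, a blockquote summary, then H2 sections listing key links. A placeholder sketch (names and URLs are hypothetical):

```markdown
# Example Clinic

> Dental clinic. Services, pricing, and patient FAQs.

## Key pages

- [Services](https://www.example.com/services): full list of treatments
- [FAQ](https://www.example.com/faq): answers to common patient questions
```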

robots.txt — Robots Exclusion Standard

IETF · RFC 9309 · 2022

The file that tells crawlers which parts of your site they may visit. A de facto standard since 1994, formalized by the IETF as RFC 9309 in 2022. Major AI crawlers (GPTBot, ClaudeBot, PerplexityBot) respect it.

How XALA applies it: the AI Readiness Check parses your robots.txt and tests 14 AI bots separately.

RFC 9309 →
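
Per-bot checking of this kind can be sketched with Python's standard-library urllib.robotparser; the robots.txt body and bot list below are illustrative, not XALA's actual implementation:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot is kept out of /private/,
# PerplexityBot is blocked entirely, everything else has no rule.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: PerplexityBot
Disallow: /
"""

def bot_allowed(robots_txt: str, bot: str, url: str) -> bool:
    """Parse a robots.txt body and ask whether `bot` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(bot, url)

# Illustrative subset of the AI bots a readiness check might test.
AI_BOTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot", "Google-Extended"]

for bot in AI_BOTS:
    print(bot, bot_allowed(ROBOTS_TXT, bot, "https://example.com/"))
```

ClaudeBot matches no group here and there is no `*` group, so the parser falls back to the default (allow); that is exactly why each bot has to be tested by name rather than inferred from a blanket rule.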

Missing a source?

If XALA makes a claim on the site that you can't find here, email us — the source should be listed.