How 5 AI Platforms Cite Differently
By Andrew Coffey · 2026-03-24
One of the most consistent findings across AI citation research is that each platform cites differently. Ahrefs found that 86% of the top 50 most-cited domains were unique to a single AI surface. Search Atlas's 5.5M response study confirmed citation patterns are platform-specific. We can add something those studies couldn't: exactly how each model's page-level preferences differ, measured from the same dataset with the same signals and the same controls.
Citation Volume: Not All Models Are Equal
Each model produces citations at dramatically different rates from the same prompts:

- Anthropic (Claude): 99.0% citation rate, 14.32 avg citations/prompt, 6,694 unique cited URLs
- Gemini: 98.2% rate, 13.92 avg, 6,815 URLs
- Google AI Overviews: 80.8% rate, 8.76 avg, 4,369 URLs
- Perplexity: 100.0% rate, 7.62 avg, 3,740 URLs
- OpenAI (ChatGPT): 96.0% rate, 5.62 avg, 2,075 URLs

Claude and Gemini are the most generous citers, both averaging roughly 14 citations per prompt. ChatGPT is the most selective at 5.62. Google AI Overviews doesn't trigger at all for about 20% of prompts. Perplexity's 100% rate is architectural: it always searches the web. ChatGPT's lower rate reflects its tendency to answer from memory for some query types; Profound (June 2025) found that users' opening questions trigger web searches but follow-ups rarely do.
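If you want to reproduce these volume stats on your own data, the arithmetic is simple. Here's a minimal sketch assuming a flat log of per-prompt responses; the `responses` rows and field names are hypothetical placeholders, not our actual schema.

```python
from collections import defaultdict

# Hypothetical response log: one row per (model, prompt) answer,
# with whatever URLs that answer cited.
responses = [
    {"model": "perplexity", "cited_urls": ["https://www.youtube.com/watch?v=example"]},
    {"model": "openai", "cited_urls": []},  # answered from memory, no citations
    # ... the real dataset has thousands of rows per model
]

stats = defaultdict(lambda: {"prompts": 0, "cited": 0, "citations": 0, "urls": set()})
for r in responses:
    s = stats[r["model"]]
    s["prompts"] += 1
    s["cited"] += bool(r["cited_urls"])      # prompts that produced at least 1 citation
    s["citations"] += len(r["cited_urls"])   # total citations across all prompts
    s["urls"].update(r["cited_urls"])        # unique cited URLs

for model, s in stats.items():
    rate = 100 * s["cited"] / s["prompts"]   # citation rate, e.g. 96.0%
    avg = s["citations"] / s["prompts"]      # avg citations per prompt
    print(f"{model}: {rate:.1f}% rate, {avg:.2f} avg, {len(s['urls'])} unique URLs")
```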
Each Model's Top-Cited Sources
The source preferences diverge sharply:

- ChatGPT: apnews.com (109), en.wikipedia.org (93), google.com (80), axios.com (50), techradar.com (42). Heavy news and reference orientation.
- Claude: pmc.ncbi.nlm.nih.gov (150), irs.gov (96), medium.com (66), yelp.com (59), angi.com (56). Strong academic, government, and directory preference.
- Gemini: youtube.com (163), medium.com (149), pmc.ncbi.nlm.nih.gov (91), reddit.com (75), dev.to (60). YouTube-heavy, with a community-platform orientation.
- Perplexity: youtube.com (209), irs.gov (66), pmc.ncbi.nlm.nih.gov (63), angi.com (38), homeadvisor.com (34). The most YouTube-dependent of any model.
- Google AIO: youtube.com (155), pmc.ncbi.nlm.nih.gov (126), medium.com (99), reddit.com (64), irs.gov (43). Academic and platform-heavy.
Per-Model Composite Weights
Here's where it gets actionable. We computed separate composite weights for each model, measuring which page-level factors each platform's cited pages score higher on than matched controls.
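To make the methodology concrete, here's a minimal sketch of one way such weights can be derived, assuming (as the d values below suggest) that each weight is a per-metric Cohen's d between cited pages and controls, normalized to sum to 100%. The function names and the normalization choice are ours for illustration, not a spec of the actual pipeline.

```python
import numpy as np

def cohens_d(cited: np.ndarray, control: np.ndarray) -> float:
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(cited), len(control)
    pooled_var = ((n1 - 1) * cited.var(ddof=1) +
                  (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
    return (cited.mean() - control.mean()) / np.sqrt(pooled_var)

def composite_weights(cited: dict[str, np.ndarray],
                      control: dict[str, np.ndarray]) -> dict[str, float]:
    """Normalize per-metric effect sizes into percentage weights."""
    d = {metric: cohens_d(cited[metric], control[metric]) for metric in cited}
    positive = {m: max(v, 0.0) for m, v in d.items()}  # assumption: negative effects get zero weight
    total = sum(positive.values())
    return {m: 100 * v / total for m, v in positive.items()}
```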
ChatGPT (1,806 cited pages)
Page Freshness 21.6% (d=0.133), Structured Metadata 20.1% (d=0.123), Crawl & Index Signals 14.9% (d=0.091), Multimodal Readiness 14.3% (d=0.087), Engagement Cues 12.3% (d=0.075). ChatGPT is the outlier. It's the only model that puts Page Freshness and Structured Metadata at the top — no other model weights either above 10%. If you're specifically optimizing for ChatGPT, keep your content dates current and your schema markup thorough. This aligns with ChatGPT's known preference for recent, well-structured news and reference content.
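As a quick illustration of the "dates current, schema thorough" advice, here's a sketch of an Article JSON-LD block generated in Python. The field values are placeholders, and which schema.org properties matter most is our assumption, not a finding from the dataset.

```python
import json
from datetime import date

# Illustrative Article markup: the two signals ChatGPT's weights point at
# are freshness (datePublished/dateModified) and structured metadata.
article_jsonld = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How 5 AI Platforms Cite Differently",
    "author": {"@type": "Person", "name": "Andrew Coffey"},
    "datePublished": "2026-03-24",
    "dateModified": date.today().isoformat(),  # bump this whenever the content is revised
}

# Embed the output in the page inside <script type="application/ld+json"> ... </script>
print(json.dumps(article_jsonld, indent=2))
```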
Claude (6,646 cited pages)
RAG Retrieval Suitability 14.7% (d=0.126), Crawl & Index Signals 14.4% (d=0.124), Content Relevance 13.9% (d=0.120), Citation Suitability 12.8% (d=0.110), AI Readability 11.6% (d=0.100). Claude's distribution is strikingly even: seven of its 10 metrics fall between 11% and 15%. There's no single dimension Claude strongly favors, which means the best Claude optimization strategy is broad improvement across all factors. Claude also values Citation Suitability (statistics, author info, dates) more than any other model.
Gemini (6,540 cited pages)
Crawl & Index Signals 15.9% (d=0.304), Content Relevance 13.6% (d=0.257), Engagement Cues 12.8% (d=0.242), RAG Retrieval Suitability 11.6% (d=0.219), Multimodal Readiness 10.9% (d=0.207). Gemini produces the strongest effects of any model — top composite d=0.304, nearly double any other model's top effect. Gemini strongly favors crawlability and content depth. This makes sense: Gemini is the only major model that renders JavaScript, so it sees more page content and differentiates more strongly on structural quality. Gemini is the model where page optimization produces the largest measurable lift.
Perplexity (3,516 cited pages)
Crawl & Index Signals 15.7% (d=0.221), Engagement Cues 13.2% (d=0.184), RAG Retrieval Suitability 12.9% (d=0.180), Content Relevance 12.4% (d=0.173), Multimodal Readiness 12.2% (d=0.170). Perplexity most closely resembles the cross-model average. As a pure RAG system (mandatory web search on every query), it makes sense that RAG Retrieval Suitability — clean, chunked, semantically structured content — is among its top priorities.
Google AI Overviews (4,007 cited pages)
Crawl & Index Signals 12.8% (d=0.191), RAG Retrieval Suitability 12.4% (d=0.184), Engagement Cues 12.4% (d=0.184), Content Relevance 12.0% (d=0.178), Domain Expertise 9.9% (d=0.148). Google AIO is the most balanced model. Its top metric is 12.8% and its bottom is 6.7% — the flattest distribution of any platform. No single factor dominates. Notably, Domain Expertise (9.9%) ranks higher here than for any other model — Google may lean more on author and organizational credibility signals than the others.
Cross-Model Consensus and Divergence
What every model agrees on: Crawl & Index Signals is the #1 or #2 metric for every model except ChatGPT. The basics (canonical tags, lang attributes, meta descriptions) consistently differentiate cited from non-cited pages regardless of platform. Where models diverge most:

- Page Freshness: strongest on ChatGPT (21.6%), weakest on Claude (3.7%)
- Structured Metadata: strongest on ChatGPT (20.1%), weakest on Claude (3.0%)
- Citation Suitability: strongest on Claude (12.8%), weakest on Gemini (4.5%)
- Domain Expertise: strongest on Google AIO (9.9%), weakest on Claude (3.6%)

In short: ChatGPT is the freshness and metadata model, Claude is the citation-quality model, Google AIO is the domain-expertise model, Gemini is the crawlability model, and Perplexity is the generalist.
Why We Use Volume-Weighted Scoring
With 5 models producing different weight sets, how do you score a page? We use volume-weighted cross-model analysis: each model's contribution to the overall weights is proportional to its cited page count. Gemini (6,815 cited URLs) has more influence than ChatGPT (2,075) because its larger sample gives it more statistical power. This produces the most stable, broadly applicable weight set. It doesn't optimize for any single model — it optimizes for the broadest possible citation surface. That's what Indexably's page scores use. If you care about a specific platform, the per-model weights tell you where to focus differently.
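Here's a minimal sketch of the volume-weighted combination, using the unique-URL counts above as the volumes. The per-model weight dicts are truncated to two metrics for readability, and the freshness values marked hypothetical are placeholders, not measured numbers.

```python
# Per-model weights (%, abbreviated to two of the ten metrics).
model_weights = {
    "chatgpt":    {"page_freshness": 21.6, "crawl_index": 14.9},
    "claude":     {"page_freshness": 3.7,  "crawl_index": 14.4},
    "gemini":     {"page_freshness": 6.0,  "crawl_index": 15.9},  # freshness value hypothetical
    "perplexity": {"page_freshness": 6.0,  "crawl_index": 15.7},  # freshness value hypothetical
    "google_aio": {"page_freshness": 6.0,  "crawl_index": 12.8},  # freshness value hypothetical
}
cited_urls = {"chatgpt": 2075, "claude": 6694, "gemini": 6815,
              "perplexity": 3740, "google_aio": 4369}

total = sum(cited_urls.values())
combined: dict[str, float] = {}
for model, weights in model_weights.items():
    share = cited_urls[model] / total            # each model's influence is its URL share
    for metric, w in weights.items():
        combined[metric] = combined.get(metric, 0.0) + share * w

# `combined` holds the volume-weighted cross-model weight for each metric;
# ChatGPT's outlier freshness weight is damped because its sample is the smallest.
print(combined)
```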
What To Do With This
- If you want broad AI visibility across all platforms: follow the priority order from Post 1 (fix HTML fundamentals first, then content structure, then metadata). The volume-weighted scoring captures the cross-model consensus.
- If ChatGPT specifically matters to you: prioritize freshness above everything else. Keep dates current, update content regularly, and invest in structured metadata (JSON-LD, Open Graph); ChatGPT weights these factors 2-3x more than any other model.
- If Google AI Overviews matters most: be well-rounded. Google AIO doesn't strongly favor any single factor, so invest in domain expertise signals (author info, organization schema) slightly more than you would for other platforms.
- If you're targeting Perplexity or Gemini (or serve a technical audience): invest in clean, semantically structured HTML. Both platforms value RAG-friendly content: clear headings, reasonable section lengths, semantic tags. Gemini especially rewards crawlability, and it's the model where page-level optimization produces the largest measurable lift.