What On-Page Signals Actually Differentiate AI-Cited Pages
We analyzed 18,129 pages cited by AI and 4,622 control pages, extracting 213 structural signals from each.
The 10 Scoring Composites, Ranked
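Crawl & Index Signals leads at 14.8% weight (d=0.158), followed by RAG Retrieval Suitability at 13.0%. The distribution is fairly flat: an even split across ten composites would give each 10%, so the top-weighted composite sits only modestly above that and no single dimension dominates.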
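The d values quoted here read like standardized effect sizes. As a minimal sketch, assuming d denotes Cohen's d with a pooled standard deviation (the text does not say which variant was used), the comparison between cited and control pages for one signal could be computed as follows. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def cohens_d(cited, control):
    """Standardized mean difference between two groups (pooled SD).

    Assumption: this is the common pooled-variance form of Cohen's d;
    the source only reports "d" without naming the formula.
    """
    n1, n2 = len(cited), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = ((n1 - 1) * variance(cited) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(cited) - mean(control)) / math.sqrt(pooled_var)


# Hypothetical signal values for illustration only.
example_cited = [2, 4, 6]
example_control = [1, 3, 5]
example_d = cohens_d(example_cited, example_control)
```

Under the conventional rule of thumb, values near 0.2 are considered small effects, which is consistent with reading d=0.158 as a modest difference between the two groups.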
Strongest Individual Signals
Lexical diversity (d=0.268) is the strongest non-behavioral signal. Cited pages use more varied vocabulary, have shorter paragraphs, and are more likely to have basic HTML hygiene.
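The text does not say how lexical diversity was measured. One common proxy is the type-token ratio (unique words divided by total words), sketched below purely as an illustration. The tokenizer regex and the function name are assumptions, and the study may have used a different, length-corrected measure.

```python
import re


def type_token_ratio(text):
    """Unique tokens divided by total tokens, in the range 0.0 to 1.0.

    Assumption: a simple lowercase word tokenizer; the study's actual
    tokenization and diversity metric are not described in the source.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# "the" repeats, so 5 unique tokens out of 6 total.
sample_ratio = type_token_ratio("The cat sat on the mat")
```

Raw type-token ratio falls as documents get longer, so comparing pages of very different lengths would normally call for a fixed-size window or another length-normalized variant.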