Tristan Watson, Founder · March 29, 2026 · 12 min read
AI SEO

How Perplexity Ranks and Cites Websites

Perplexity processes over 100M queries per month. Here is how it decides which websites to cite -- explained by a team that queries its API every day.

Why Perplexity SEO Is Different

Perplexity is not a search engine in the traditional sense. It does not return a list of ten blue links. It reads the web, synthesizes an answer, and attaches citations -- inline references to the sources it used. If your website is one of those citations, you get visibility. If it is not, you are invisible to that user.

This is a fundamentally different game from Google SEO. On Google, you optimize for ranking position. On Perplexity, you optimize for citation inclusion. There is no "page one." There is only "cited" or "not cited."

We know this from the inside. Our AI Citation Check feature queries Perplexity's Sonar API directly to track whether your website appears in AI-generated answers. We have run thousands of citation checks across hundreds of domains. What follows is what we have learned about how Perplexity selects its sources.


How Perplexity Discovers Content

Perplexity's answer engine has two layers: retrieval and generation. Understanding both is essential for optimization.

The retrieval layer

When you submit a query, Perplexity first retrieves candidate sources from its web index. This step is similar to traditional search -- it identifies pages that are topically relevant to the query. Perplexity's index overlaps heavily with what Google indexes, which means traditional SEO fundamentals still matter here. If Google cannot find your page, Perplexity probably cannot either.

But retrieval is not the whole story. Perplexity retrieves far more candidates than it ultimately cites. The real selection happens in the next layer.

The generation layer

Perplexity's language model reads the retrieved sources and generates a synthesized answer. During generation, it decides which sources to actually cite -- which pages contributed useful information to the final answer. This is where traditional SEO and Perplexity SEO diverge sharply.

A page can be retrieved but not cited. This happens when the page is topically relevant but does not contain a clear, extractable answer to the query. The model needs to be able to pull a specific fact, recommendation, or explanation from your content. If your page is vague, padded with filler, or buries its key points under walls of introductory text, it gets retrieved and discarded.


The 6 Signals That Drive Perplexity Citations

Based on our analysis of thousands of citation checks through the Sonar API, these are the signals that consistently separate cited sites from non-cited sites.

1. Direct answer density

Pages that state facts, figures, and conclusions explicitly get cited. Pages that hint, hedge, or require inference do not. Perplexity is looking for extractable statements -- sentences it can reference as a source for a specific claim in its answer.

2. Structural clarity

Content organized with clear headings, short paragraphs, and logical flow gets cited more often. The language model parses your page during generation. If it can quickly locate the relevant section, it cites you. If your content is a wall of text, it moves on to a source that is easier to extract from.

3. Topical authority signals

Sites that cover a topic comprehensively -- multiple pages on related subtopics, deep expertise, original data -- get cited over sites with a single shallow page. Perplexity's retrieval layer favors domains that demonstrate depth in a subject area.

4. Freshness and recency

For queries where recency matters (pricing, statistics, tool comparisons), Perplexity strongly favors recently updated content. We have seen pages lose citations within weeks of becoming outdated. Dates on your content matter.

5. Source diversity preference

Perplexity deliberately cites multiple sources per answer. It does not want to depend on a single domain. This means you are competing for one of 5-15 citation slots, not trying to be the only answer. Your content needs to be among the best sources, not the only source.

6. Machine-readable structure

Sites with llms.txt files, clean metadata, and structured data give the retrieval layer stronger signals about what the page contains. This is the technical foundation -- making it easy for AI to understand what your site is about before it even reads the content.


What Gets Cited vs What Gets Skipped

Here is a concrete breakdown from our citation data. These patterns hold across industries.

Gets Cited | Gets Skipped
"Our pricing starts at $49/mo for teams of 5-20" | "Contact us for a custom quote"
"The 3 most common causes of foundation cracks are..." | "Foundation problems can be caused by many things"
A comparison table with specific specs and prices | A page that says "we offer competitive pricing"
A blog post updated in 2026 with current statistics | A blog post from 2022 with outdated numbers
Clear H2 sections answering distinct questions | A 3,000-word page with no subheadings
Original research, case studies, proprietary data | Rewritten content that exists on 50 other sites

The pattern is clear. Perplexity cites content that makes specific, extractable claims. It skips content that is vague, generic, or forces the model to guess.


How the Sonar API Works (And Why It Matters)

Perplexity offers the Sonar API -- the same retrieval-augmented generation engine that powers perplexity.ai -- as a developer product. This is what we use for our citation checks at llmstxt.studio. Here is what happens under the hood.

When we send a query to the Sonar API, it returns two things: a generated answer and a structured citations array -- an ordered list of URLs that the model used as sources. This is not a list of "related links." These are the specific pages the model read and referenced while constructing its answer.

The citations array is the ground truth. If your domain appears in it, the model used your content. If it does not, the model either did not retrieve your page or retrieved it and chose not to cite it.

We check this programmatically. For each site on our platform, we generate queries across three categories -- brand discovery (can AI find this business?), topic authority (does AI treat this site as a knowledge source?), and competitive landscape (who dominates the broader space?) -- and run each query through Sonar. We then parse the citations array, check for the user's domain, and record which competitors appeared instead.

This is not a proxy metric. It is the actual citation data from the same engine that serves perplexity.ai users.
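A minimal version of this check can be sketched in Python. The endpoint, model name, and `citations` response field below follow Perplexity's published API documentation, but treat them as assumptions that may change; the parsing helpers are our own illustration, not platform code.

```python
# Sketch of a Perplexity citation check: normalize the Sonar
# `citations` array (a list of URLs) and test for a domain.
from urllib.parse import urlparse

def cited_domains(citations):
    """Reduce a list of cited URLs to a set of bare domains."""
    domains = set()
    for url in citations:
        host = urlparse(url).netloc.lower()
        domains.add(host.removeprefix("www."))
    return domains

def is_cited(citations, domain):
    """True if `domain` appears anywhere in the citations array."""
    return domain.lower().removeprefix("www.") in cited_domains(citations)

# A live call would look roughly like this (requires an API key;
# shape assumed from Perplexity's docs):
#
# import requests
# resp = requests.post(
#     "https://api.perplexity.ai/chat/completions",
#     headers={"Authorization": f"Bearer {API_KEY}"},
#     json={"model": "sonar",
#           "messages": [{"role": "user",
#                         "content": "best foundation repair company in Austin"}]},
# ).json()
# citations = resp.get("citations", [])
```

In practice we run this per query and per domain, which is why the parsing step is factored out from the network call.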


Perplexity GEO: A Practical Playbook

Generative engine optimization for Perplexity comes down to five actions. These are ordered by impact.

1. Make your content extractable

Rewrite key pages so that every important claim is stated explicitly in a single sentence. If someone asks "what does [your company] charge?", the answer should appear on your pricing page in a sentence that leads with the actual price. Do not make the model hunt for it.

2. Structure pages for machine reading

Use descriptive H2 headings that match common query patterns. "How much does foundation repair cost?" is a better heading than "Our Services." Each section should be independently useful -- a model should be able to extract value from one section without reading the whole page.

3. Deploy an llms.txt file

Give the retrieval layer a structured summary of your entire site. An llms.txt file lists your key pages with descriptions, making it trivially easy for AI to understand your site's scope and expertise. This is the equivalent of submitting a sitemap to Google -- a direct signal about what you offer.
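As a concrete illustration, a minimal llms.txt for a hypothetical business might look like this. All names and URLs are invented; the shape follows the llms.txt proposal -- an H1 title, a blockquote summary, then sections of linked pages with short descriptions:

```markdown
# Acme Foundation Repair

> Residential foundation repair in Austin, TX. We publish transparent
> pricing, inspection guides, and original cost data.

## Key pages

- [Pricing](https://acmefoundation.example/pricing): Repair costs by job type
- [Cost guide](https://acmefoundation.example/blog/cost-guide): 2026 foundation repair cost statistics
- [Warning signs](https://acmefoundation.example/blog/warning-signs): The 3 most common causes of foundation cracks
```

Each description does the same job as a meta description, but for a retrieval layer reading the whole site at once.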

4. Publish original, specific content

Perplexity cites sources that add unique value. If your blog post says the same thing as 20 other blog posts on the topic, you are competing for citation slots against all of them. Original data, case studies, unique analysis, and specific examples give you an edge.

5. Monitor and iterate

Run citation checks regularly to see if your changes are working. Track which queries cite you, which cite competitors, and which cite no one in your space. Use this data to identify gaps and prioritize content updates.
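Run-over-run comparison can be as simple as diffing which queries cited you between two checks. A sketch, assuming a data shape of our own invention (query mapped to a cited/not-cited flag), not a platform API:

```python
def citation_changes(before, after):
    """Diff two citation-check runs for one domain.

    `before` and `after` map query -> True if the domain was cited
    in that run. Returns (gained, lost) lists of queries."""
    gained = sorted(q for q, cited in after.items()
                    if cited and not before.get(q, False))
    lost = sorted(q for q, cited in before.items()
                  if cited and not after.get(q, False))
    return gained, lost
```

Queries in `lost` are the ones to investigate first -- they usually correspond to content that went stale or was displaced by a fresher competitor page.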


The Citation Position Myth

In Google SEO, position matters enormously. The difference between #1 and #5 is a massive drop in click-through rate. Perplexity citations work differently.

The Sonar API returns citations in a numbered array, and each citation is referenced inline in the answer text (e.g., [1], [2], [3]). But the position in the array does not correlate with visibility the way Google rankings do. A citation at position [7] can appear in the most important sentence of the answer, while [1] might support a throwaway introductory fact.

What matters is whether you are cited at all and in what context. Being cited as the source for the core recommendation is worth more than being cited first for a background detail. This is why we track citation presence, not citation rank, as the primary metric.
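Context can be recovered mechanically: split the answer text into sentences and map each inline [n] marker back to the citations array. A rough sketch -- the 1-indexed marker convention matches Sonar's inline references, while the sentence splitting is a naive heuristic of ours:

```python
import re

def citation_contexts(answer, citations):
    """Pair each inline [n] marker with its source URL and the
    sentence it appears in. `citations` is the ordered array from
    the API, so [1] refers to citations[0]."""
    contexts = []
    # Naive sentence split: break on whitespace after ., !, ?,
    # or a closing citation bracket.
    for sentence in re.split(r"(?<=[.!?\]])\s+", answer):
        for n in re.findall(r"\[(\d+)\]", sentence):
            idx = int(n) - 1
            if 0 <= idx < len(citations):
                contexts.append((citations[idx], sentence.strip()))
    return contexts
```

Scoring the sentence each citation supports (core recommendation vs. background fact) is what turns raw citation presence into a context metric.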


Perplexity vs Google: What to Optimize Differently

Factor | Google | Perplexity
Primary goal | Rank on page one | Get cited in the answer
Content length | Longer often ranks better | Concise and extractable wins
Keywords | Critical for ranking | Helpful for retrieval, not generation
Backlinks | Major ranking factor | Indirect signal via domain authority
Metadata | Title tags and meta descriptions | llms.txt + structured data
Freshness | Matters for some queries | Matters for most queries
Measurement | Search Console rankings | Citation checks via Sonar API

The key takeaway: Google rewards pages that match search intent. Perplexity rewards pages that answer questions directly. These often overlap, but not always. A page can rank #1 on Google by being comprehensive and well-linked. That same page can be skipped by Perplexity if it buries its answer under 800 words of context-setting.


Check Your Perplexity Citations

We built our AI Citation Check specifically for this. It queries Perplexity's Sonar API with prompts tailored to your business -- brand queries, topic queries, and competitive queries -- and shows you exactly who gets cited.

You see whether your domain appears, which competitors show up instead, and how your citation presence changes over time. No guessing. No manual spot checks. Actual citation data from the same API that powers Perplexity's answer engine.

Start with a free AI Readiness Check to see how AI-visible your website is right now. Takes 30 seconds.

Find out if AI recommends you

We scan your site, generate your AI profile, and continuously monitor 40 prompts about your business. $19/mo.

Get Started Free