Why Most ChatGPT Sources Aren’t Cited: Key Findings Revealed

When I look at how ChatGPT retrieves information, what stands out is that most of the sources it pulls in never make it into the final answer. According to a report by AirOps, 85% of the sources ChatGPT retrieves never appear in its final response.

Why this matters to me. If I’m aiming to have my content mentioned in AI-generated answers, it’s clear that simply being discovered by the AI isn’t sufficient. Most pages that get retrieved ultimately don’t get the exposure I’m hoping for.

Key insight. A page that ranks and is retrieved is not necessarily cited. My content has to align closely with the prompt, or with the context the answer draws on, to be chosen.

Per the report: the focus shifts from simply showing up in search results to how well my content is optimized for selection during the AI synthesis process.

By the numbers:

82,108 citations appeared in final responses, but only 15% of the retrieved pages were cited. That means 85% of the pages that surfaced during retrieval never made it into an answer.
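As a quick sanity check, the 15% figure can be roughly reproduced from the two totals quoted in this post: the 82,108 citations and the 548,534 pages AirOps examined. This is only a sketch of the arithmetic. It assumes each citation corresponds to one retrieved page, which the report may count differently, and the variable names are my own.

```python
# Totals quoted in this post from the AirOps report.
citations_in_answers = 82_108   # citations that appeared in final responses
pages_retrieved = 548_534       # pages examined across all prompts

# Assumption: one citation maps to one retrieved page.
citation_rate = citations_in_answers / pages_retrieved
uncited_rate = 1 - citation_rate

print(f"cited:     {citation_rate:.1%}")
print(f"not cited: {uncited_rate:.1%}")
```

Under that assumption the ratio lands at about 15%, in line with the headline number.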

Citation rates also varied based on query type:

18.3% for product discovery queries, 16.9% for how-to queries, and 11.3% for validation searches.
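To put those three rates side by side, here is a small sketch that expresses each one relative to the lowest (validation searches). The rates come from the report as quoted above; the dictionary layout and the choice of baseline are mine, for illustration only.

```python
# Citation rates by query type, in percent, as quoted above.
citation_rates = {
    "product discovery": 18.3,
    "how-to": 16.9,
    "validation": 11.3,
}

# Use the lowest rate as the baseline (my choice, not the report's).
baseline = citation_rates["validation"]
relative = {qt: rate / baseline for qt, rate in citation_rates.items()}

for query_type, ratio in relative.items():
    print(f"{query_type}: {citation_rates[query_type]}% ({ratio:.2f}x validation)")
```

Read this way, retrieved pages for product discovery queries were cited at roughly 1.6 times the rate of pages retrieved for validation searches.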

Fan-out queries. I noticed that when ChatGPT generates an answer, it often triggers additional internal searches, resulting in a “second citation surface.” This stood out in the dataset findings:

89.6% of prompts triggered two or more follow-up searches. Fan-out searches expanded 15,000 prompts into 43,233 queries. Notably, 32.9% of the cited pages came from these fan-out queries rather than from the original prompt.

95% of fan-out queries had zero traditional search volume.
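The fan-out numbers above imply an average expansion per prompt, which is easy to work out. The sketch below also estimates how many citations came from fan-out results, assuming the 32.9% share applies to the 82,108 citation total mentioned earlier. That pairing is my assumption, not something the report states, so treat the second figure as a rough estimate.

```python
# Figures quoted in this post from the AirOps report.
prompts = 15_000
total_queries = 43_233
fan_out_share_of_cited = 0.329   # share of cited pages that came from fan-out queries
citations_in_answers = 82_108    # assumption: the 32.9% share applies to this total

queries_per_prompt = total_queries / prompts
fan_out_citations = round(fan_out_share_of_cited * citations_in_answers)

print(f"average queries per prompt: {queries_per_prompt:.2f}")
print(f"estimated citations from fan-out results: {fan_out_citations:,}")
```

So each prompt expanded into a little under three queries on average, and about a third of all citations trace back to searches nobody typed directly.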

Google ranking correlation. In the data, pages that rank highly in Google were cited more often:

55.8% of cited pages ranked within Google’s top 20. Pages in Position 1 were cited 3.5 times more often than those outside the top 20.

About the data. AirOps examined 548,534 pages from 15,000 prompts to understand how ChatGPT expands queries and selects which citations to include.

The study. For those interested in diving deeper, check out The Influence of Retrieval, Fan-out, and Google SERPs on ChatGPT Citations.


Inspired by this post on Search Engine Land.


FAQs

What percentage of sources retrieved by ChatGPT are cited?

AirOps analysis shows only 15% of sources retrieved by ChatGPT are cited in final answers. The remaining 85% do not appear in the AI’s final response.

Why might a retrieved page not be cited?

Ranking and being retrieved do not guarantee that a page will be cited. Content must align closely with the prompt, or with the context the answer draws on, to be chosen.

How do citation rates vary by query type?

Citation rates differed by query type: 18.3% for product discovery, 16.9% for how-to, and 11.3% for validation searches. These figures show how intent affects citation likelihood.

What is fan-out in ChatGPT citations?

Many prompts trigger additional internal searches, creating a ‘second citation surface.’ In the dataset, 89.6% of prompts triggered two or more follow-up searches, expanding 15,000 prompts into 43,233 queries.

How does Google ranking affect citations?

55.8% of cited pages ranked within Google’s top 20. Pages in Position 1 were cited 3.5 times more often than those outside the top 20.

What is the scope of AirOps' data?

AirOps examined 548,534 pages from 15,000 prompts to understand retrieval and citations. The study is The Influence of Retrieval, Fan-out, and Google SERPs on ChatGPT Citations.
