Tag: Prompt Tracking

Why I Run Each Prompt Once Daily: The Data Behind It

I often get asked why I “only” run each prompt one time per day.

For me, the answer comes down to signal quality. Running a prompt once daily gives me enough consistent data to understand performance without overloading the process with unnecessary repetition.

The statistics show that a single daily run is plenty. It gives me a reliable view of how prompts behave over time, while keeping the workflow focused, efficient, and easier to interpret.

Inspired by this post on Try Profound Blog.

July 7, 2026
Prompt-Level AI Visibility: How I Measure What Matters

I do not measure AI search the same way I measure traditional search, because the user journey is no longer built around one query, one ranking page, and one click.

A prospect might ask ChatGPT for the best CRM for manufacturing companies, compare options in Google AI Mode, refine the requirements across several follow-up questions, and build a shortlist without ever visiting a website.

If my company appears in those conversations, I have influenced the buying process. The hard part is proving that influence with a measurement system I can trust.

Prompt-level visibility has become one of the fastest-growing areas of AI search optimization. It is also one of the easiest to misunderstand. I see plenty of promises about complete visibility into AI conversations, but the reality is far more complicated.

Here is how I think about what can be measured today, what cannot be measured reliably, and how I would build useful reporting despite the current limits.

A 5-step framework I use to track AI visibility

1. I accept that AI does not have traditional rankings

The first mistake I avoid is trying to recreate an old SEO ranking report. There is no universal position one inside ChatGPT.

The same prompt can produce different responses depending on conversation history, user location, personalization, follow-up questions, model version, web retrieval availability, and timing.

That means visibility is probabilistic rather than deterministic. Instead of asking, "Do we rank?" I ask, "How often are we included across the conversations that matter?"

That shift changes the entire measurement model.

2. I build a prompt library instead of only a keyword list

Keywords still matter, but I no longer treat them as enough on their own.

Instead of tracking only individual search terms, I build a library of prompts that reflects how real buyers research, compare, validate, and challenge their options.

I usually organize those prompts by intent. Discovery prompts ask for the best platforms in a category. Comparison prompts put vendors side by side. Evaluation prompts focus on specific use cases. Validation prompts ask whether a company is worth the cost. Objection prompts explore disadvantages. Alternative prompts ask what to use instead. Implementation prompts test how difficult a product may be to adopt.
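One way I might keep such a library organized is sketched below. The intent labels come from the paragraph above; the `TrackedPrompt` class, the helper names, and the sample prompts are hypothetical illustrations, not part of any particular tool.

```python
from collections import Counter
from dataclasses import dataclass

# Intent labels taken from the text; everything else here is illustrative.
INTENTS = (
    "discovery", "comparison", "evaluation", "validation",
    "objection", "alternative", "implementation",
)

@dataclass(frozen=True)
class TrackedPrompt:
    text: str
    intent: str

    def __post_init__(self):
        # Reject labels outside the agreed intent list.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent}")

def coverage_by_intent(prompts):
    """Count how many tracked prompts fall under each intent."""
    counts = Counter(p.intent for p in prompts)
    return {intent: counts.get(intent, 0) for intent in INTENTS}

def missing_intents(prompts):
    """List intents that have no prompts yet."""
    return [i for i, n in coverage_by_intent(prompts).items() if n == 0]

# Made-up sample prompts for a hypothetical CRM vendor.
library = [
    TrackedPrompt("best CRM for manufacturing companies", "discovery"),
    TrackedPrompt("Vendor A vs Vendor B for field sales teams", "comparison"),
    TrackedPrompt("is Vendor A worth the cost", "validation"),
    TrackedPrompt("what to use instead of Vendor A", "alternative"),
]

print(coverage_by_intent(library))
print(missing_intents(library))
```

A check like `missing_intents` is mainly useful as a reminder that the library should cover the whole buying journey rather than only discovery prompts.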

Instead of monitoring 10 keywords, I may monitor 200 to 500 prompts across the full buying journey. That gives me a much more realistic view of AI visibility.

3. I measure prompt clusters, not isolated questions

One prompt rarely tells me enough to make a decision.

For example, "best CRM software" might not mention my company, while "best CRM for manufacturing companies" might. A more specific prompt, such as "CRM for manufacturers with field sales teams," could return a different set of recommendations altogether.

That is why I group similar prompts into clusters. A category cluster might include best project management software, best PM platform, and project management tools. An industry cluster might include best CRM for healthcare, manufacturing, and finance. A feature cluster might include CRM with AI automation, forecasting, or enterprise sales support.

The patterns across those clusters are more reliable than the result from any single prompt.

4. I combine synthetic prompts with real customer questions

This is where measurement becomes more difficult.

Most organizations do not know exactly what customers are typing into AI assistants, so I often start by generating synthetic prompts. That may include expanding keyword research into conversational questions, creating AI-generated prompt variations, and building comparison, objection, and follow-up prompts.

Synthetic prompts are useful because they are repeatable, but I do not treat them as perfect. Generated prompts often sound cleaner and more structured than real user behavior.

A real buyer might ask something much richer, such as: "We are a 250-person SaaS company with a small HR team. We already use Workday but need something better for payroll. Budget is not a huge issue. What would you recommend?"

That is much more useful than a short phrase like "best payroll software."

For the strongest measurement program, I use synthetic prompts for consistent benchmarking and then supplement them with real questions from sales calls, customer interviews, support conversations, community discussions, internal search logs, on-site search, and AI transcripts that customers voluntarily share.

I also assume the prompt library will need to change. Customer language evolves, and the measurement set has to evolve with it.

5. I measure multi-turn conversations

Most AI-assisted buying journeys do not happen in a single prompt. A buyer may start by asking for the best cybersecurity vendors, narrow the list to companies strong in healthcare, ask which ones integrate with CrowdStrike, and then compare pricing.

My company may not appear in the first answer, but it may become highly recommended by the third response.

If I only measure the opening prompt, I miss a large share of meaningful visibility.

That is why I want prompt tracking to evaluate full conversation paths, not just one-shot questions. Multi-turn testing often reveals patterns that single prompts hide.
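A rough sketch of what that could look like in practice: record the responses for each turn of a scripted conversation, then check how visibility accumulates turn by turn. The function names, the brand "Acme", and the sample responses are assumptions for illustration, and the case-insensitive substring match is a simplification of real brand detection.

```python
def first_mention_turn(responses, brand):
    """Return the 1-based turn where the brand first appears, or None."""
    needle = brand.lower()
    for turn, text in enumerate(responses, start=1):
        if needle in text.lower():
            return turn
    return None

def cumulative_visibility(conversations, brand, max_turns):
    """Share of conversations that have mentioned the brand by each turn."""
    if not conversations:
        raise ValueError("need at least one conversation")
    firsts = [first_mention_turn(c, brand) for c in conversations]
    return [
        sum(1 for f in firsts if f is not None and f <= k) / len(firsts)
        for k in range(1, max_turns + 1)
    ]

# Made-up responses for two three-turn conversation paths.
conversations = [
    ["Vendors X and Y lead the category.",
     "For healthcare, consider Y.",
     "Acme integrates well here."],
    ["Acme and X are common picks.",
     "X is stronger in healthcare.",
     "Pricing varies by tier."],
]

print(cumulative_visibility(conversations, "Acme", max_turns=3))
```

In this toy data, measuring only the opening prompt would report half the visibility that the full three-turn paths show.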

The AI visibility metrics I care about most

Many traditional SEO metrics do not translate neatly to AI search. Rankings, clicks, and impressions still have value, but they no longer tell the whole story.

I focus on measurements that show whether a brand appears, how it is positioned, and how consistently it is recommended inside AI-generated responses.

Inclusion rate

If I could track only one AI visibility metric, I would start here.

Inclusion rate measures the percentage of tracked prompts where my brand appears in the AI response. If I monitor 500 prompts and my company appears in 185 of them, the inclusion rate is 37%.

That number is useful as a benchmark, but it becomes more valuable when I segment it by buying stage, product category, industry, geography, or AI model. Those slices often reveal opportunities that a single overall average would hide.
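The arithmetic is simple enough to sketch. The snippet below reproduces the 185-of-500 example and then slices a handful of made-up result rows by segment; the row fields (`model`, `stage`, `included`) and the function names are hypothetical.

```python
from collections import defaultdict

def inclusion_rate(hits, total):
    """Share of tracked prompts whose response mentions the brand."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total

def inclusion_by_segment(rows, key):
    """Inclusion rate per value of one segment field (e.g. model or stage)."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [hits, total]
    for row in rows:
        bucket = counts[row[key]]
        bucket[0] += bool(row["included"])
        bucket[1] += 1
    return {segment: inclusion_rate(h, t) for segment, (h, t) in counts.items()}

# The example from the text: 185 appearances across 500 tracked prompts.
print(f"{inclusion_rate(185, 500):.0%}")

# Made-up rows to show segmentation.
rows = [
    {"model": "model-a", "stage": "discovery", "included": True},
    {"model": "model-a", "stage": "comparison", "included": False},
    {"model": "model-b", "stage": "discovery", "included": True},
    {"model": "model-b", "stage": "comparison", "included": True},
]
print(inclusion_by_segment(rows, "model"))
print(inclusion_by_segment(rows, "stage"))
```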

Position within the response

Being mentioned is not the same as being recommended.

I want to know whether my brand appears as the first recommendation, one of the first few options, a late mention, or merely an alternative. If the AI response includes a comparison table, I also want to know where my company appears there.

AI answers do not have traditional rankings, but prominence still matters. A top recommendation is more likely to shape a buyer’s perception than a passing mention several paragraphs later.

Brand framing

Visibility tells me whether my brand is included. Brand framing tells me how it is described.

There is a meaningful difference between an AI system describing a company as "widely considered an enterprise leader" and describing it as "best suited for smaller teams." Both may sound positive, but they position the brand very differently.

I look for recurring themes around strengths, weaknesses, differentiators, pricing, ideal customer profile, and competitive comparisons. Over time, those patterns can expose messaging gaps in my own content or show how the broader web is shaping AI’s understanding of the brand.

Sentiment and confidence

Sentiment is more than a simple positive-or-negative label. I also want to know how confidently the AI system presents my brand.

"Company A is generally considered the strongest option" carries a very different level of conviction than "Company A may be worth considering."

Neither statement is negative, but they do not create the same buyer impression. Tracking confidence, uncertainty, caution, skepticism, and strong endorsement gives me a more nuanced view of how AI systems present the company to prospective customers.

Competitive share of voice

My own visibility is only part of the picture. I also need to know how often competitors appear alongside me or instead of me.

If my inclusion rate stays at 40% month after month, that may look disappointing. But if every major competitor dropped by 20 percentage points after a model update, the story changes: holding steady means I gained ground relative to the field.

On the other hand, if one competitor jumps from 35% inclusion to 70% while everyone else stays flat, I would want to investigate what changed.

Competitive share of voice helps me separate category-wide movement from changes that are specific to my brand.
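One simple way to make that separation concrete is to compare my month-over-month change with the typical change among competitors. This is only a sketch under my own assumptions: the median competitor change stands in for "category-wide movement", the brand names are placeholders, and the figures mirror the flat-at-40% scenario above.

```python
from statistics import median

def relative_movement(previous, current, brand):
    """Compare one brand's change in inclusion with the category's.

    previous/current map brand name -> inclusion rate in percentage points.
    Returns (brand_delta, category_shift, brand_delta - category_shift).
    """
    deltas = {b: current[b] - previous[b] for b in previous if b in current}
    others = [d for b, d in deltas.items() if b != brand]
    if brand not in deltas or not others:
        raise ValueError("need the brand and at least one competitor")
    category_shift = median(others)  # assumption: median = category movement
    return deltas[brand], category_shift, deltas[brand] - category_shift

# Placeholder data: my rate is flat while competitors drop 20 points.
previous = {"Us": 40, "Competitor A": 55, "Competitor B": 60, "Competitor C": 45}
current = {"Us": 40, "Competitor A": 35, "Competitor B": 40, "Competitor C": 25}

print(relative_movement(previous, current, "Us"))
```

A flat own-brand delta paired with a strongly negative category shift reads very differently from a flat delta in a flat category, which is the distinction the metric is meant to surface.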

How I view the AI visibility tool landscape

The market for AI visibility platforms has grown quickly. Each product approaches the problem differently, but most are trying to answer the same core questions: does my brand appear, how often does it appear, which AI models include it, which competitors show up, and how is the brand described?

Many platforms now include prompt libraries, competitive benchmarking, citation tracking, answer monitoring, and trend reporting. These features can reduce the manual work required to test hundreds or thousands of prompts on a recurring basis.

Still, I have to be clear about what these tools are and are not measuring.

No tool has access to every AI conversation happening in the wild. Most rely on controlled prompt libraries, repeatable testing environments, or sampled interactions to create a representative view of visibility.

That is useful, but it is not the same as observing every real user interaction.

What I still cannot reliably track

This is the part I do not want to gloss over.

Even though AI measurement is improving quickly, some data is still not observable. I cannot comprehensively track every prompt where my brand appeared, every conversation that influenced a purchase, every recommendation made inside ChatGPT, every citation shown to every individual user, or exactly how personalization changed a response.

I also cannot see every multi-turn conversation across every AI platform or know how often someone acted on an AI recommendation without clicking a link.

The underlying AI platforms do not expose that level of data. If a vendor claims it can see every AI conversation involving my brand, I would ask exactly how that information is being collected.

The practical framework I would build

Rather than chasing perfect attribution, I focus on building a repeatable measurement system that I can track consistently over time.

For visibility, I would track inclusion rate, competitive share of voice, prompt coverage, and model coverage.

For response quality, I would track position within the response, brand framing, sentiment, and message consistency.

For technical signals, I would track citation frequency, content retrieval success, entity consistency, and freshness.

For business outcomes, I would look at AI referral traffic, assisted conversions, branded search lift, direct traffic trends, and pipeline influenced by AI discovery.

No single metric tells the full story. Together, these signals give me a more complete picture of how the brand is showing up and how it is being perceived across AI-assisted research.

The goal is not perfect measurement

Prompt-level visibility is not as mature as keyword tracking became over the past two decades.

Some signals are still emerging. Others remain inaccessible because AI platforms do not expose the underlying data. At the same time, user behavior is changing almost as quickly as the technology itself.

That does not mean measurement is impossible. It means the objective has changed.

Instead of trying to reconstruct every AI conversation, I focus on building a representative prompt library, tracking visibility consistently, benchmarking against competitors, and understanding how my brand is being framed.

Those trends are far more actionable than chasing a level of precision the current ecosystem cannot support.

The organizations making the most progress in AI search are not waiting for perfect attribution. They are establishing baselines, watching for meaningful movement, and adapting as both AI models and user behavior continue to evolve.

Inspired by this post on Search Engine Land.

July 6, 2026
How I Turn Proprietary Data Into AI Citations
When I want a page to feel genuinely original, I start with original numbers. They are still one of the most reliable ways to make content stand apart, especially when those numbers come from the business itself instead of a one-off study created just to fill a content calendar.

The old approach was to pay a PR or research firm for a loosely related survey, like a car insurance FinTech commissioning road-trip research to earn a mention in Yahoo. I see that play as increasingly outdated. Almost every product now creates data worth publishing, and extracting that data is easier than it has ever been.

I do not need a full research department to compete here. The bar for standing out is lower than many teams assume.

First-party data: The strongest correlate of originality

On-Page.ai’s recent information gain study scored 150 top-3 Google pages across 50 keywords and 10 verticals. The study looked at how much each page added beyond the rest of its ranking cohort, grading contribution from 0 to 100 by meaning rather than wording.

The median page scored 52. More importantly, original data correlated with that score more strongly than any other page-level trait, including content length.

Pages with at most 1 unique figure averaged an information gain score of 40.2. Pages with 15 or more unique figures averaged 62.1, and the score increased steadily at every step in between.

The good news is that the bar is not especially high. The study found that top organic results include only 4 unique data points on average. If I publish a page with more than 4 real original claims, figures, or answers, I create another lever for earning visibility in increasingly competitive organic search.

The analysis also found that almost every search leaves adjacent questions unanswered. On-Page used synthetic reader questions, meaning plausible related questions generated for the study, and found room for new pages to answer those questions more completely. That immediately reminds me of query fan-out.

I saw a similar pattern in an analysis of ChatGPT citations.

“A single evergreen page covering 10+ query intents is worth more in AI citation reach than 10 single-intent pages. The ROI of comprehensive content is front-loaded: one well-built page compounds citation reach over time. The long tail exists, but the top 5% of pages capture a disproportionate share of ongoing citation activity.” – The science of how AI picks its sources

That is why I believe high-intent prompts should be monitored across the full buyer journey. I would map them across the five stages from Reasoning Lift: Problem, Exploration, Comparison, Validation, and Selection. I would also use more accurate AI prompt tracking to understand where those questions emerge, then answer them with the kind of knowledge only the brand can provide.

My main takeaway is simple: most pages are only middling on originality, genuinely original pages are still a minority, and scoring high enough to stand out is achievable without an extraordinary lift.

The limitation is just as important. This study focuses on classic search visibility and rankings, which makes sense because the SEO concept of information gain comes from Google patent language. It does not analyze AI citations or mentions, and it does not appear to include AI Mode or AI Overviews.

Caveat: Being the primary source may not win the citation

This is the part of proprietary data advice I think gets skipped too often. Everyone says to publish original research. Far fewer people test whether AI rewards the brand that created the number or the page that presents it in the clearest, most extractable way.

More data analysis is still coming, but based on analyses completed at Growth Memo over the last year, I already see two patterns worth paying attention to.
- The entity types that predict ChatGPT citations the most are DATE and NUMBER (from The science of what AI actually rewards). Highly cited pages tend to be dense with specific entities, such as a particular methodology, a precise statistic, or a named comparison. Even when another source picks up my proprietary findings and gets cited instead, those external third-party authority signals can still build over time.
- Entity-richness and balanced sentiment matter (from The science of how AI pays attention). Generic advice is vague and risky. Specific entities are grounded and verifiable. Proprietary data lets me produce and validate entity-rich content at the same time. I can explain what percentage of costs a feature saves, how many hours clients save, or how performance compares with previous vendors. When I add balanced sentiment to the analysis and explanation, I get a stronger tactic from the same asset.
If the hypothesis holds that first-party data is crucial in the era of AI search, then publishing proprietary data is necessary, but it is not enough. LLM extraction structure, along with the sites AI search engines already trust for a topic, helps decide who actually earns the citation, even when the brand owns the data.

That is the frustrating part: an aggregator can repackage my benchmark into a cleaner, answer-ready page and collect the citation my research earned.
- Who wins: Brands that already have proprietary product, usage, or pricing data and also structure that data for extraction while continuing to build organic brand authority. This connects directly to How to build an AI SEO strategy that outlasts tactics.
- Who loses: Brands publishing opinion content that any tool can replicate, brands ignoring off-site authority, and primary sources that bury their own numbers inside narrative instead of surfacing them clearly.
I do not yet know whether some verticals reward data content more than others. The science series found that citation signals vary sharply by vertical, so I would be surprised by a uniform payoff. Still, I would not claim a pattern without data.

How to structure data for extraction

Owning the data gets me into the visibility race. How I structure that data may decide whether I win the citation.

In an analysis of 18,012 verified ChatGPT citations, we found a ski-ramp distribution: 44.2% of all citations came from the first 30% of a page. The middle 30-70% earned 31.1%, and content buried deep in a long post was roughly 2.5x less likely to be cited.

The follow-up analysis across 7 verticals made the target even clearer. The 10-20% band of a page is where AI reads hardest in every vertical, while the first 10% is usually navigation and intro filler that AI skips. The bottom 10% of any page earns only 2.4-4.4% of citations regardless of vertical.

When I apply that to a data study, the structure becomes straightforward.
- I lead with the headline statistic. My strongest number belongs in the first 30% of the page, ideally right after the title block where the 10-20% band begins. I want the number, the comparison, and the implication visible quickly.
- I define the metric immediately. I include one sentence explaining what the number measures and which population it covers. An undefined statistic is harder to extract with confidence.
- I box the methodology. I make the sample size, time window, and collection method easy to find in a short labeled block. Attribution confidence is part of what makes a number citable.
- I front-load every secondary finding. I rank findings by strength, with the strongest first. A 20-paragraph narrative buildup may help human suspense, but it can cost machine citations.
- I skip the suspense close. AI reads more like a busy editor than a patient student. The payoff-at-the-end structure that worked for ultimate guides often works against extraction.
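To check a draft against that checklist, I could measure where the headline statistic actually lands on the page. The sketch below measures depth by character offset in the extracted page text, which is my own simplifying assumption and not necessarily how the cited analyses measured page position; the function names and the sample page are hypothetical.

```python
def relative_position(page_text, snippet):
    """Fraction of the page (by characters) that precedes the snippet."""
    index = page_text.find(snippet)
    if index == -1:
        return None
    return index / len(page_text)

def placement_report(page_text, snippet):
    """Flag whether a snippet sits in the bands discussed in the text."""
    pos = relative_position(page_text, snippet)
    if pos is None:
        return {"found": False}
    return {
        "found": True,
        "position": pos,
        "in_first_30_percent": pos < 0.30,
        "in_10_to_20_band": 0.10 <= pos < 0.20,
        "in_bottom_10_percent": pos >= 0.90,
    }

# Made-up page: 100 characters of intro, the stat, then a long body.
headline_stat = "62% of accounts renew early"
page = "x" * 100 + headline_stat + "y" * (900 - len(headline_stat))

print(placement_report(page, headline_stat))
```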
This post first appeared on the author’s website and is republished here with permission.

Inspired by this post on Search Engine Land.

July 3, 2026
Mastering Prompt Tracking: Strategies for Accurate AI Insights

I’ve come to realize that prompt tracking is often misunderstood as mere noise, but it’s actually a golden opportunity to refine AI interactions through a structured approach.

AI responses can be unpredictable. However, by utilizing repeated runs, establishing fixed sampling rules, and calculating confidence intervals, we can transform variance into a trustworthy metric.
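As a minimal sketch of the confidence-interval step, the snippet below computes a Wilson score interval for an inclusion rate observed over repeated runs. The Wilson interval is my choice of method for illustration, not something the original post specifies, and the run counts are made up.

```python
import math

def wilson_interval(hits, runs, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= hits <= runs:
        raise ValueError("hits must be between 0 and runs")
    p = hits / runs
    denom = 1 + z**2 / runs
    center = (p + z**2 / (2 * runs)) / denom
    margin = z * math.sqrt(p * (1 - p) / runs + z**2 / (4 * runs**2)) / denom
    return max(0.0, center - margin), min(1.0, center + margin)

# Made-up example: the brand appeared in 37 of 100 runs, then 370 of 1,000.
for hits, runs in [(37, 100), (370, 1000)]:
    low, high = wilson_interval(hits, runs)
    print(f"{hits}/{runs}: {low:.3f} to {high:.3f}")
```

The same observed rate gets a much tighter interval with more runs, which is the sense in which fixed sampling rules turn variance into something trackable.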

By embarking on this journey with me, you’ll soon be equipped to create a reliable AI tracking system.

You’re already ahead if you’ve embraced persona-based prompt design as discussed in Synthetic Personas for Better Prompt Tracking.

For those immersed in AI SEO strategies, understanding the true trajectory of your efforts over the noise is crucial. Explore more with How Much Can We Influence AI Responses.

While many have dismissed prompt tracking due to its variability, I’ve discovered that it mirrors the unpredictability seen in weather forecasts and credit scoring, which are still meticulously tracked.

Reflecting on keyword tracking’s evolution, I see a parallel path for prompt tracking, which requires adapting its methodology to account for the numerous platforms now at play.

At pivotal industry events, experts speak of a shift from single search queries to a conversational model, emphasizing the changing landscape we must adapt to.

The shortcomings of current prompt-tracking tools are evident in their lack of innovation, yet I believe we can rise above with a more strategic approach.

Although single-turn prompts provide limited insight, constructing full conversational sequences reveals persistence, a vital metric often overlooked.

Imagine tracking a B2B SaaS CRM journey through defined stages, extending prompts to capture decision-making across multiple touchpoints to truly gauge influence.

HubSpot’s visibility across platforms like ChatGPT and Perplexity illustrates the nuanced understanding needed to strategize investments in brand-centric content.

The future of prompt tracking resembles opinion polling, employing systematic and repeatable methodologies to extract meaningful data amidst variability.

This piece first appeared on the author’s website and is shared with permission here.

Inspired by this post on Search Engine Land.

June 10, 2026
Why AI Searches Differ: Insights from ChatGPT and Beyond

Whenever I type a question into an AI engine, I’ve noticed that the engine doesn’t just search for the exact words I typed. Instead, it explores a broader spectrum of possibilities. This behavior intrigues me.

Recently, I came across a fascinating study by Profound. They monitored 10,000 prompts across various AI platforms like ChatGPT, Copilot, and Perplexity over two weeks. The findings highlighted remarkable differences in how these AI engines search and process queries.

Inspired by this post on Try Profound Blog.

April 30, 2026
Revealing Trends: Only 10% of ChatGPT Prompts Trigger Shopping

After tracking an incredible 2 million ChatGPT prompts, I found a surprising trend: shopping appears in fewer than 10% of them. Across nine months of data, a staggering 79% of prompts simply never activated a shopping response.

What intrigued me further was the persistence of those that did trigger shopping. There was an impressive 83% chance they would do so again the following day. However, this persistence isn’t indefinite. Model updates seem to wash away those triggers overnight.
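For readers who want to compute a similar next-day persistence figure on their own tracking data, here is a small sketch. The input shape (one list of daily true/false flags per prompt) and the function name are assumptions; this is not the method used in the original study.

```python
def next_day_persistence(daily_flags):
    """Share of triggered days that were followed by another triggered day.

    daily_flags maps prompt -> list of booleans, one per consecutive day.
    Returns None when no prompt ever triggered before its final day.
    """
    repeated = 0
    triggered = 0
    for flags in daily_flags.values():
        for today, tomorrow in zip(flags, flags[1:]):
            if today:
                triggered += 1
                repeated += bool(tomorrow)
    return repeated / triggered if triggered else None

# Made-up three-day history for two prompts.
history = {
    "prompt-a": [True, True, False],
    "prompt-b": [False, True, True],
}
print(next_day_persistence(history))
```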

In my quest to understand these patterns, I analyzed 26 million prompts across 13,000 categories. The goal was to pinpoint where shopping emerges, how reliable this occurrence is, and what insights this holds for brands shaping their strategies on a platform where responses are sparsely shopping-oriented.

Inspired by this post on Try Profound Blog.

March 5, 2026
Unlock AI Prompts in Google Search Console: A Step-by-Step Guide

I’ve been asked numerous times about how to track prompts effectively, especially by those using tools like Profound, Athena, and Peec. The big question on everyone’s mind is, “Which prompts are worth tracking?” In this ever-evolving landscape, it’s challenging to determine what buyers are querying about my company when they use LLMs.

Currently, there isn’t a reliable data source that puts my mind at ease. Unlike traditional search with publicly available Keyword Planner data, it’s unlikely that OpenAI or Google will fully release this kind of data for analysis. Though there have been recent proposals by the UK CMA about Google and data transparency, I’m not holding my breath for significant change.

Long story short, LLM tracking feels like navigating a black box. So, are there any alternative data sources we can use to decide which prompts to track? Perhaps.

Back in November, Jason Packer published an interesting report highlighting how ChatGPT searches, some containing PII, accidentally leaked into Google Search Console reports. When this was confirmed by Ars Technica, OpenAI stated the problem affected only a small number of queries.

This confirmed, for me, that ChatGPT queries do appear in some Search Console profiles. While privacy implications are significant and beyond this article’s scope, it shows that LLM queries are not impossible to capture.

Additionally, Barry Schwartz has reported that AI Mode data is available in Search Console. This supports the idea that Search Console can track how users interact with LLMs.

Based on my analysis, AI data does appear to surface in this area. By applying specific filters, I’ve noted steady increases in impressions over recent months, coinciding with Google’s roll-out of AI Mode features.

So, how can I access user prompt data in Search Console? The key is focusing on longer queries. Using regex, we can filter queries with 10 or more words, unveiling prompt-like behavior:

1. Navigate to Search Console Performance > Search Queries

2. Select Add Filter > Query

3. Choose Custom Regex

4. Input: ^(?:\S+\s+){9,}\S+$
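The same filter can be applied offline to an exported query list. The sketch below uses the pattern from step 4 written with its backslash escapes (`\S` for non-whitespace, `\s` for whitespace), which matches nine "word plus whitespace" groups followed by a final word, so 10 or more words in total. The sample queries are made up, and the function name is hypothetical.

```python
import re

# Ten or more whitespace-separated words.
LONG_QUERY = re.compile(r"^(?:\S+\s+){9,}\S+$")

def prompt_like(queries):
    """Keep only queries with 10 or more words."""
    return [q for q in queries if LONG_QUERY.match(q.strip())]

# Made-up examples, not real Search Console data.
queries = [
    "best crm software",
    "what is the best crm for a small manufacturing company today",
    "compare two payroll tools for a remote team of fifty",
    "one two three four five six seven eight nine",
]

for query in prompt_like(queries):
    print(query)
```

Search Console's regex filter uses RE2 syntax rather than Python's `re` engine, but this particular pattern relies only on constructs both support.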

This method revealed understandable, prompt-styled queries when applied to various properties. Though the actual data cannot be shared, examples such as “Map out a full day in Glacier National Park…” highlight the trend.

Mind you, there’s no direct evidence these queries originate from ChatGPT or similar AI platforms. It’s possible they reflect new user behavior patterns within Google.

Regardless, analyzing these conversational query patterns provides invaluable insight into how customers search using longer strings.

Will Critchlow wisely said, “we’re doing business, not science.” In our shift toward less attributed, zero-click data collection, the choice to leverage this available data is up to us.

Currently, my preferred tool for prompt analysis is Claude. Its results are reliably robust, and its visualizations are effective. Integrating Claude into existing frameworks streamlines the process.

After export, uploading prompt lists to Claude lets it perform behavioral analysis, identifying data themes and trends for better prompt tracking.

Posing specific questions to Claude about customer behavior opens a treasure trove of insights. Analyzing this data reveals learning opportunities I would not have anticipated.

For instance, I discovered that searches probing a PR issue from over three years ago are still frequent, and that searches often use one company as a benchmark against its competitors.

Finally, leveraging Claude to suggest new prompt-tracking methods, based on this data, offers an informed way to continually hone tracking efforts.

While there’s no definitive system for selecting which prompts to track, incorporating Search Console data provides a clearer direction. The insights derived can help unearth unique user prompts and discern scalable themes for ongoing data tracking.

This piece originally appeared on the Nectiv blog [as How To Mine Google Search Console For Conversation Data (Regex Included)] and is republished with permission.

Inspired by this post on Search Engine Land.

February 27, 2026