The hedging problem: why most AI-search reports are useless to a CMO
If your AI-visibility report counts every mention as a win, it's lying to you. The single biggest measurement bug in this category is treating a hedged mention the same as a recommendation. Here's what to demand instead.
Show me the AI-visibility report a vendor handed your team last quarter. I’ll bet you a coffee it has a number on the cover that says something like “your brand was mentioned in 47% of AI answers in your category.” That number is almost always wrong, in a specific and predictable way: it counts a hedged mention the same as a recommendation. The report treats “you might also consider Brand X” as identical to “Brand X is the leader in this space.”
Those two sentences are not the same. They do not move pipeline the same. They do not signal the same level of category authority. They do not even mean the same thing to the buyer reading them. And yet most of the AI-visibility category flattens both into the same “mentioned” bucket, because doing so is easy and produces a bigger number for the vendor’s own pitch deck.
This post is about why the hedging problem matters, how it warps decisions, and what an honest measurement framework looks like. If you take one thing away: refuse to look at any AI-visibility metric that doesn’t separate endorsement from hedge.
The anatomy of a hedge
Walk through the typical “mention” an AI-visibility tool will count as a win. The buyer’s prompt is something like “what’s the best CRM for a 50-person sales team.” The model answers with a paragraph. Halfway through, your brand appears in a sentence that reads:
“Other tools in this space include [Competitor A], [Competitor B], and [Your Brand], though the right choice depends on your specific workflow.”
Congratulations, your dashboard ticks up. You were “mentioned.” In reality, you were name-dropped at the bottom of a list, behind two competitors, and immediately undercut by a hedge. A buyer reading that paragraph will not contact you. They will contact whoever the model named first with conviction.
Now compare to a different answer to the same prompt:
“[Your Brand] is widely considered the strongest choice for sales teams of that size, particularly because of its [specific feature]. Pricing is competitive and the onboarding is well-regarded.”
Same brand, same prompt, same engine. The first answer is a hedge that costs you a deal. The second is the kind of endorsement a backlink-era SEO team would have spent $50,000 of agency time to engineer. Treating them as the same data point is malpractice.
Why most vendors hide this from you
The cynical answer is the right one: separating endorsement from hedge makes the topline number smaller. A vendor whose whole pitch is “we’ll show you how often you’re mentioned in AI answers” has a commercial incentive to count generously. The bigger the percentage, the better the dashboard looks in the demo, the easier the renewal.
There’s also a technical excuse. Distinguishing an endorsement from a hedge requires the measurement system to do real natural-language analysis on the AI’s own output — not just string-match for your brand name. That’s more expensive to build, more expensive to run, and harder to explain. Some vendors genuinely don’t do it because they haven’t built the capability. Others don’t do it because doing it would shrink the headline number they sold you on.
Either way, the buyer of an AI-visibility tool is the one getting the bad data.
The five buckets every honest report should use
A measurement system that takes the hedging problem seriously will sort every brand mention into roughly five categories. These don’t have to be exposed verbatim in the dashboard, but they should exist somewhere in the underlying scoring.
- Recommendation. The AI explicitly names your brand as the right answer or one of the top one or two right answers. Example: “The leading option is X.” This is the only bucket that consistently moves pipeline.
- Positive mention. The AI names your brand favorably but as one of several roughly-equal options. Example: “X is well-regarded for [feature].” This builds authority over time but rarely closes a deal on its own.
- Neutral mention. The AI names your brand factually with no qualitative valence. Example: “X offers a tool in this category.” This counts as presence — better than absence — but it is not endorsement.
- Hedged mention. The AI names your brand and immediately qualifies away its endorsement. Examples: “though it depends on your needs,” “may not be suitable for,” “some users have reported.” A hedged mention is barely better than not being named at all, and in some buyer journeys it’s worse — it signals to the buyer that they should worry.
- Negative mention. The AI names your brand while attributing a problem, weakness, or warning to it. This is rare in B2B prompts but lethal when it happens. A good measurement system flags negative mentions for human review the same week they appear.
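The five buckets above can be sketched as a simple classifier. This is an illustrative toy, not a production scorer: the cue phrases, the brand name `Acme`, and the precedence order (negative, then hedged, then recommendation, then positive, then neutral) are all assumptions for the example; a real system would use semantic analysis rather than keyword matching.

```python
import re
from enum import Enum
from typing import Optional

class MentionBucket(Enum):
    RECOMMENDATION = "recommendation"
    POSITIVE = "positive_mention"
    NEUTRAL = "neutral_mention"
    HEDGED = "hedged_mention"
    NEGATIVE = "negative_mention"

# Illustrative cue phrases only; a real system needs NLP, not string lists.
NEGATIVE_CUES = ["reported issues", "not suitable for most", "a common complaint"]
HEDGE_CUES = ["depends on your", "might also consider", "may not be suitable",
              "some users have reported", "other tools in this space include"]
RECOMMEND_CUES = ["leading option", "strongest choice", "widely considered the"]
POSITIVE_CUES = ["well-regarded", "strong option", "popular choice"]

def classify_mention(answer: str, brand: str) -> Optional[MentionBucket]:
    """Sort one AI answer's treatment of `brand` into a bucket (None if absent)."""
    if brand.lower() not in answer.lower():
        return None
    # Score only the sentence(s) that actually name the brand.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer)
                 if brand.lower() in s.lower()]
    text = " ".join(sentences).lower()
    if any(c in text for c in NEGATIVE_CUES):
        return MentionBucket.NEGATIVE
    if any(c in text for c in HEDGE_CUES):   # hedges override praise nearby
        return MentionBucket.HEDGED
    if any(c in text for c in RECOMMEND_CUES):
        return MentionBucket.RECOMMENDATION
    if any(c in text for c in POSITIVE_CUES):
        return MentionBucket.POSITIVE
    return MentionBucket.NEUTRAL
```

Run against the two example answers from earlier, the list-plus-qualifier sentence lands in the hedged bucket and the “widely considered the strongest choice” sentence lands in the recommendation bucket — the separation the dashboard should be making.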
How the hedging problem warps real decisions
We’ve seen four specific failure modes in marketing teams that bought AI-visibility tools without demanding endorsement separation. They are worth describing because they’re not hypothetical:
- The false-confidence board update. CMO reports “we’re mentioned in 60% of AI answers in our category.” Board approves another year of budget. The actual recommendation rate is 8%. Pipeline does not improve. The vendor is not asked back the following year, but the budget cycle is already locked.
- The wrong content investment. Team sees their “mention rate” is high and concludes they’re winning. They reduce content investment in the category. Six months later, recommendation rate has collapsed because competitors invested while they coasted. The hedged mentions kept the topline number high while the underlying quality eroded.
- The misallocated PR budget. Team chases coverage in a publication that does increase their mention rate, but in articles the AI models read as ambivalent. Coverage goes up, recommendation rate goes flat. The PR firm gets renewed; the brand stays stuck.
- The phantom competitive threat. A competitor’s “mention rate” surges and sets off internal panic. Closer inspection shows their surge is almost entirely hedged mentions of the “some users have reported issues with” variety — the AI is talking about them more, but warning buyers off them at the same time. The right competitive response was to do nothing.
What to demand from your measurement system
You don’t need to build any of this yourself. You just need to refuse to accept reports that don’t separate these signals. Three concrete demands a CMO can put to any AI-visibility vendor today:
- Show me the recommendation rate, separately from the mention rate. If the vendor can’t do this, they are scoring on string-match alone. Walk away or accept that the headline number is decorative.
- Show me an example of a hedged mention you flagged last week. A vendor that takes hedging seriously will be able to pull a real example in 30 seconds and show you the exact qualifier that triggered the flag. A vendor that doesn’t will hedge themselves.
- Tell me what counts as an endorsement in your system. Get the operational definition. If the answer is anything close to “the model uses your brand name,” you are buying noise. The honest answer should describe a multi-step semantic check on the AI’s own output: position in the answer, presence of qualifiers, whether competitors are mentioned more favorably in the same response, and so on.
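To make the third demand concrete, here is a minimal sketch of what a multi-signal endorsement check could look like, combining the three signals named above: where the brand appears in the answer, whether its sentence carries hedging qualifiers, and how many competitors are named before it. The weights, cue phrases, and brand names are illustrative assumptions, not a real vendor’s scoring model.

```python
import re

# Illustrative hedging qualifiers; assumption, not an exhaustive list.
HEDGE_CUES = ["depends on your", "may not be suitable", "some users have reported"]

def endorsement_score(answer: str, brand: str, competitors: list[str]) -> float:
    """Toy endorsement score in [0, 1]; weights are uncalibrated examples."""
    low = answer.lower()
    pos = low.find(brand.lower())
    if pos == -1:
        return 0.0  # not mentioned at all
    # Signal 1: earlier in the answer = stronger endorsement.
    position_score = 1.0 - pos / max(len(low), 1)
    # Signal 2: penalize hedging qualifiers in the brand's own sentence.
    sentence = next((s for s in re.split(r"(?<=[.!?])\s+", answer)
                     if brand.lower() in s.lower()), "")
    hedge_penalty = 0.5 if any(c in sentence.lower() for c in HEDGE_CUES) else 0.0
    # Signal 3: penalize each competitor named before the brand.
    named_ahead = sum(1 for c in competitors
                      if 0 <= low.find(c.lower()) < pos)
    competitor_penalty = 0.15 * named_ahead
    return max(0.0, min(1.0, position_score - hedge_penalty - competitor_penalty))
```

Scored this way, the “Other tools in this space include…” answer from earlier collapses toward zero (late mention, hedge, two competitors ahead), while the “strongest choice” answer scores near the top. Any vendor claiming a semantic check should be able to describe their version of these signals in comparable detail.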
The hedging problem is not a niche measurement quibble. It is the difference between a CMO who walks into the next executive review with a real read on AI visibility and one who walks in with a confidence-inflated number that will embarrass them by Q3. The fix isn’t hard. It just requires saying out loud that not all mentions are equal, and refusing to spend money with vendors who pretend they are.
Written by The Enso team. Have a question or correction? Email us.