Share of AI Voice — How to Measure It Honestly for Local Business
Share of AI voice is the share of business names in buyer-intent AI answers that belong to you rather than to tracked competitors on the same prompts. Measure it by sampling multiple platforms with a consistent prompt library, logging mention rates per engine, and resampling monthly. No vendor can guarantee placement, but an honest baseline shows whether your signal work is moving the needle.
Why share of AI voice exists
For decades, local marketing teams reported share of voice in paid media — how often your brand appeared in ad auctions versus competitors. Organic search added share of SERP — how many page-one slots you held for tracked keywords.
AI assistants introduce a third layer. When a buyer asks ChatGPT, Gemini, or Perplexity who to hire, the model typically names zero to three businesses in fluent prose. There is often no click. There is often no traditional ranking position to screenshot.
Share of AI voice (SOAV) answers a competitive question: of the times AI names anyone in your category near your market, how often is it you?
If your business fills 12 of the 100 name slots in a sample and your three tracked competitors fill the other 88, your SOAV is 12 ÷ (12 + 88) = 12%. That number is more actionable than "ChatGPT knows us" or "we rank #2 on Google."
This guide explains how to measure SOAV honestly — without inflated guarantees, without single-screenshot vanity, and without confusing SEO metrics with AI recommendation metrics.
Extended context: What is AEO? · AEO services.
Share of AI voice vs mention rate
These terms overlap in conversation but serve different decisions.
Mention rate is binary at the prompt level: on this specific buyer-intent question, did the model name your business yes or no? Aggregate mention rate is the percentage of prompts where you appear at least once.
Share of AI voice is competitive: when models name businesses in your category, what fraction of those name slots belong to you versus named competitors?
| Metric | Question it answers | Example |
|---|---|---|
| Mention rate | Are we visible at all? | Named on 10 of 30 prompts (33%) |
| Share of AI voice | Are we winning the recommendation layer? | 18% of all tracked name slots (yours plus competitors') |
| Platform blind spot | Where are we invisible? | 0% on Claude, 41% on Gemini |
| Accuracy rate | When named, are facts correct? | Wrong phone on 2 of 12 mentions |
A business can have moderate mention rate but low SOAV — you appear sometimes, but competitors dominate most answers. That pattern suggests signal gaps competitors have closed: review themes, listing completeness, or entity clarity.
A business can have high SOAV in a thin sample — you dominate three prompts nobody else tracks. That is why methodology matters.
What SOAV is not
Clarity prevents wasted budget.
SOAV is not organic keyword rank. Ranking #1 for "plumber Austin" does not guarantee Gemini names you when a user asks conversationally for same-day leak repair.
SOAV is not website traffic. Zero-click AI sessions resolve without Analytics firing. Mention tracking and call attribution close the gap. Read: Zero-click AI searches.
SOAV is not a single ChatGPT screenshot. Models vary by session, browsing mode, and version. One query is anecdote.
SOAV is not guaranteed by any vendor. OpenAI, Google, Anthropic, and Perplexity control their products. Ethical programs improve verifiable inputs and report trends — they do not sell placement. See how AI assistants choose businesses.
The measurement stack
Think in four layers: prompt design, platform sampling, scoring rules, and attribution.
Layer 1 — Prompt design
SOAV only measures what your prompts represent. Bad prompts produce bad KPIs.
Use buyer-intent language, not brand-leading questions:
- "Who should I hire for emergency roof repair in [city]?"
- "Best pediatric dentist accepting new patients near [neighborhood]"
- "Recommend a licensed electrician for panel upgrade in [metro]"
Avoid "Is [Your Business Name] reputable?" unless you are running a separate reputation audit. Customers hire category-first; your measurement should mirror that.
Parameterize geography the way real buyers speak — city, neighborhood, metro, "near me" phrasing where relevant.
Document prompt versions. When you change prompts, you change the metric. Version your library (v1.0 April baseline, v1.1 added financing intent).
Start with 5–10 prompts per location. Mature programs often track 20–30 across intent clusters: emergency, quality, price-sensitive, specialty.
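One way to keep a prompt library parameterized and versioned is a small frozen record. This is a minimal sketch, not a prescribed tool: `PromptLibrary`, `render`, and the "Hyde Park" neighborhood are illustrative assumptions, and the templates are the example prompts above with placeholder fields.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PromptLibrary:
    """A frozen, versioned set of buyer-intent prompt templates (hypothetical helper)."""
    version: str                 # e.g. "v1.0"
    note: str                    # what changed in this version
    templates: tuple[str, ...]   # geography is left as named placeholders

    def render(self, **places: str) -> list[str]:
        # Fill each template with the way buyers name the place;
        # placeholders a template does not use are simply ignored.
        return [t.format(**places) for t in self.templates]


LIBRARY_V1 = PromptLibrary(
    version="v1.0",
    note="April baseline",
    templates=(
        "Who should I hire for emergency roof repair in {city}?",
        "Best pediatric dentist accepting new patients near {neighborhood}",
        "Recommend a licensed electrician for panel upgrade in {metro}",
    ),
)

# Example values are assumptions for illustration only.
prompts = LIBRARY_V1.render(
    city="Austin", neighborhood="Hyde Park", metro="the Austin metro"
)
for p in prompts:
    print(LIBRARY_V1.version, "|", p)
```

Because the record is frozen, adding a financing prompt means creating a new `PromptLibrary` with version "v1.1" rather than silently editing v1.0, which keeps the metric history honest.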
Manual prompt craft: How to check what ChatGPT says.
Layer 2 — Platform sampling
SOAV without multi-platform sampling is share of one engine's voice: useful but incomplete.
Independent analyses of cross-engine citations show limited overlap between what different AI systems retrieve and cite: roughly 11% shared domains in some industry samples. ChatGPT momentum does not imply Perplexity coverage. Gemini may lean on Google's local graph while Claude grounds on different page sets.
Minimum credible panel for local SMBs:
- ChatGPT — largest conversational mindshare
- Gemini — Google ecosystem adjacency
- Claude — strong grounding behavior
- Perplexity — explicit citation URLs
- Grok — growing in tech-forward demos
- Google AI Overviews — SERP synthesis layer
Report mention rate and SOAV per platform, then a weighted composite if you want one headline number — document the weighting.
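If you do publish a single headline number, the weighting can live in code next to the result so it is documented by construction. The sketch below is a plain weighted average; the function name, the sample rates, and especially the weights are hypothetical assumptions, not recommended values.

```python
def weighted_composite(rates: dict, weights: dict) -> float:
    """Weighted average of per-platform rates; weights need not sum to 1."""
    missing = set(rates) - set(weights)
    if missing:
        raise ValueError(f"no weight documented for: {sorted(missing)}")
    total_weight = sum(weights[p] for p in rates)
    return sum(rates[p] * weights[p] for p in rates) / total_weight


# Hypothetical per-platform mention rates and weights, for illustration only.
rates = {"ChatGPT": 0.10, "Gemini": 0.40, "Claude": 0.00}
weights = {"ChatGPT": 3, "Gemini": 2, "Claude": 1}

composite = weighted_composite(rates, weights)
print(f"Composite mention rate: {composite:.1%}")
```

Raising an error for a platform without a weight is a deliberate choice here: an undocumented weight should stop the report rather than default quietly.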
Platform overlap deep dive: The 11% problem.
Automated six-platform baseline: Free AI visibility scan.
Layer 3 — Scoring rules
Define rules before you look at outputs. Post-hoc rules invite optimism bias.
Name slot counting. When a model lists three plumbers, each name is one slot. If you appear twice in one answer (rare), count once unless your methodology explicitly allows duplicates.
Competitor set. Track 3–5 direct local competitors plus your business. Revisit quarterly — new entrants and acquisitions change the set.
Non-mention answers. If the model refuses or gives generic advice without naming businesses, exclude from SOAV denominator or tag separately as "no recommendation." Do not treat non-mentions as competitor wins.
Partial names and chains. "Smith Family Dental on Main" counts if unambiguous. "A national chain" without local franchise attribution may need a human review flag.
Accuracy tagging. Parallel track: when named, were phone, hours, and services correct? Wrong facts inflate visibility while hurting conversions. Repair guide: AI reputation repair.
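Writing the scoring rules down as code before looking at outputs is one way to avoid post-hoc adjustments. The sketch below covers only slot counting, duplicate handling, and non-mention answers; `score_answer` and the business names are hypothetical, and it assumes a person or parser has already extracted the names from the answer. Partial names and chains still need the human review flag described above.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical competitor set: your business plus tracked competitors.
TRACKED = ["Acme Plumbing", "Rapid Rooter", "Blue Pipe Co"]


def score_answer(named: list[str], tracked: list[str]) -> Optional[dict[str, int]]:
    """Return one slot (0 or 1) per tracked business for a single answer.

    Returns None when the answer names no business at all, so the caller
    can tag it "no recommendation" instead of counting a competitor win.
    """
    if not named:
        return None
    # A set collapses repeated mentions, so a business counts once per answer.
    seen = {name.strip().lower() for name in named}
    return {biz: int(biz.lower() in seen) for biz in tracked}


print(score_answer(["Acme Plumbing", "acme plumbing ", "Rapid Rooter"], TRACKED))
print(score_answer([], TRACKED))
```

Matching here is exact after lowercasing and trimming, which is intentionally strict; anything fuzzier is a methodology decision that belongs in the written rules.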
Layer 4 — Attribution
SOAV measures recommendation share, not revenue. Connect the layers:
- Mention tracking — monthly SOAV rollups
- Call tracking / GBP calls — operational response
- First-party pixels — AI-referred sessions that do click
- CRM source fields — "heard about you from ChatGPT" at booking
You will undercount zero-click impact if you only watch Analytics. You will overcount if you attribute every branded search to AI. Honest programs triangulate.
Building your first baseline
Week one should produce a spreadsheet or dashboard row you can defend in a leadership meeting.
Step 1 — Select market and competitors. Same city and service radius you actually serve. Pick competitors customers name in sales calls, not only who outranks you on one keyword.
Step 2 — Freeze prompt library v1. Ten prompts, buyer-intent, no brand leading.
Step 3 — Run all prompts on all platforms. Fresh sessions where possible. Note browsing mode in ChatGPT if applicable.
Step 4 — Log results in a grid. Rows = prompts; columns = platforms; cells = named businesses + brief rationale if model explains why.
Step 5 — Calculate metrics.
- Mention rate = prompts where you appear ÷ total prompts (per platform and overall)
- SOAV = your name slots ÷ all tracked name slots (yours plus competitors') on prompts where at least one business was named
- Blind spots = platforms where mention rate = 0%
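The three calculations can be run straight from the results grid. This is a minimal sketch under stated assumptions: the grid is a dictionary keyed by (prompt, platform), each cell holds the set of tracked businesses named, and SOAV divides by all tracked name slots including yours, matching the 14 ÷ (14 + 52) arithmetic in the worked example later in this guide. Function and business names are hypothetical.

```python
def baseline_metrics(grid: dict, us: str) -> dict:
    """grid maps (prompt, platform) -> set of tracked businesses named in that cell."""
    platforms = sorted({platform for _, platform in grid})

    mention_rate = {}
    for platform in platforms:
        cells = [names for (_, p), names in grid.items() if p == platform]
        mention_rate[platform] = sum(us in names for names in cells) / len(cells)

    our_slots = sum(us in names for names in grid.values())
    # Empty cells add zero slots, so no-recommendation answers drop out of SOAV.
    all_slots = sum(len(names) for names in grid.values())

    return {
        "mention_rate_by_platform": mention_rate,
        "mention_rate_overall": our_slots / len(grid),
        "soav": our_slots / all_slots if all_slots else None,
        "blind_spots": [p for p, rate in mention_rate.items() if rate == 0],
    }


# Tiny illustrative grid: two prompts on two platforms.
grid = {
    ("p1", "Gemini"): {"Us", "Competitor A"},
    ("p1", "Claude"): {"Competitor A", "Competitor B"},
    ("p2", "Gemini"): {"Us"},
    ("p2", "Claude"): set(),  # model gave generic advice, named nobody
}
result = baseline_metrics(grid, us="Us")
print(result)
```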
Step 6 — Photograph the gaps, not only the wins. Invisibility on Claude with visibility on Gemini is a platform-specific remediation plan, not failure.
Expected time manually: 2–4 hours per location monthly. Structured scans reduce labor and standardize prompt sets.
Reporting cadence and governance
Monthly resampling fits most competitive local markets. Quarterly may suffice in low-dynamism categories if competitors are inactive — but model updates do not wait for your calendar.
Rolling three-month averages smooth noise from model refreshes. Show monthly dots and the rolling line.
Executive one-pager format:
- Composite mention rate (with platform breakdown)
- SOAV vs competitor set
- Biggest blind spot platform
- Top three signal fixes mapped to modules (reviews, listings, entity)
- Accuracy exceptions
- Explicit disclaimer: no placement guarantees
Version control. When prompts change, mark a break in the chart. Comparing April v1 prompts to June v2 prompts as one continuous series misleads.
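The rolling average and the version break can be handled together by restarting the rolling window whenever the prompt library version changes. The sketch below is one possible approach, not a standard; the function name and the monthly SOAV values are made up for illustration.

```python
def rolling_by_version(series, window=3):
    """series: chronological (month, prompt_version, soav) tuples.

    Returns (month, version, soav, rolling_avg) tuples. rolling_avg is None
    until `window` months exist under the same prompt version, so the line
    never averages across a version break.
    """
    out = []
    run = []              # values observed since the last version change
    prev_version = None
    for month, version, value in series:
        if version != prev_version:
            run = []      # mark the break: start a fresh window
            prev_version = version
        run.append(value)
        avg = sum(run[-window:]) / window if len(run) >= window else None
        out.append((month, version, value, avg))
    return out


# Hypothetical monthly SOAV values.
series = [
    ("Apr", "v1.0", 0.21),
    ("May", "v1.0", 0.24),
    ("Jun", "v1.0", 0.27),
    ("Jul", "v1.1", 0.30),  # prompt library changed: window restarts
]
for row in rolling_by_version(series):
    print(row)
```

Plot the third column as the monthly dots and the fourth as the rolling line; the `None` values leave a visible gap at each version change.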
Interpreting movement honestly
SOAV will move. Not all movement means your agency earned a bonus.
Signal-driven gains follow verifiable work: review velocity, NAP cleanup, GBP services expansion, schema deployment, Apple Business Connect claim. These are inputs you control.
Model-driven shifts happen when platforms change retrieval, training, or UI defaults. A competitor did nothing; you did nothing; numbers move.
Seasonality affects some categories — tax accountants in March, HVAC in July. Compare year-over-year when possible.
Prompt coverage gaps can inflate SOAV. If you only track prompts where you already win, SOAV becomes a vanity mirror.
When SOAV rises after review campaigns and listing fixes, confidence in causality is reasonable. When SOAV spikes one week with no operational changes, treat as noise until sustained.
SOAV by intent cluster
Aggregate SOAV hides strategic detail. Segment prompts into clusters:
| Intent cluster | Example prompt shape | Why segment |
|---|---|---|
| Emergency | "same-day," "now," "24-hour" | Review theme match matters |
| Quality / best | "best," "top-rated," "highly reviewed" | Volume and rating thresholds |
| Price | "affordable," "financing," "estimate" | Different review language |
| Specialty | "cosmetic," "commercial," "pediatric" | Service list alignment |
| Geographic | neighborhood-specific | areaServed and local pack |
You may dominate emergency SOAV while losing quality SOAV. Fixes differ — emergency needs recent reviews citing speed; quality needs volume and authoritative third-party mentions.
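Segmenting is a matter of tagging each scored answer with its cluster before counting. A minimal sketch, assuming each record is a (cluster, set of tracked names) pair produced by your scoring step; the function name, business names, and sample records are hypothetical.

```python
from collections import defaultdict


def soav_by_cluster(records, us):
    """records: (intent_cluster, set_of_tracked_names) per scored answer."""
    ours = defaultdict(int)
    total = defaultdict(int)
    for cluster, names in records:
        total[cluster] += len(names)        # every tracked name is one slot
        ours[cluster] += int(us in names)   # at most one slot per answer for us
    # Clusters where nobody was named have no SOAV to report.
    return {c: ours[c] / total[c] for c in total if total[c]}


# Illustrative records only.
records = [
    ("emergency", {"Us", "Competitor A"}),
    ("emergency", {"Us"}),
    ("quality", {"Competitor A", "Competitor B"}),
    ("quality", {"Us", "Competitor A", "Competitor B"}),
]
by_cluster = soav_by_cluster(records, us="Us")
for cluster, share in by_cluster.items():
    print(f"{cluster}: {share:.0%}")
```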
Competitive benchmarks without snake oil
Industry aggregate studies provide context, not targets. A dental practice in Phoenix is not a plumber in Cleveland.
Use aggregates directionally:
- Anonymized scan research (when published at sufficient sample size) shows how often businesses are invisible on at least one platform — see scan invisibility study methodology
- Competitive SOAV against your named set is the actionable benchmark
Avoid vendors selling "industry average SOAV 47%" without methodology footnotes.
Tooling choices
Manual spreadsheets — cheapest, highest labor, good for first baseline.
Structured scan products — standardize prompts and platforms; export CSV for leadership.
Enterprise AEO platforms — broader prompt libraries, team workflows, historical storage. Evaluate whether they report per-platform SOAV or collapse into one opaque score.
SEO suites adding "AI visibility" — verify they sample generative engines, not only Google SERP features. SOAV requires answer-engine sampling.
AIrecommend.ai maps scan gaps to eight Growth Engine modules — reviews, GBP, listings, entity profile, data studies, press, awards, Super Pixel attribution — without placement guarantees.
Connecting SOAV to budget decisions
Use SOAV to answer allocation questions:
If mention rate is 0% on two or more platforms: prioritize universal signals such as reviews, NAP, GBP completeness, and Apple Business Connect. Technical schema alone rarely breaks total invisibility.
If mention rate is healthy but SOAV lags: study competitor review themes and directory breadth. You are in the conversation but losing slots.
If SOAV is strong but calls are flat: look at accuracy repair, hours, staffing, or the conversion path, not more AI optimization.
If SEO is strong and SOAV is weak: you are under-investing in the LLM SEO measurement layer. Read AEO vs GEO vs SEO.
FAQ for finance and legal reviewers
CFOs and compliance-minded owners ask fair questions.
"Is this measurable like PPC?" Partially. Prompts are stable; platforms are not. Report ranges and trends, not contractual impression counts.
"Can we contract for 50% SOAV?" No ethical vendor should accept that. Contract for deliverables — review velocity, listing audits, schema deployment, monthly reporting — not AI outputs you do not control.
"Does SOAV replace SEO reporting?" No. Run both. Divergence between rank and mention rate is diagnostic gold.
Getting started this week
- List ten buyer-intent prompts for your primary location
- Run them across ChatGPT, Gemini, and one additional engine minimum
- Calculate mention rate and rough SOAV against three competitors
- Log platforms where you never appear
- Schedule monthly resample
- Optional: run the free six-platform scan for standardized baseline
Share of AI voice is not a mystical brand metric. It is structured counting on the recommendation layer where more local buying decisions start — measured honestly, improved through verifiable signals, and reported without promises no third-party AI platform can keep.
Worked example — plumbing market in one metro
Imagine a plumbing company tracking ten buyer-intent prompts monthly across six platforms — sixty cells per resample. On April's run, the business appears in 14 cells; competitors collectively appear 52 times in those cells where anyone was named.
Mention rate (overall) = 14 ÷ 60 ≈ 23% of platform-prompt combinations include you.
Share of AI voice = 14 ÷ (14 + 52) ≈ 21% of name slots when businesses were named.
Platform split reveals action: 40% mention rate on Gemini, 10% on ChatGPT, 0% on Claude (with ten prompts per platform, rates move in 10-point steps). Budget conversation shifts from "more blog posts" to "cross-platform listing + entity parity while maintaining review velocity."
After May listing cleanup and schema deployment, June shows 18 of 60 cells — movement without claiming causality from one variable. Leadership sees trend, not a placement promise.
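The arithmetic in this example is simple enough to check in a few lines, using only the figures stated above. Variable names are illustrative.

```python
cells = 10 * 6                      # ten prompts across six platforms

april_ours, april_competitors = 14, 52
june_ours = 18

april_mention = april_ours / cells
april_soav = april_ours / (april_ours + april_competitors)
june_mention = june_ours / cells
change_points = (june_mention - april_mention) * 100

print(f"April mention rate: {april_mention:.1%}")   # 23.3%
print(f"April SOAV:         {april_soav:.1%}")      # 21.2%
print(f"June mention rate:  {june_mention:.1%}")    # 30.0%
print(f"Change:             {change_points:+.1f} points")  # +6.7 points
```

June SOAV is not computed because the example does not state June competitor slots; a mention-rate change alone does not say whether share moved.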
This example mirrors how AIrecommend.ai clients report internally — composite metrics plus platform drill-down.
Data governance and client confidentiality
If agencies resample for multiple clients in the same category, isolate prompt libraries and competitor sets per account. Shared spreadsheets invite cross-contamination of competitor names and optimistic benchmarking.
Store raw model outputs (screenshots or JSON exports) for 90 days minimum so month-over-month disputes have evidence. Redact end-user PII if prompts were customized.
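A retention rule is easier to follow when every stored output carries its own earliest deletion date. The sketch below is one hypothetical record shape, not a required schema: the field names, `make_record`, and `may_delete` are assumptions, and the 90 days is the minimum stated above.

```python
import json
from datetime import date, timedelta

RETENTION_DAYS = 90  # minimum from the guidance above; longer is fine


def make_record(run_date, client, platform, prompt_version, prompt, raw_output):
    """Bundle one raw model output with the metadata needed to audit it later."""
    return {
        "run_date": run_date.isoformat(),
        "client": client,                  # keeps accounts isolated
        "platform": platform,
        "prompt_version": prompt_version,
        "prompt": prompt,
        "raw_output": raw_output,          # redact end-user PII before storing
        "keep_until": (run_date + timedelta(days=RETENTION_DAYS)).isoformat(),
    }


def may_delete(record, today):
    """True only once the minimum retention window has fully passed."""
    return today > date.fromisoformat(record["keep_until"])


record = make_record(
    date(2025, 4, 1), "client-a", "Gemini", "v1.0",
    "Who should I hire for emergency roof repair in Austin?",
    "(raw answer text would go here)",
)
print(json.dumps(record, indent=2))
```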
When publishing case studies externally, report ranges and methodologies — not cherry-picked prompts where one client dominates.
Further reading: GEO services · AI visibility tracking · How AI assistants choose businesses.