How to Analyze Competitor AI Visibility — Benchmarking Guide
Competitor AI visibility analysis starts with a fixed buyer-intent prompt library run across six platforms, logging which businesses get named, cited themes, and factual accuracy — then computing share of AI voice versus your baseline. No tool guarantees complete visibility into proprietary models, but structured rescans reveal who wins recommendations when buyers ask who to hire.
Why competitor AI benchmarks matter now
For years, competitive intel meant SEMrush keyword gaps and Local Falcon grid ranks. Useful — but incomplete when the buyer never reaches Google.
A prospect tells you: "I asked ChatGPT for a recommendation and went with the first name." You check Search Console — healthy traffic. You check Local Pack — position two. You still lost.
Competitor AI visibility analysis answers a different question: When buyers ask AI who to hire, who gets named — and why?
This guide walks through a reproducible benchmarking process — prompt design, multi-platform sampling, scoring, and action mapping. We built AIrecommend.ai's scan and monthly rescans around this workflow; you can start manually before automating.
Honest ceiling: you observe outputs, not proprietary ranking code. You infer signal gaps, not guaranteed replication. Anyone selling "competitor AI hacking" is not credible.
Related: how to check what ChatGPT says · how AI assistants choose businesses.
What you are measuring
Define metrics before collecting data:
Mention rate (yours and theirs)
Percentage of prompts in your library where a business is named on a given platform.
Example: 10 prompts, ChatGPT names you in 3 → 30% mention rate for you on ChatGPT.
Track each competitor the same way.
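As a minimal sketch of the calculation above (the function name and the list-of-booleans input are assumptions for illustration, not part of any tool), mention rate for one business on one platform can be computed like this:

```python
def mention_rate(named_flags):
    """Fraction of prompts in which the business was named.

    named_flags: one boolean per prompt in the library, for a single platform.
    """
    if not named_flags:
        raise ValueError("prompt library is empty")
    return sum(named_flags) / len(named_flags)


# The example from the text: 10 prompts, ChatGPT names you in 3 of them.
chatgpt_flags = [True, False, False, True, False, False, True, False, False, False]
print(f"{mention_rate(chatgpt_flags):.0%}")  # 30%
```

Run the same function over each competitor's flags so every business is scored against the identical prompt set.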
Share of AI voice (SOAV)
Your mentions ÷ (your mentions + sum of competitor mentions) in the prompt set — often shown as a percentage.
Example: Across 10 prompts on Gemini, you appear 4 times, Competitor A appears 6 times, Competitor B appears 2 times → total 12 mentions → your SOAV = 4/12 = 33%.
SOAV is the primary competitive KPI for AEO/GEO — parallel to share of voice in brand monitoring.
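A short sketch of the SOAV formula, using the Gemini example above. The function name and dictionary layout are hypothetical; only the arithmetic comes from the definition.

```python
def share_of_ai_voice(mentions, business):
    """mentions: dict of business name -> mention count on one platform or overall."""
    total = sum(mentions.values())
    if total == 0:
        return 0.0  # nobody was named, so there is no share to split
    return mentions[business] / total


# Gemini example from the text: 4 + 6 + 2 = 12 total mentions.
gemini_mentions = {"You": 4, "Competitor A": 6, "Competitor B": 2}
for name in gemini_mentions:
    print(name, f"{share_of_ai_voice(gemini_mentions, name):.0%}")
# You 33%
# Competitor A 50%
# Competitor B 17%
```

Because the denominator counts mentions rather than prompts, the shares across all tracked businesses always sum to 100%.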
Platform coverage
Number of the six major platforms on which you are named at least once across the prompt library, compared with the same count for each competitor.
A competitor with 6/6 platform presence beats you at 2/6 even if your SOAV wins on one engine — see 11% platform overlap.
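Platform coverage is a simple count. The sketch below assumes you already have per-platform mention counts; the counts shown are hypothetical and only illustrate the 2/6 versus 6/6 comparison.

```python
PLATFORMS = ["ChatGPT", "Gemini", "Claude", "Perplexity", "Grok", "AI Overviews"]


def platform_coverage(mentions_by_platform):
    """Count platforms where the business is named at least once in the library."""
    return sum(1 for p in PLATFORMS if mentions_by_platform.get(p, 0) > 0)


# Hypothetical counts: you are strong on one engine but absent from four.
you = {"ChatGPT": 5, "Perplexity": 1}
rival = {p: 2 for p in PLATFORMS}

print(f"You: {platform_coverage(you)}/6")      # You: 2/6
print(f"Rival: {platform_coverage(rival)}/6")  # Rival: 6/6
```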
Citation and theme logging
When Perplexity or browsing ChatGPT cites URLs, log domains and review themes quoted ("same-day," "financing," "board-certified").
Themes explain why models prefer a competitor — not just that they do.
Accuracy flags
Note when AI states wrong facts about you or competitors — hours, phone, services. Accuracy repair targets listing conflicts — guide.
Step 1 — Build your prompt library
Competitive benchmarks fail when prompts are biased or too narrow.
Rules for good prompts
- Buyer-intent, not brand-led — "Best emergency plumber in Austin" not "Is Joe's Plumbing good?"
- Geography explicit — city, metro, or neighborhood matching your service area
- Intent variants — emergency, quality, price-sensitive, specialty (pediatric, commercial, cosmetic)
- Fixed library — same prompts each rescan for trend validity
- 10–20 prompts per primary location — enough signal, manageable labor
Prompt templates by vertical
Home services:
- "Who should I call for same-day AC repair in [city]?"
- "Recommend a licensed electrician for panel upgrades in [neighborhood]"
- "Best rated plumber for tankless water heater install near [city]"
Healthcare / dental:
- "Dentist for anxious adults in [city] — sedation options"
- "Pediatric dentist accepting new patients in [suburb]"
- "Orthodontist with payment plans in [metro]"
Legal:
- "Estate planning attorney in [county] for moderate estates"
- "Personal injury lawyer with trial experience in [city]"
Hospitality:
- "Best date night Italian restaurant in [neighborhood]"
- "Family-friendly brunch spots in [city] with outdoor seating"
Avoid leading the model toward your brand unless running a dedicated reputation repair track.
Document metadata per prompt
| Field | Example |
|---|---|
| Prompt ID | P-07 |
| Text | "Who should I call for…" |
| Location | Nashville TN |
| Intent tag | emergency / quality / price |
| Date sampled | 2026-06-12 |
| Platform | ChatGPT |
| Browse on/off | on |
Consistency enables month-over-month charts.
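One way to keep that metadata consistent is a fixed record type written to CSV. This is a sketch only: the `SampleRecord` class and its field names are assumptions that mirror the table above, not an AIrecommend.ai schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class SampleRecord:
    prompt_id: str
    text: str
    location: str
    intent_tag: str
    date_sampled: str  # ISO date, e.g. 2026-06-12
    platform: str
    browse_on: bool


def to_csv(records):
    """Serialize records with a stable column order so monthly files line up."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(SampleRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()


record = SampleRecord(
    prompt_id="P-07",
    text="Who should I call for…",  # prompt text is elided in the source table
    location="Nashville TN",
    intent_tag="emergency",
    date_sampled="2026-06-12",
    platform="ChatGPT",
    browse_on=True,
)
print(to_csv([record]))
```

A frozen dataclass makes it harder to edit a logged sample by accident, which helps when the same library is compared month over month.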
Step 2 — Choose platforms
AIrecommend.ai samples six platforms:
- ChatGPT
- Gemini
- Claude
- Perplexity
- Grok
- Google AI Overviews (where available for query class)
Add voice spot-checks separately — voice vs AI chat guide.
Do not benchmark one platform and extrapolate. Overlap of cited domains across engines is often ~11% in industry analyses — competitors win differently per surface.
Step 3 — Run samples reproducibly
Manual method (free, labor-intensive):
- Fresh session per prompt where possible — reduces chat memory bias
- Record full answer text — screenshot plus paste into log
- Extract every named business — include order if ranked
- Note cited URLs on Perplexity / browsing modes
- Record model/version and date if visible
- Test ChatGPT with and without browsing if tier allows — document mode
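Once the full answer text is in the log, extracting tracked names in order of appearance can be partly automated. The sketch below uses plain case-insensitive matching against a list you maintain; the business names and answer text are made up for illustration. It will not find businesses you do not already track, so a manual read is still needed to catch new names.

```python
import re


def extract_named(answer_text, tracked_names):
    """Return tracked businesses in the order they first appear in the answer."""
    hits = []
    for name in tracked_names:
        match = re.search(re.escape(name), answer_text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), name))
    return [name for _, name in sorted(hits)]


# Hypothetical answer text and tracked list.
answer = "For same-day AC repair, try Acme Cooling first, then Joe's Plumbing."
tracked = ["Joe's Plumbing", "Acme Cooling", "Zenith HVAC"]

print(extract_named(answer, tracked))  # ['Acme Cooling', "Joe's Plumbing"]
```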
Automated method:
- Free six-platform scan with competitor comparison tables
- Monthly rescans on Growth / Dominance programs
- Export CSV for board reporting
Sampling frequency
| Market intensity | Rescan cadence |
|---|---|
| Hyper-competitive metro | Monthly |
| Standard service area | Monthly |
| Rural / low churn | Quarterly |
Model updates can shift mention rates within weeks — AI visibility tracking.
Step 4 — Build the competitor mention matrix
Spreadsheet structure:
Rows: Prompt IDs
Columns: Platform × {You, Comp A, Comp B, Comp C, …}
Cells: Y/N named, position if listed, themes noted
Pivot to:
| Business | ChatGPT | Gemini | Claude | Perplexity | Grok | AIO | Total mentions |
|---|---|---|---|---|---|---|---|
| You | 3/10 | 2/10 | 1/10 | 4/10 | 0/10 | 2/10 | 12/60 |
| Comp A | 7/10 | 6/10 | 5/10 | 3/10 | 4/10 | 5/10 | 30/60 |
| Comp B | 2/10 | 1/10 | 3/10 | 5/10 | 1/10 | 1/10 | 13/60 |
Compute mention rate per platform and SOAV overall.
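The pivot above can be reproduced with a few lines of code. This sketch hard-codes the counts from the table; the `summarize` function and its output layout are illustrative assumptions. Note that the total-mentions column is out of 60 prompt runs, while SOAV divides by the 55 mentions actually recorded across the three businesses.

```python
PLATFORMS = ["ChatGPT", "Gemini", "Claude", "Perplexity", "Grok", "AIO"]
PROMPTS_PER_PLATFORM = 10

# Mention counts per platform, copied from the table above.
matrix = {
    "You":    [3, 2, 1, 4, 0, 2],
    "Comp A": [7, 6, 5, 3, 4, 5],
    "Comp B": [2, 1, 3, 5, 1, 1],
}


def summarize(matrix):
    """Per-platform mention rates, total mentions, and overall SOAV per business."""
    totals = {name: sum(counts) for name, counts in matrix.items()}
    all_mentions = sum(totals.values())
    summary = {}
    for name, counts in matrix.items():
        summary[name] = {
            "rates": {p: c / PROMPTS_PER_PLATFORM for p, c in zip(PLATFORMS, counts)},
            "total": totals[name],
            "soav": totals[name] / all_mentions if all_mentions else 0.0,
        }
    return summary


for name, row in summarize(matrix).items():
    print(f"{name}: {row['total']} mentions, SOAV {row['soav']:.1%}")
# You: 12 mentions, SOAV 21.8%
# Comp A: 30 mentions, SOAV 54.5%
# Comp B: 13 mentions, SOAV 23.6%
```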
Identify patterns
- Platform specialist competitor — dominates Perplexity but weak on Gemini → citable content vs Google ecosystem
- Review brute-force competitor — named everywhere with "400+ reviews" themes → review velocity gap
- Niche winner — beats you only on specialty prompts → service page + FAQ gap
- Ghost competitor — you lose to a name you do not track in traditional SEO → new AI-era rival
Step 5 — Diagnose why competitors win
You cannot see competitor dashboards. You infer from public signals and AI output themes.
Signal audit checklist (per top competitor)
| Signal | Where to look | AI theme linkage |
|---|---|---|
| Google reviews | Count, rating, recency, text themes | Quoted praise in answers |
| GBP completeness | Categories, services, posts, Q&A | Gemini / AIO strength |
| Apple BC | Claimed? photos? attributes? | Siri / Apple Maps adjacency |
| Directory footprint | Yelp, Angi, vertical sites | Citation diversity |
| Site entity | schema, llms.txt, FAQ | Perplexity / Claude retrieval |
| Citable studies | Local data, press | "According to…" citations |
| Brand fame | Regional press, longevity | Training data bias |
Run the same checklist on yourself — gap analysis, not gossip.
Technical self-audit: llms.txt and schema checklist.
Reviews: Google reviews the right way.
Theme extraction example
ChatGPT names Competitor A: "Highly rated for same-day service and transparent pricing with 200+ Google reviews."
Your gap map:
- Review count deficit → review program
- Missing pricing transparency on site/FAQ → FAQ schema + service page
- Same-day not in GBP services → listing update
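The gap map above can be roughed out with keyword rules. This is a deliberately naive sketch: the `THEME_RULES` keywords and the substring matching are assumptions, and a theme in a competitor's answer is only a gap if your own audit confirms you lack that signal.

```python
# Hypothetical keyword -> candidate action rules, based on the example above.
THEME_RULES = {
    "same-day": "listing update (add same-day to GBP services)",
    "transparent pricing": "FAQ schema + service page",
    "reviews": "review program",
}


def gap_map(answer_text, rules=THEME_RULES):
    """Return candidate actions for every theme keyword found in the answer."""
    text = answer_text.lower()
    return [action for theme, action in rules.items() if theme in text]


answer = (
    "Highly rated for same-day service and transparent pricing "
    "with 200+ Google reviews."
)
for action in gap_map(answer):
    print("-", action)
```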
Step 6 — Translate gaps to action
Map findings to Growth Engine modules (AIrecommend.ai framework):
| Gap | Module | Example action |
|---|---|---|
| Low mention rate, weak reviews | Reviews | Ethical solicitation, themed responses |
| NAP conflicts, thin directories | Listings | Apple BC, Bing, vertical sync |
| Wrong AI facts | Accuracy repair | Source trace, listing fixes |
| Thin web grounding | Entity profile | JSON-LD, llms.txt, FAQ |
| Perplexity underperformance | Citable studies | Local data report (Dominance) |
| GBP stale vs competitor | GBP autopilot | Posts, Q&A, photos |
Client approval on outbound work remains non-negotiable — no black-hat listing edits.
Programs: Growth $4,997/mo · Dominance $9,999/mo — pricing.
Step 7 — Report to stakeholders
Owners and PE operators need clarity, not screenshot dumps.
One-page monthly summary
- SOAV trend — you vs top 3 competitors, overall and per platform
- Platform heatmap — where you gained/lost mentions
- Top prompt losses — 3 prompts where competitors replaced you
- Theme summary — what AI praises about winners
- Actions taken / planned — tied to signal gaps
- Attributed calls — Super Pixel where available
Honest disclaimers in reports
- Sample prompts ≠ all buyer behavior
- AI outputs vary run-to-run
- No placement guarantees
- Overlap across platforms is low — celebrate platform-specific wins carefully
Advanced techniques
Prompt expansion from sales calls
Mine CRM and call recordings (compliantly) for questions prospects ask — add to library quarterly.
Citation overlap analysis
For Perplexity-heavy categories, log domains cited across competitors. Compute overlap between platforms for your market — validate or refute the ~11% shorthand for your geography.
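A sketch of one way to measure that overlap for your own market. It assumes Jaccard similarity (shared domains divided by all distinct domains) as the overlap definition; the source does not say how the ~11% figure was computed, so treat the metric choice as an assumption. The URLs are placeholders.

```python
from itertools import combinations
from urllib.parse import urlparse


def domain(url):
    """Lowercased host with a leading 'www.' removed."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def platform_overlap(cited_urls):
    """cited_urls: dict of platform -> list of cited URLs from your logs."""
    domains = {p: {domain(u) for u in urls} for p, urls in cited_urls.items()}
    return {
        (p, q): jaccard(domains[p], domains[q])
        for p, q in combinations(sorted(domains), 2)
    }


# Placeholder URLs for illustration only.
cited = {
    "Perplexity": [
        "https://www.example.com/a",
        "https://reviews.example.org/x",
        "https://news.example.net/y",
    ],
    "ChatGPT": ["https://example.com/b", "https://directory.example.io/z"],
}
print(platform_overlap(cited))  # {('ChatGPT', 'Perplexity'): 0.25}
```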
Accuracy competitive intel
If competitors appear with false superlatives, do not chase spam; increase the density of accurate, verifiable signals about your own business instead. If AI understates your credentials, strengthen FAQ and schema with verifiable licenses.
Multi-location rollups
Franchise benchmarks: SOAV per DMA, per brand vs local independents, prompt libraries localized per store radius.
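A rollup of SOAV per DMA might look like the sketch below. The row layout, brand name, and counts are hypothetical; in practice the rows would come from each location's mention matrix.

```python
from collections import defaultdict


def soav_by_dma(rows, brand):
    """rows: iterable of (dma, business, mentions) tuples from localized libraries."""
    brand_mentions = defaultdict(int)
    all_mentions = defaultdict(int)
    for dma, business, mentions in rows:
        all_mentions[dma] += mentions
        if business == brand:
            brand_mentions[dma] += mentions
    return {d: brand_mentions[d] / t for d, t in all_mentions.items() if t > 0}


# Hypothetical counts for one franchise brand vs a local independent.
rows = [
    ("Nashville", "BrandCo", 8),
    ("Nashville", "Local Indie", 12),
    ("Austin", "BrandCo", 9),
    ("Austin", "Local Indie", 3),
]
for dma, share in soav_by_dma(rows, "BrandCo").items():
    print(f"{dma}: {share:.0%}")
# Nashville: 40%
# Austin: 75%
```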
Zero-click attribution
Pair mention gains with call tracking — zero-click AI searches. SOAV up but calls flat → conversion or accuracy issue, not visibility alone.
Tools — build vs buy
Manual + spreadsheets: $0 cash, high labor, error-prone without discipline.
Scan subscriptions / agencies: Trade cash for consistency — essential when monitoring six platforms monthly.
Enterprise GEO monitors: Strong observation (Profound-class); local businesses often still need execution on reviews/listings, not dashboards alone.
Evaluation criteria:
- Per-platform mention rates, not blended vanity scores
- Competitor names in standard reports
- Transparent prompt libraries
- Monthly rescan cadence
- Maps gaps to actionable signal classes
Ethical and legal boundaries
Do not:
- Impersonate competitors in prompts to defame
- Post fake reviews to harm rivals
- Scrape private APIs violating ToS
- Misrepresent guaranteed outcomes to clients using competitor data
Do:
- Observe public AI outputs
- Improve your verifiable signals
- Document trends honestly
Common benchmarking mistakes
Mistake: Brand-only prompts. Inflates your mention rate vs reality.
Mistake: Single date snapshot. One Tuesday after a competitor's review surge skews results.
Mistake: Ignoring browsing mode. ChatGPT answers differ materially with browsing on versus off, so record the mode for every sample.
Mistake: One-platform strategy. Win ChatGPT, lose Gemini: a net loss in Google-heavy markets.
Mistake: Chasing competitor spam. Fake review farms may briefly help rivals; platforms and models shift — build durable signals.
Mistake: No action loop. Benchmarking without listing/review/entity fixes is entertainment.
Case pattern — illustrative
A hypothetical composite of patterns common in scans:
You: Strong Local Pack, 85 Google reviews, 4.7 stars, thin FAQ, unclaimed Apple BC.
Competitor A: 220 reviews, active GBP posts, named on Gemini 7/10 prompts.
Competitor B: Local study on "water quality by zip code," Perplexity cites domain 5/10 prompts.
Benchmark result: Your SOAV 22% vs A 48% vs B 30% on combined six-platform library.
Action sequence: Claim Apple BC → review velocity program → FAQ schema on emergency services → Dominance-tier local data study for Perplexity → monthly rescan.
90-day realistic outcome: SOAV climb to low 30s; Perplexity mentions appear; Gemini gap narrows but may not flip until review count converges — no guarantees, typical pattern.
Integrating with SEO competitive analysis
SEO keyword gap reports still matter for site traffic. Merge views:
| SEO metric | AI metric |
|---|---|
| Keyword rank | Mention rate |
| Share of SERP features | Share of AI voice |
| Backlink gap | Citation domain gap |
| Content gap | FAQ / citable study gap |
Diagnosing divergence:
- Rank well, low mentions → entity/listing/review gap for AI layer
- High mentions, low traffic → zero-click winning — optimize for calls, not sessions
- Competitor weak in both → emerging rival or data noise — rescan
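The first two divergence rules, which concern your own business, can be written as a small decision function. This is a sketch: the boolean inputs are assumptions, and what counts as "ranks well" or "high mentions" is a threshold you would set for your own market.

```python
def diagnose(ranks_well: bool, high_mentions: bool, low_traffic: bool) -> str:
    """Map an SEO/AI divergence to the likely cause named in the list above."""
    if ranks_well and not high_mentions:
        return "entity/listing/review gap for AI layer"
    if high_mentions and low_traffic:
        return "zero-click winning: optimize for calls, not sessions"
    return "no divergence flagged: keep rescanning"


print(diagnose(ranks_well=True, high_mentions=False, low_traffic=False))
print(diagnose(ranks_well=True, high_mentions=True, low_traffic=True))
```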
Framework: AEO vs GEO vs SEO.
When competitor analysis says "you're winning"
Do not stop rescans. Competitors hire agencies, run review campaigns, publish studies. SOAV erodes without maintenance.
Shift focus to:
- Accuracy — wrong facts erode trust
- Attribution — prove AI-sourced revenue
- Prompt expansion — new intents as services grow
- Secondary locations — roll benchmark playbooks
Bottom line
Competitor AI visibility analysis is structured observation — fixed prompts, six platforms, mention matrices, theme inference, monthly rescans. It replaces guesswork about why ChatGPT prefers someone else with a signal gap list you can execute against.
Start manual if you must; automate before competitor SOAV becomes a moat you cannot see.
Free scan with competitor comparison · AI visibility tracking · Budget tiers for 2026.
Frequently asked questions
How do I find out which competitors AI assistants recommend?
Build 10–20 buyer-intent prompts for your category and city, run them on ChatGPT, Gemini, Claude, Perplexity, Grok, and AI Overviews, and record every named business per prompt. Repeat monthly on the same library for trend data.
What is share of AI voice?
Your mentions divided by total mentions of all businesses (yours plus competitors) in a fixed prompt set — expressed as a percentage. It shows competitive share in AI answers, not search rank.
Can I spy on competitor AI ad spend or hacks?
No ethical program reveals private tactics. You observe outputs — who is named and with what themes — and infer signal gaps (reviews, listings, citable content) you can fix on your side.
Why do competitors win on ChatGPT but not Gemini?
Different source mixes and low cross-platform overlap (~11% in industry samples). Benchmark per platform; do not assume one winner generalizes.
How often should I rerun competitor analysis?
Monthly for active markets; quarterly minimum for stable categories. Model updates and competitor review velocity move mention rates faster than classic SEO rankings.