What We Built
We Asked ChatGPT, Claude, Gemini, and Grok to Recommend Our Clients' Products. Most Got Zero Mentions.
Published February 2026 · 10 min read
TL;DR
1. We built a free tool that shows whether AI shopping agents recommend your products or your competitors'.
2. It generates real customer queries for your category, then runs them across ChatGPT, Claude, Gemini, and Grok. You see exactly who gets mentioned.
3. Most mid-market merchants score zero. Not low. Zero. Every AI shopping query in their category goes to someone else.
4. Enter a URL, get results in 90 seconds. No signup. No credit card.
We ran our first audit on a $30M auto parts merchant. Eight queries. Four AI platforms. Thirty-two chances for their brand to show up in an AI shopping recommendation.
They got zero mentions. Their Shopify-based competitor got 26 out of 32.
That's what led us to build this tool. We suspected merchants on legacy platforms — Magento, Miva, NetSuite — were invisible to AI shopping agents. Not underperforming. Invisible. So we built a way to prove it: take any store URL, figure out what they sell, generate the questions their customers actually ask AI, and run those questions across ChatGPT, Claude, Gemini, and Grok.
The engineering was the easy part. Watching the results come back was harder.
Phase 1: We browse your store so you don't have to describe it
How do you audit a store you've never seen before? We can't pre-load every merchant's catalog. And asking users to fill out a 20-field form describing their business would kill conversion on a free tool.
So we give Perplexity's Sonar model a single instruction: browse this URL and tell us what they sell.
Perplexity has live web access. It actually visits the site, reads the product pages, pulls out categories, price ranges, sample product names. From that, it identifies the vertical (auto parts, supplements, marine, whatever), the target customer, and the price positioning.
Then comes the part that makes the audit useful: query generation.
The model generates eight purchase-intent queries — the kind of questions a real customer would type into ChatGPT when they're ready to buy. We enforce a specific distribution:
- 3 discovery queries — "best brake pads for 2019 Civic"
- 2 specific product searches — "where can I get a cold air intake for a Mustang GT"
- 2 buying scenarios — "I need a turbocharger upgrade for my WRX STI, what's the best option"
- 1 category landscape — "best online stores for performance auto parts"
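A sketch of how that distribution could be enforced in code. The query types and counts come from the list above; the data shape, function name, and field names are illustrative, not our production code:

```python
from collections import Counter

# Required mix of purchase-intent query types (counts from the audit spec above)
REQUIRED_MIX = {
    "discovery": 3,
    "specific_product": 2,
    "buying_scenario": 2,
    "category_landscape": 1,
}

def validate_query_mix(queries, brand_name):
    """Check that generated queries match the enforced distribution
    and never leak the merchant's brand name (organic recommendations only)."""
    counts = Counter(q["type"] for q in queries)
    if dict(counts) != REQUIRED_MIX:
        return False
    return not any(brand_name.lower() in q["text"].lower() for q in queries)
```

If the model drifts from the required mix or slips the brand name into a query, the batch is rejected and regenerated rather than patched by hand.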
One rule we enforce strictly: the merchant's brand name never appears in any query. We're testing organic recommendations. Asking "does ChatGPT know about [your brand]" proves nothing; what matters is whether ChatGPT recommends your brand when someone asks about turbocharger kits without naming any brand.
Perplexity also identifies 3-5 competitors — retailers and distributors in the same space, not manufacturers. This gives us the comparison baseline.
What the intelligence phase extracts
// Real output from a Perplexity intelligence run
{
  "brandName": "Acme Auto Parts",
  "vertical": "Automotive Parts & Accessories",
  "targetCustomer": "Performance vehicle enthusiasts",
  "priceRange": "$45 — $4,200",
  "categories": ["exhaust", "tuners", "turbos", "intake"],
  "competitors": ["Competitor A", "Competitor B", "Competitor C"],
  "queries": [
    "best exhaust system for 6.7 Cummins",
    "cold air intake for 2022 Ford F-250 diesel",
    "I need a tuner for my Duramax, what do you recommend",
    // ...5 more
  ]
}
Phase 2: Same question, four AI brains, see who they recommend
Each query hits four AI platforms at once: OpenAI's GPT-4o-mini, Anthropic's Claude Haiku, Google's Gemini Flash, and xAI's Grok.
Every platform gets the same system prompt: a shopping advisor persona instructed to recommend specific brands and retailers by name, ranked by confidence. Same query, same instructions, four different AI brains. We want to see if the answers converge or diverge.
When the responses come back, we parse them for brand mentions. For the merchant and each competitor, we track:
- Was the brand mentioned at all?
- What position was it mentioned in? (1st recommendation vs. 4th)
- What did the AI actually say about the brand? (We capture 50 characters of context around each mention)
- The full AI response, so you can read exactly what each platform told the "customer"
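The mention scan itself can be sketched like this. The function name and the line-based position heuristic are illustrative simplifications; the 50-character context window matches the figure above:

```python
import re

def find_mention(response_text, brand, context_chars=50):
    """Scan one AI response for a brand mention. Returns the mention's
    position (here approximated by which line it falls on, since the
    advisor persona returns ranked, line-per-item lists) plus surrounding
    context, or None if the brand never appears."""
    match = re.search(re.escape(brand), response_text, re.IGNORECASE)
    if not match:
        return None
    start = max(0, match.start() - context_chars)
    end = min(len(response_text), match.end() + context_chars)
    position = response_text[:match.start()].count("\n") + 1
    return {"position": position, "context": response_text[start:end]}
```

A case-insensitive search matters more than it sounds: the platforms are inconsistent about capitalizing brand names, and a strict match would undercount mentions.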
Eight queries times four platforms equals 32 "slots." Your Share of Voice is the percentage of those slots where your brand appeared.
Example: Share of Voice breakdown
- ChatGPT: 75% (6 of 8 queries)
- Claude: 50% (4 of 8 queries)
- Gemini: 13% (1 of 8 queries)
- Grok: 0% (0 of 8 queries)
Overall SOV: 34% — top competitor at 81%
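The arithmetic behind that overall number is a straight slot count. A minimal sketch (the figures match the example above; the function is illustrative):

```python
def share_of_voice(mentions_per_platform, queries_per_platform=8):
    """Overall SOV = slots where the brand appeared / total slots."""
    total_slots = len(mentions_per_platform) * queries_per_platform
    mentioned = sum(mentions_per_platform.values())
    return round(100 * mentioned / total_slots)

example = {"ChatGPT": 6, "Claude": 4, "Gemini": 1, "Grok": 0}
# share_of_voice(example) -> 34 (11 mentioned slots out of 32)
```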
We also run the same analysis for every competitor Perplexity identified. So you don't just see your own score — you see who's eating your lunch and on which platforms.
Phase 3: The report that sells itself
Everything compiles into a single report page. The centerpiece is a 0-100 visibility score displayed on a circular gauge — green if you're in good shape, red if you're not.
Below that, the report breaks down into sections:
Narrative findings
Not a data dump. Actual sentences explaining what we found, flagged by severity. A critical finding might read: "Your brand wasn't mentioned in any of the 32 AI responses we tested. All purchase-intent queries in your category are being directed to competitors." A warning might say: "You're visible on ChatGPT (63%) but nearly invisible on Gemini (13%). That's 750 million Gemini users who can't find you."
Per-query, per-platform results
Every query is expandable. You can read the full response each AI platform gave. You can see exactly where your brand was mentioned (or wasn't), what position it appeared in, and who the AI recommended instead. This is the part merchants spend the most time on — reading what ChatGPT actually says when a customer asks about their products.
Competitive intelligence
A ranked list of who AI recommends instead of you. Each competitor shows their SOV percentage, which platforms they're strongest on, and how often they appear across all queries. When a merchant sees their top competitor at 81% while they're at 0%, the conversation about fixing it starts immediately.
Revenue at risk
A conservative model that estimates how much revenue is going to competitors through AI channels. For a $20M/year merchant with a zero SOV score, the estimate comes back at $100K-$400K annually. We show the math: estimated revenue times a 1% AI commerce capture rate times the visibility gap percentage. Nothing inflated. Just arithmetic.
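The model is literally that one multiplication. A worked sketch, using the 1% capture rate stated above (the quoted $100K-$400K band comes from varying that rate, which is our assumption, not a figure from the report itself):

```python
def revenue_at_risk(annual_revenue, sov_pct, capture_rate=0.01):
    """Revenue at risk = revenue x AI commerce capture rate x visibility gap.
    capture_rate of 1% is the conservative default from the report."""
    gap = (100 - sov_pct) / 100  # 0% SOV -> full gap of 1.0
    return annual_revenue * capture_rate * gap

# $20M merchant at 0% SOV: 20_000_000 * 0.01 * 1.0 = $200,000/year
```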
Industry benchmarks
We compare your score against category averages. Auto parts merchants average about 35% SOV, with top performers around 72%. Supplements average 28%. Marine and boating sits at 22%. If you're below the average, you know exactly how far behind you are.
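As a sketch, the comparison is a lookup against the averages quoted above (the table values are from this post; the structure is illustrative):

```python
# Category-average SOV figures quoted in the post
BENCHMARKS = {"auto parts": 35, "supplements": 28, "marine": 22}

def benchmark_gap(vertical, your_sov):
    """Points above (positive) or below (negative) the category average."""
    return your_sov - BENCHMARKS[vertical]
```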
How we keep it under 90 seconds
Four AI platforms, eight queries each. That's 32 API calls. If we ran them one at a time, you'd be waiting 10 minutes.
Instead, we query all four platforms in parallel for each question. ChatGPT, Claude, Gemini, and Grok all get the same query at the same time. When the responses come back (each in a different format, because every AI company has its own ideas about API design), we normalize them into a consistent structure and scan for brand mentions.
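A minimal sketch of that fan-out using Python's asyncio. The per-platform client functions are hypothetical stand-ins (each real SDK has its own call signature, which is exactly why the results get normalized into one shape):

```python
import asyncio

async def run_query_everywhere(clients, system_prompt, query):
    """Send one query to every platform at the same time.

    `clients` maps a platform name to an async callable taking
    (system_prompt, query) and returning the raw response text --
    a placeholder for the real per-platform SDK wrappers.
    """
    responses = await asyncio.gather(
        *(client(system_prompt, query) for client in clients.values()),
        return_exceptions=True,  # one slow or failed platform shouldn't sink the audit
    )
    # Normalize into a consistent {platform: text-or-None} structure
    return {
        name: (r if isinstance(r, str) else None)
        for name, r in zip(clients, responses)
    }
```

With `return_exceptions=True`, a platform outage becomes a None for that slot instead of an exception that aborts the other three calls, so the audit degrades gracefully rather than failing outright.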
While the audit runs, you watch a terminal-style animation. Each phase ticks through status lines as queries go out and results come in. You can see which query is being tested in real time. We wanted merchants to see the work happening, not stare at a loading spinner. When "Your Share of Voice: 0%" appears at the end, you already watched 32 queries come back empty. The score doesn't surprise you. It confirms what you just saw.
What we keep finding
We've been running audits for merchants across auto parts, supplements, marine, musical instruments, and industrial supplies. The pattern is consistent.
Shopify merchants show up. They have native ACP integration and their product data flows into ChatGPT automatically. BigCommerce merchants show up through Feedonomics. Everyone else — Magento, Miva, NetSuite, custom stacks — is a ghost.
A typical result for a mid-market merchant on Magento: 0% SOV across all four platforms, 8 queries tested, zero mentions. The top competitor (usually a Shopify-based retailer in the same vertical) comes back at 60-80%. That's not a gap. It's a wall.
The root cause is product data. These merchants have catalogs with thousands of SKUs, but the data isn't structured for AI. Missing fitment info, inconsistent units, pricing errors, certifications buried in free-text description fields. AI agents can't work with that. They skip it entirely and recommend whoever has clean data.
Merchants expect a low score. They don't expect zero. The moment that lands is when they expand a query result and read what ChatGPT actually told the "customer" — and see only competitors listed. That's when the conversation shifts from "should we look into AI commerce" to "how fast can we fix this."
Why we made it free
Every audit costs us money. Four API calls per query, eight queries per audit, plus the Perplexity intelligence phase and email infrastructure. It's not trivial.
But the audit does its own selling. When a merchant sees zero visibility while their competitor has 75%, they don't need a slide deck explaining why AI commerce matters. The data speaks for itself.
We also designed it to require almost nothing from the merchant. Enter a URL. Enter an email. Pick your vertical. That's it. No account creation, no credit card, no demo booking. The less friction, the more audits run, the more merchants see the gap.
The tool can also be embedded as an iframe on partner sites, which lets agencies and consultants offer AI visibility audits to their clients under their own brand. Same engine, different wrapper.
What happens after the audit
The audit shows the problem. FlowBlinq fixes it.
We take the merchant's existing catalog — on whatever platform they're running — and transform it into structured, semantic product data. Verified fitment, clean specs, proper categorization. Then we connect it to every AI commerce protocol: OpenAI's ACP for ChatGPT, Google's UCP for Gemini, Anthropic's MCP for Claude.
Four weeks from audit to live. The merchant's engineering team spends 5-10 hours total. No re-platforming. Your Magento stays your Magento. We sit between your platform and the AI agents, translating your catalog into the format each protocol requires.
But that's a different post. This one's about the audit. And the audit is free.
Do you know what ChatGPT says when a customer asks about your products?
We'll run the queries for you. Four platforms, eight customer questions, 90 seconds. You might not like the answer, but you need to see it.