The Support Mirage

30th January 2026 | Insights & Case Studies

Why AI systems fail when they trust user labels

A mid-market construction firm with £79m turnover and 297 staff faced a rebellion from its site managers. The complaints were constant and identical: ‘Connectivity’.

For six months, the IT Director tracked the metric. ‘Connectivity’ was the #1 tag in the helpdesk system, appearing in 200+ tickets. The signal seemed undeniable: the sites needed more bandwidth.

Leadership approved an £83k infrastructure upgrade. They deployed satellite internet and 5G boosters to all 15 active portacabins. It was expensive, high-spec and fast.

Two months later, the tickets had not stopped. Site managers were still furious. The hardware had not fixed the problem.

When the team finally sent a business analyst to a site, the reality became clear in ten minutes. The managers were not complaining about bandwidth speed. They were complaining that the document management app timed out whenever they tried to upload 500 photos in one go.

The tickets said “Can’t get upload to work, connection keeps dropping” and “Internet fails halfway through”. The keyword ‘Connectivity’ was accurate. The diagnosis was wrong. IT had thrown hardware at a software workflow problem.

The Cost: £83k in hardware leases, six months of operational drag, and a site team that stopped trusting IT to listen.
The Thesis: Users describe the fix they want. They rarely describe the failure mode. That gap creates expensive waste.

The Diagnosis

The team fell into The Support Mirage.

When people analyse 200+ requests, they look for shortcuts. They group everything containing the words ‘slow’, ‘wifi’ or ‘internet’ into a single bucket. That flattens distinct operational struggles into a generic label.

The approach felt rigorous. The team had data, volume metrics, and a clear signal. ‘Connectivity’ appeared 200+ times. Leadership could see the pattern in a dashboard. The decision to upgrade infrastructure looked evidence-based, not reactive. But volume is not validation. The keyword count proved people were frustrated. It did not prove what they were frustrated about.

The IT Director prioritised based on the volume of the keyword, not the nature of the block. A label is not a diagnosis. Your job is to capture action plus failure, not category. The root cause was not the network. It was a lack of interpretation.

The Solution

Phase 1: The Interpreter Agent

Stop counting keywords. Use an agent to translate user-speak into operational blocks.

Constraint: Analyse the struggle, not just the request.

Operational workflow

  • Export: Operations Manager exports tickets to CSV (Friday afternoon).
  • Process: Feed the ‘Description’ column into the Interpreter Agent.
  • Guardrail: If confidence is low (<0.7), route the item to a manual review bucket. Do not guess.
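The routing step above is the only deterministic part of the workflow and can be sketched in a few lines of Python. This is a minimal sketch: `route_tickets` and the threshold constant are illustrative names, and the interpreter call itself (an LLM request against the ‘Description’ column) is stubbed out as input data.

```python
# Route interpreted tickets: auto-accept confident results, park the rest
# for manual review. Assumes the Interpreter Agent returns dicts carrying
# a "confidence_score" field, per the output format defined in its prompt.
CONFIDENCE_THRESHOLD = 0.7

def route_tickets(interpreted, threshold=CONFIDENCE_THRESHOLD):
    """Split interpreter output into accepted and manual-review buckets."""
    accepted, manual_review = [], []
    for item in interpreted:
        if item.get("confidence_score", 0.0) >= threshold:
            accepted.append(item)
        else:
            manual_review.append(item)  # low confidence: do not guess
    return accepted, manual_review

# Illustrative interpreter output for two tickets; in practice this comes
# from the agent processing the exported 'Description' column.
interpreted = [
    {"id": "TICKET-123", "confidence_score": 0.9},
    {"id": "TICKET-124", "confidence_score": 0.55},
]
accepted, manual_review = route_tickets(interpreted)
```

The point of the guardrail is that low-confidence interpretations cost a human a minute of review, while a wrong interpretation silently poisons the Phase 2 clusters.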

Interpreter system prompt

You are an Operations Interpreter.

Your goal is to translate user requests into “Operational Blocks”.
Input: Raw ticket text.
Output: A structured analysis of the underlying failure.
Rules:
1. Note what the user thinks is wrong (their diagnosis).
2. Look for the action they were trying to take.
3. Identify where the workflow actually broke.
4. Compare: Is their diagnosis accurate or a symptom?
5. Extract 2 short quotes as evidence.
6. If evidence is missing, state what you would ask next.

Format your output as a JSON object.
{
  "id": "TICKET-123",
  "user_diagnosis": "Connection keeps dropping",
  "user_action": "Uploading photos",
  "failure_point": "App timeout during upload",
  "diagnosis_accuracy": "Symptom, not cause",
  "evidence_quotes": ["it spins for 10 mins", "fails at 99%"],
  "confidence_score": 0.9,
  "unknowns": null
}
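Before handing results to Phase 2, it is worth rejecting malformed agent output rather than trusting it. A minimal schema check against the fields above, assuming the validator function name (which is illustrative):

```python
import json

# Field names taken from the Interpreter Agent's output format.
REQUIRED_FIELDS = {
    "id", "user_diagnosis", "user_action", "failure_point",
    "diagnosis_accuracy", "evidence_quotes", "confidence_score", "unknowns",
}

def parse_interpreter_output(raw: str) -> dict:
    """Parse the agent's JSON reply; reject output missing required fields."""
    data = json.loads(raw)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"Interpreter output missing fields: {sorted(missing)}")
    return data

# Illustrative reply matching the sample output above.
sample = json.dumps({
    "id": "TICKET-123",
    "user_diagnosis": "Connection keeps dropping",
    "user_action": "Uploading photos",
    "failure_point": "App timeout during upload",
    "diagnosis_accuracy": "Symptom, not cause",
    "evidence_quotes": ["it spins for 10 mins", "fails at 99%"],
    "confidence_score": 0.9,
    "unknowns": None,
})
parsed = parse_interpreter_output(sample)
```

A reply that fails to parse or drops a field goes to the same manual-review bucket as a low-confidence score.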

Phase 2: The Cluster Agent

People struggle to spot workflow patterns across 200+ requests. Agents do this well.

Constraint: Group by root cause and assign an owner.

Operational workflow

  • Feed: Input the JSON from Phase 1 into the Cluster Agent.
  • Synthesis: The agent groups items by root cause and summarises each block.
  • Output: A prioritised list of operational blocks, with owners and next steps.
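When the Phase 1 output is clean, deterministic code can approximate the grouping step: group on `failure_point` and rank clusters by frequency, which this case study treats as one input to the Impact score. A sketch with an illustrative function name; a real Cluster Agent also merges near-duplicate root causes, which exact key matching cannot.

```python
from collections import defaultdict

def cluster_by_root_cause(interpreted):
    """Group interpreted tickets by failure_point; largest clusters first."""
    clusters = defaultdict(list)
    for item in interpreted:
        clusters[item["failure_point"]].append(item)
    # Frequency as a first proxy for impact; severity weighting comes later.
    return sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)

# Illustrative Phase 1 output: two upload timeouts, one genuine signal drop.
interpreted = [
    {"id": "T1", "failure_point": "App timeout during upload"},
    {"id": "T2", "failure_point": "App timeout during upload"},
    {"id": "T3", "failure_point": "Signal drop during peak usage"},
]
ranked = cluster_by_root_cause(interpreted)
```

Note that the grouping key is the interpreted failure, never the user's original label; that is the entire correction to the keyword-counting approach.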

Cluster system prompt

You are a Pattern Recognition agent.

Input: List of interpreted tickets.
Task: Group these into “Operational Blocks”.

Rules:
1. Group by root cause, not the user’s department or diagnosis.
2. Assign an OWNER from (IT, Vendor, Training, Ops, Finance Ops, RevOps, People Ops, CX Ops).
3. Score IMPACT (1-5) based on frequency and severity.
4. Score EFFORT (1-5) to fix.
5. Flag items where user diagnosis was accurate (genuine infrastructure issues).

Output format:

## Cluster Name (Count | Impact: X/5 | Effort: Y/5)

The Operational Block: [One sentence summary]
Owner: [Department/Role]
Recommended next step: [One line action]
Evidence:
– “[Quote 1]”
– “[Quote 2]”

Sample output

Cluster A: Photo upload timeouts (114 tickets | Impact: 5/5 | Effort: 2/5)
The Operational Block: The app cannot complete large uploads reliably on 4G because it lacks background sync or offline handling.
Owner: Software vendor or Product Manager
Recommended next step: Enable offline mode or background sync, or change the upload workflow to queue in the background.
Evidence:
– “It spins for 10 minutes”
– “Fails at 99%”

Cluster B: Genuine connectivity failures (47 tickets | Impact: 3/5 | Effort: 4/5)
The Operational Block: Sites 3, 7, and 12 experience signal drops during high-usage periods.
Owner: IT Infrastructure or Network Provider
Recommended next step: Deploy signal monitoring and investigate provider SLAs for affected sites.

Cluster C: Training gaps and user error (39 tickets | Impact: 2/5 | Effort: 1/5)
The Operational Block: Users are not compressing files before upload or using the wrong upload method.
Owner: Training or Site Operations
Recommended next step: Create quick reference guide and run 15-minute site training sessions.

The team realises that 57% of the ‘Connectivity’ tickets stem from a software workflow problem, not a bandwidth problem. Another 23% are genuine infrastructure failures that the upgrade may actually have helped. The remaining 20% are training gaps that no hardware or software change would fix.

Phase 3: The Validation Protocol

Validate the block before buying the fix.

Constraint: No spend approval until the failure is reproduced and logged.

Operational workflow

  • Technical verify: Reproduce the reported failure mode (e.g. upload 500 photos on a 4G connection). Confirm it fails as described and capture evidence (logs, timestamps, error states).
  • User verify: Email 5 users from the top cluster and ask: “If the app continued uploading photos in the background while you did other work, would that solve your issue?”

If the failure reproduces and 5/5 users confirm the fix, you solve the problem without additional hardware spend.
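The approval gate above reduces to a single predicate: both checks must pass, with all five sampled users confirming. A minimal sketch (the function name is illustrative):

```python
def spend_approved(failure_reproduced: bool, user_confirmations: list) -> bool:
    """Approve spend only when the failure was reproduced and logged, AND
    every one of at least five sampled users confirmed the proposed fix
    would solve their issue."""
    return (
        failure_reproduced
        and len(user_confirmations) >= 5
        and all(user_confirmations)
    )

# A reproduced failure plus five confirmations clears the gate.
result = spend_approved(True, [True, True, True, True, True])
```

One dissenting user or one failed reproduction is enough to block spend, which is the point: the gate is deliberately conservative because the downside it guards against is an £83k reflex purchase.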

Executive Takeaway

Teams ask for resources (people, budget, hardware), but they usually need friction removed. Manual categorisation fails because it groups requests by the user’s label (‘Connectivity’) rather than the failure mode (‘Upload timeout’). That triggers expensive, reflex spending on symptoms. Use agents to decode the action the user was taking when it failed, then group by root cause so you fix the workflow, not the resource gap.
