If your AI pilot has failed, the first thing to know is that you are firmly in the majority, and that most failed pilots are recoverable. By recent counts, 95% of generative AI pilots produce no measurable return (MIT, 2025), and 42% of companies surveyed scrapped most of their AI initiatives in 2025, up from 17% the year before (S&P Global). The failure is rarely the technology. It is almost always something fixable in how the project was scoped, governed, and built. Here is how to work out what went wrong, and how to get the value you were promised.
It almost certainly wasn't the AI
When a pilot dies, the instinct is to blame the model. Wrong model, not smart enough, not ready yet. Almost always, that is not what happened. The predictable causes are the same handful every time: a use case chosen because it was exciting rather than valuable, data and retrieval quality that could not survive real inputs, no clear definition of success, and no plan for the unglamorous engineering between a demo and production. I have written separately about why AI projects fail and what the 20% that succeed do differently. For now the important point is simpler: a failed pilot is usually a fixable pilot.
How to recover a failed AI pilot
- Diagnose before you spend another pound. The most expensive mistake is to pour more money in the same direction before understanding why the first attempt failed. Start with an honest post-mortem of scope, data, metric, and engineering, not a new vendor.
- Separate the idea from the execution. A pilot can fail because the idea was never going to pay off, or because a good idea was built badly. These need opposite responses, so decide which one you are looking at before anything else.
- Find the real success metric. Many pilots fail because nobody agreed what success looked like in numbers. If you cannot say what the pilot was supposed to move on the P&L, that is the first thing to fix, not the model.
- Decide: salvage or restart. If the use case maps to genuine value and a clear metric, the execution can usually be rebuilt on the same foundation. If the use case never had a path to value, restart from the problem, not the prototype.
- Rebuild the unglamorous parts. Most of the value lives in the layers nobody demos: retrieval, data quality, error handling, escalation, and the workflow redesign around the model. That is where the recovery effort should go.
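The triage in the steps above can be written down as a small decision rule. This is only a sketch: `PilotDiagnosis`, its fields, and `recommend` are hypothetical names invented for illustration, and the yes/no answers stand in for judgments that a real post-mortem would have to argue for with evidence.

```python
from dataclasses import dataclass


@dataclass
class PilotDiagnosis:
    """Answers from an honest post-mortem (hypothetical fields, for illustration)."""
    use_case_has_path_to_value: bool      # does the idea map to genuine value?
    success_metric_agreed: bool           # was success defined in numbers?
    data_survives_real_inputs: bool       # did data and retrieval hold up outside the demo?
    production_engineering_planned: bool  # error handling, escalation, workflow redesign


def recommend(d: PilotDiagnosis) -> str:
    """Return a rough salvage-or-restart recommendation."""
    # A use case with no path to value cannot be rescued by better execution.
    if not d.use_case_has_path_to_value:
        return "restart from the problem"
    # Without an agreed metric, nobody can tell whether a rebuild worked.
    if not d.success_metric_agreed:
        return "agree the success metric first"
    # Good idea, clear metric: what remains are execution gaps to rebuild.
    gaps = []
    if not d.data_survives_real_inputs:
        gaps.append("data and retrieval")
    if not d.production_engineering_planned:
        gaps.append("production engineering")
    if gaps:
        return "salvage: rebuild " + ", ".join(gaps)
    # Everything looks fine on paper, yet the pilot failed: the diagnosis is incomplete.
    return "salvage: no execution gap found, re-examine the diagnosis"


if __name__ == "__main__":
    # A good idea with a clear metric, built badly.
    print(recommend(PilotDiagnosis(True, True, False, False)))
```

The order of the checks is the point: the idea is tested before the execution, and the metric before the engineering, so that money is not spent rebuilding something nobody can measure.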
Who should run the recovery, and who shouldn't
Here is the part most people get wrong. The worst-placed people to recover your failed AI pilot are usually the ones who built it. Not because they are not capable, but because their incentives and their blind spots both point the wrong way. The firm that built the pilot has a commercial interest in selling you more of the same, and a natural reluctance to name the decisions that caused the failure, because they made them.
| | The consultancy that built it | An independent fractional CTO |
|---|---|---|
| Incentive | Sell more build hours | Get you to a working outcome |
| Blind spot | Owns the decisions that failed | No stake in the original build |
| Salvage-or-restart call | Biased toward "keep building" | Decided on the merits |
| Accountability | To their own pipeline | To your result |
This is the same conflict of interest that runs through agency-supplied technology leadership. For a recovery, where the whole job is an unflinching diagnosis, independence is not a nice-to-have. It is the point.
What recovery actually looks like
None of this is exotic. The European insurance brokerage I worked with reached 67% autonomous resolution on real customer cases, not because of a cleverer model than anyone else had, but because the unglamorous parts were done properly: retrieval engineered as a first-class problem, conservative escalation, a metric everyone agreed on, and the workflow rebuilt around the system rather than bolted on top. A recovered pilot looks the same. It is the demo plus all the work the demo skipped.
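To make "autonomous resolution" and "conservative escalation" concrete, here is one way such a metric could be computed. It is a sketch under assumptions, not the brokerage's actual system: the `Case` fields, the 0.9 confidence threshold, and the definition of the rate (cases the system closed itself, correctly, divided by all cases) are illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class Case:
    confidence: float     # hypothetical model confidence in its own answer, 0..1
    answer_correct: bool  # judged afterwards by a human reviewer


def should_escalate(case: Case, threshold: float = 0.9) -> bool:
    """Conservative escalation: hand off to a human unless confidence is high."""
    return case.confidence < threshold


def autonomous_resolution_rate(cases: list[Case], threshold: float = 0.9) -> float:
    """Share of all cases the system resolved correctly without a human."""
    if not cases:
        return 0.0
    resolved = sum(
        1 for c in cases
        if not should_escalate(c, threshold) and c.answer_correct
    )
    return resolved / len(cases)
```

Counting only correct, unescalated cases against the full caseload keeps the number honest: escalating less raises the rate only if the answers the system keeps for itself are actually right.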
A dead pilot feels like a write-off, and the board will be tempted to treat it as one. Usually it is not. It is a first draft that taught you, at some expense, exactly which of the four usual causes went wrong: the use case, the data, the success metric, or the engineering. That is worth more than it feels like in the moment, as long as the diagnosis is honest and the recovery is run by someone whose only interest is your result.
If you have an AI pilot that stalled or failed and you want an independent read on whether to salvage it or start again, that is exactly the kind of diagnosis I do. Let's talk.
Related: How to unlock AI ROI: what the 20% do differently · The hidden conflict of interest in hiring a fractional CTO · Stop counting AI use cases. Redesign three value streams instead