Summary

84% of developers now use AI coding tools, but adopting AI and adopting it well are different things. Adoption alone means buying licences and tracking logins, which produces 90% usage and no P&L movement. Adopting well means rebuilding the work around the tool, measuring outcomes rather than activity, and matching rollout speed to the pace of cultural change. MIT researchers found that 95% of enterprise generative AI deployments had no measurable impact on profit and loss, because most firms treat AI as a procurement decision rather than an operating-model one. The companies getting real returns start from a business problem, redesign the workflow before introducing the tool, keep humans in the loop for judgement calls, and stage the rollout to match the pace of change.

Here are four numbers that should not coexist:

  • 84% of developers now use AI coding tools
  • 29% trust the accuracy of what those tools produce
  • 10% is the average productivity gain, unchanged for a year
  • 55% of companies that laid off workers for AI now regret it

That's not an AI problem. That's an adoption problem. And it's the most expensive kind: the kind where you've already spent the money and aren't getting the return. It's also the most recoverable kind, because the fix doesn't need a bigger budget or a better model. It needs the return unlocked from spend you've already committed.

The procurement fallacy

Most organisations treat AI adoption as a procurement decision. Buy Copilot licences. Roll them out. Track adoption rates. Report to the board that 90% of engineers are using AI tools.

This is the equivalent of buying gym memberships for the entire company and measuring success by how many people scanned their badges at the door. You're measuring presence, not results.

MIT researchers found that 95% of enterprise generative AI deployments had no measurable impact on profit and loss. Not "small impact." No measurable impact. Forrester found that over half of companies that cut staff in the name of AI efficiency are now quietly rehiring, because it turns out that removing humans from processes that AI can't actually handle creates problems that are more expensive than the savings.

The founders and small to mid-market teams I work with that are genuinely getting value from AI didn't start by buying tools. They started by asking a different question entirely.

What "doing it badly" looks like

The pattern is remarkably consistent.

Stage 1: Tool deployment. The company buys AI coding tool licences. There's an announcement. Maybe a lunch-and-learn. Developers start using autocomplete suggestions.

Stage 2: Adoption theatre. Usage numbers go up. The team reports feeling more productive. Internal surveys are positive. The board gets a slide showing 85% adoption.

Stage 3: The invisible costs accumulate. Code review becomes cursory. Developers rubber-stamp AI suggestions they should scrutinise. Technical debt accumulates faster. CodeRabbit's analysis of 470 open-source repositories found AI-generated code contains 1.7x more bugs, 75% more logic errors, and 2.74x more security vulnerabilities. GitClear tracked a doubling of code churn (lines reverted or updated within two weeks) compared to the pre-AI baseline. Copy-pasted code increased 48%.

Stage 4: The reckoning. Defect rates climb. Production incidents increase. The productivity gains that everyone felt never materialise in the actual delivery metrics. But by now, the organisation has restructured around the assumption that AI is working. Unwinding is painful.

I've seen this play out at multiple organisations. The specific tools vary. The pattern doesn't.
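The churn figure in Stage 3 is easy to track yourself. Here is a minimal sketch, assuming you can export one record per added line with the time it was next reverted or updated. The LineRecord type, the churn_rate function, and the sample data are hypothetical illustrations, not GitClear's method.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# "Within two weeks", following the definition of churn used above.
CHURN_WINDOW = timedelta(days=14)


@dataclass
class LineRecord:
    """Hypothetical record of one added line and when it was next changed."""
    added_at: datetime
    changed_at: Optional[datetime]  # None if never reverted or updated


def churn_rate(lines: list[LineRecord]) -> float:
    """Share of added lines reverted or updated within the churn window."""
    if not lines:
        return 0.0
    churned = sum(
        1
        for line in lines
        if line.changed_at is not None
        and line.changed_at - line.added_at <= CHURN_WINDOW
    )
    return churned / len(lines)


# Made-up sample data, for illustration only.
start = datetime(2025, 1, 6)
sample = [
    LineRecord(start, start + timedelta(days=3)),   # rewritten after 3 days: churn
    LineRecord(start, start + timedelta(days=40)),  # changed much later: not churn
    LineRecord(start, None),                        # never touched again
    LineRecord(start, start + timedelta(days=14)),  # changed on day 14: churn
]
print(f"churn rate: {churn_rate(sample):.0%}")  # prints "churn rate: 50%"
```

Tracked per quarter, a number like this shows whether code written with AI assistance is surviving review or being quietly rewritten.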

What "doing it well" looks like

The companies getting genuine returns from AI share characteristics that have nothing to do with which model they're using or how much they're spending on compute.

They changed the workflow, not just the tooling

Deloitte's research shows that organisations taking a work-redesign approach, rethinking processes before selecting tools, are twice as likely to exceed ROI targets as those starting with the technology.

This means something concrete. It means you don't give developers AI tools and tell them to keep working the same way. You redesign how work flows through the team. At Anthropic, the Claude Code team maintains structured documentation files that tell the AI agent what the codebase conventions are, what mistakes to avoid, and how to structure changes. That's not a tool feature. It's a workflow decision. And it improves output quality by 2-3x.

At Shopify, Tobi Lutke's internal memo was explicit: "Reflexive AI usage is now a baseline expectation." But the meaningful part wasn't the mandate. It was the structural change. Teams must demonstrate why they cannot achieve what they need using AI before requesting additional headcount. That forces a genuine rethink of how work gets divided between humans and tools, rather than bolting AI onto the existing structure.

They measure outcomes, not activity

DX surveyed 121,000 developers across 450+ companies and found that the organisations seeing real gains aren't the ones with the highest adoption rates. They're the ones measuring business-level outcomes: cycle time, defect rate, time from commit to production, customer-facing incident frequency.

Some companies in the survey saw twice as many customer-facing incidents after AI adoption. Others saw a 50% reduction. The difference wasn't the tools. It was whether the organisation had built quality gates that matched the new workflow.

Laura Tacho, CTO at DX, put it precisely: "In struggling organisations, AI tends to highlight existing flaws rather than fix them."
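To make the contrast with adoption rates concrete, here is a rough sketch of an outcome report built from delivery records. It assumes you can export, for each shipped change, the commit time, the deploy time, and whether it was later linked to a customer-facing incident. The Change type, the function names, and all the numbers are hypothetical and made up for illustration; they are not DX's methodology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Change:
    """Hypothetical record of one shipped change."""
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool  # later linked to a customer-facing incident


def outcome_metrics(changes):
    """Median commit-to-production time (hours) and incident rate per change."""
    if not changes:
        raise ValueError("need at least one change to compute metrics")
    lead_hours = [
        (c.deployed_at - c.committed_at).total_seconds() / 3600 for c in changes
    ]
    return {
        "median_commit_to_prod_hours": median(lead_hours),
        "incident_rate": sum(c.caused_incident for c in changes) / len(changes),
    }


def make_change(lead_hours, caused_incident):
    """Build an illustrative change with the given lead time in hours."""
    committed = datetime(2025, 3, 3, 9, 0)
    return Change(committed, committed + timedelta(hours=lead_hours), caused_incident)


# Made-up periods before and after an AI rollout.
before = outcome_metrics(
    [make_change(48, False), make_change(72, True), make_change(24, False)]
)
after = outcome_metrics(
    [make_change(12, True), make_change(24, False),
     make_change(36, True), make_change(20, False)]
)

for label, metrics in (("before", before), ("after", after)):
    print(label, metrics)
```

In this invented data the median commit-to-production time falls from 48 to 22 hours while the incident rate rises from a third to a half. An adoption dashboard would report that as success; an outcome report shows the quality gates have not kept up.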

They invested in agentic workflows, not just autocomplete

This is the biggest differentiator I see. The 19% slowdown that METR measured in its study involved developers using autocomplete-style tools on familiar codebases. The 5-10x gains that companies like StrongDM, Courier, and Anthropic report come from agentic workflows, where AI agents take on entire tasks rather than just suggesting the next line of code.

The shift from autocomplete to agentic isn't just a tool upgrade. It requires fundamentally different skills from the developer: the ability to decompose problems into well-scoped tasks, write clear specifications, evaluate output critically, and maintain architectural coherence across AI-generated code. That's a training investment, not a licence purchase.

They kept humans in the loop, deliberately

Duolingo made headlines with a policy that only permits hiring if work cannot be automated. It's a bold position, and I understand the logic. But the companies I've seen get the best results take a more nuanced approach: they define specifically which decisions AI makes autonomously, which require human review, and which remain entirely human.

This isn't governance for governance's sake. It's engineering. The insurance brokerage I worked with achieved 67% autonomous case resolution, but the 33% that required human handling was designed with as much care as the automated portion. The handoff included full context, the agent's reasoning, and the information it had retrieved. That's what made the whole system trustworthy enough to scale. The same principle held at NHS Wales, where the governance structures weren't optional. They were the foundation for a transformation programme that identified £20M+ in savings opportunities.
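The three tiers and the handoff described above can be sketched as a small routing policy. This is an illustration under stated assumptions, not the brokerage's actual system: the decision types, the 0.8 confidence floor, and every name in the code are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    AUTONOMOUS = "autonomous"      # AI decides and acts
    HUMAN_REVIEW = "human_review"  # AI proposes, a person approves
    HUMAN_ONLY = "human_only"      # AI only prepares context


# Hypothetical policy table: which tier each decision type belongs to.
POLICY = {
    "address_change": Route.AUTONOMOUS,
    "claim_payout": Route.HUMAN_REVIEW,
    "cancellation_dispute": Route.HUMAN_ONLY,
}

CONFIDENCE_FLOOR = 0.8  # assumed threshold below which autonomy is withdrawn


@dataclass
class Handoff:
    """What a person receives: context, the agent's reasoning, retrieved info."""
    case_id: str
    route: Route
    context: str
    agent_reasoning: str
    retrieved: list = field(default_factory=list)


def route_case(case_id, decision_type, confidence, context, reasoning, retrieved):
    """Pick a tier for the case and package everything a human would need."""
    # Decision types missing from the policy default to the most cautious tier.
    route = POLICY.get(decision_type, Route.HUMAN_ONLY)
    # A low-confidence autonomous decision is escalated to human review.
    if route is Route.AUTONOMOUS and confidence < CONFIDENCE_FLOOR:
        route = Route.HUMAN_REVIEW
    return Handoff(case_id, route, context, reasoning, list(retrieved))


handoff = route_case(
    "case-001",
    "address_change",
    confidence=0.55,
    context="Customer asked to update a postal address",
    reasoning="New address does not match the format on file",
    retrieved=["policy record", "previous address"],
)
print(handoff.route.value)  # prints "human_review"
```

The design choice worth copying is the default: anything the policy does not name goes to a person, and every handoff carries the agent's reasoning rather than just its answer.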

The Shopify paradox

Shopify is doing something interesting that illustrates the tension well. On one hand, Lutke's memo says: prove AI can't do it before we hire a human. On the other hand, Shopify is hiring 1,000 interns specifically for "AI-native thinking."

That's not a contradiction. It's the right strategic move. They're reducing headcount for tasks AI genuinely handles well, while simultaneously investing in developing people who can work with AI effectively. They understand that the tool is only as good as the human directing it, and that developing that human capability requires deliberate investment.

Most organisations are doing only the first half: cutting costs. Very few are doing the second half: building capability. And the second half is where the long-term competitive advantage lives.

What I'd ask any engineering leader

If you're running an engineering organisation and you've deployed AI tools, three questions will tell you whether you're adopting well or just adopting:

Are you measuring outcomes or adoption? If your primary metric is "percentage of developers using AI tools," you're measuring the wrong thing. If you can point to specific improvements in cycle time, defect rate, or delivery throughput, and attribute them to AI, you're on the right track.

Have you changed the workflow or just added a tool? If developers are working the same way they did before, with AI suggestions layered on top, you're in the autocomplete trap. The real gains come from redesigning how work is structured, delegated, and reviewed.

Are you developing AI fluency or just providing AI access? The gap between developers who are slower with AI tools and developers who are 10x more productive is not about the tools. It's about skill, method, and workflow design. That requires investment: in training, in mentorship, in giving people time to develop new ways of working.

The good news is that the organisations doing this well are pulling ahead fast. The bad news is that most organisations haven't started. If you're not sure where your organisation falls, the AI readiness diagnostic will give you a clear picture in 15 minutes.


If you want to figure out where your AI investment is actually delivering and where it's theatre, get in touch.


Frequently asked questions

What is the difference between adopting AI and adopting it well?
Adoption is buying tools and tracking usage. Adopting well is rebuilding the work around the tool, measuring outcome metrics rather than activity, and matching the speed of rollout to the speed of cultural change. The first produces 90% adoption rates and no P&L movement. The second produces measurable, audited business impact.

Why does 95% of enterprise generative AI produce no measurable profit impact?
MIT's research found that most organisations treat AI as a procurement decision rather than an operating-model decision. They buy licences, measure logins, and skip the process redesign that actually creates value. Without redesigning the work, AI tools layer on top of existing inefficiencies and amplify them rather than removing them.

Why are companies that laid off staff for AI rehiring them?
Forrester reports that over half of companies that cut staff in the name of AI efficiency are now quietly rehiring. The pattern is the same in each case: leadership removed humans from processes that AI cannot actually handle end-to-end. The cost of the resulting errors, escalations, and rework exceeded the salary savings.

What does AI adoption theatre look like?
Four stages: tool deployment, adoption metrics rising, invisible costs accumulating (cursory code review, rubber-stamped suggestions, 1.7x more bugs, 75% more logic errors, 2.74x more security vulnerabilities per CodeRabbit research), then a reckoning when defect rates climb and the productivity gains never materialise in delivery metrics.

What separates the companies getting real AI returns?
They start from a business problem rather than a tool, they redesign the workflow before introducing the tool, they measure outcome metrics (cost per case, conversion rate, handling time) rather than activity, they keep humans in the loop for the cases that need judgement, and they stage the rollout against observed numbers rather than promised ones.