Here are four numbers that should not coexist:
- 84% of developers now use AI coding tools
- 29% trust the accuracy of what those tools produce
- 10% is the average productivity gain, unchanged for a year
- 55% of companies that laid off workers for AI now regret it
That's not an AI problem. That's an adoption problem. And it's the most expensive kind: the kind where you've already spent the money and aren't getting the return.
The procurement fallacy
Most organisations treat AI adoption as a procurement decision. Buy Copilot licences. Roll them out. Track adoption rates. Report to the board that 90% of engineers are using AI tools.
This is the equivalent of buying gym memberships for the entire company and measuring success by how many people scanned their badges at the door. You're measuring presence, not results.
MIT researchers found that 95% of enterprise generative AI deployments had no measurable impact on profit and loss. Not "small impact." No measurable impact. Forrester found that over half of companies that cut staff in the name of AI efficiency are now quietly rehiring, because it turns out that removing humans from processes that AI can't actually handle creates problems that are more expensive than the savings.
The organisations I work with that are genuinely getting value from AI didn't start by buying tools. They started by asking a different question entirely.
What "doing it badly" looks like
The pattern is remarkably consistent.
Stage 1: Tool deployment. The company buys AI coding tool licences. There's an announcement. Maybe a lunch-and-learn. Developers start using autocomplete suggestions.
Stage 2: Adoption theatre. Usage numbers go up. The team reports feeling more productive. Internal surveys are positive. The board gets a slide showing 85% adoption.
Stage 3: The invisible costs accumulate. Code review becomes cursory. Developers rubber-stamp AI suggestions they should scrutinise. Technical debt accumulates faster. CodeRabbit's analysis of 470 open-source repositories found AI-generated code contains 1.7x more bugs, 75% more logic errors, and 2.74x more security vulnerabilities. GitClear tracked a doubling of code churn (lines reverted or updated within two weeks) compared to the pre-AI baseline. Copy-pasted code increased 48%.
Stage 4: The reckoning. Defect rates climb. Production incidents increase. The productivity gains that everyone felt never materialise in the actual delivery metrics. But by now, the organisation has restructured around the assumption that AI is working. Unwinding is painful.
I've seen this play out at multiple organisations. The specific tools vary. The pattern doesn't.
What "doing it well" looks like
The companies getting genuine returns from AI share characteristics that have nothing to do with which model they're using or how much they're spending on compute.
They changed the workflow, not just the tooling
Deloitte's research shows that organisations taking a work-redesign approach, rethinking processes before selecting tools, are twice as likely to exceed ROI targets as those that start with the technology.
This means something concrete. It means you don't give developers AI tools and tell them to keep working the same way. You redesign how work flows through the team. At Anthropic, the Claude Code team maintains structured documentation files that tell the AI agent what the codebase conventions are, what mistakes to avoid, and how to structure changes. That's not a tool feature. It's a workflow decision. And it improves output quality by 2-3x.
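A hedged sketch of what such a conventions file might contain. The file name follows Claude Code's convention, but the contents here are invented for illustration, not taken from Anthropic's actual files:

```markdown
# CLAUDE.md — conventions for AI-assisted changes (illustrative example)

## Codebase conventions
- TypeScript strict mode; no `any` without a justifying comment.
- All database access goes through `src/db/repository.ts`; never query
  directly from request handlers.

## Known pitfalls
- Do not edit generated files under `src/generated/`.
- The billing module has side effects on import; avoid refactoring it
  without its integration tests.

## How to structure changes
- One logical change per commit, with a failing test added first.
- Run `npm run lint && npm test` before proposing a diff.
```

The point is that this is knowledge the team already holds informally; writing it down is the workflow decision, and the quality gain comes from the agent no longer having to guess it.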
At Shopify, Tobi Lutke's internal memo was explicit: "Reflexive AI usage is now a baseline expectation." But the meaningful part wasn't the mandate. It was the structural change. Teams must demonstrate why they cannot achieve what they need using AI before requesting additional headcount. That forces a genuine rethink of how work gets divided between humans and tools, rather than bolting AI onto the existing structure.
They measure outcomes, not activity
DX surveyed 121,000 developers across 450+ companies and found that the organisations seeing real gains aren't the ones with the highest adoption rates. They're the ones measuring business-level outcomes: cycle time, defect rate, time from commit to production, customer-facing incident frequency.
Some companies in the survey saw twice as many customer-facing incidents after AI adoption. Others saw a 50% reduction. The difference wasn't the tools. It was whether the organisation had built quality gates that matched the new workflow.
Laura Tacho, CTO at DX, put it precisely: "In struggling organisations, AI tends to highlight existing flaws rather than fix them."
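The outcome-level metrics DX describes can be computed from data most teams already export. A minimal sketch in Python, assuming per-change records with commit and deploy timestamps; the record shape and field names are illustrative, not any vendor's schema:

```python
from datetime import datetime
from statistics import median

# Illustrative delivery records: commit and production-deploy timestamps.
changes = [
    {"commit": datetime(2025, 3, 3, 9, 0), "deployed": datetime(2025, 3, 4, 15, 0)},
    {"commit": datetime(2025, 3, 5, 11, 0), "deployed": datetime(2025, 3, 5, 17, 30)},
    {"commit": datetime(2025, 3, 6, 10, 0), "deployed": datetime(2025, 3, 9, 12, 0)},
]

def median_cycle_time_hours(records):
    """Median time from commit to production, in hours."""
    durations = [(r["deployed"] - r["commit"]).total_seconds() / 3600
                 for r in records]
    return median(durations)

def incident_rate(incidents, deploys):
    """Customer-facing incidents per deploy: the quality gate that
    separates the 2x-incidents companies from the 50%-reduction ones."""
    return incidents / deploys if deploys else 0.0

print(f"median cycle time: {median_cycle_time_hours(changes):.1f}h")
print(f"incident rate: {incident_rate(incidents=4, deploys=50):.2f} per deploy")
```

Tracked before and after an AI rollout, these numbers answer the question the adoption-rate slide cannot: did delivery actually improve?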
They invested in agentic workflows, not just autocomplete
This is the biggest differentiator I see. The 19% slowdown METR found in its study came from developers using autocomplete-style tools on familiar codebases. The 5-10x gains that companies like StrongDM, Courier, and Anthropic report come from agentic workflows, where AI agents take on entire tasks rather than just suggesting the next line of code.
The shift from autocomplete to agentic isn't just a tool upgrade. It requires fundamentally different skills from the developer: the ability to decompose problems into well-scoped tasks, write clear specifications, evaluate output critically, and maintain architectural coherence across AI-generated code. That's a training investment, not a licence purchase.
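To make "well-scoped" concrete, here is what such a task specification might look like. The task, file paths, and acceptance criteria are invented for illustration:

```markdown
## Task: add rate limiting to the public search endpoint

Scope: `src/api/search.ts` only; do not touch auth middleware.

Requirements:
- 60 requests/minute per API key, sliding window.
- Return HTTP 429 with a `Retry-After` header when the limit is exceeded.

Acceptance criteria:
- New unit tests cover the limit boundary (59th, 60th, 61st request).
- Existing search tests pass unchanged.

Out of scope: distributed rate limiting; per-user (vs per-key) limits.
```

Writing specs at this level of precision is exactly the skill the licence purchase doesn't buy.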
They kept humans in the loop, deliberately
Duolingo made headlines with a policy that only permits hiring if work cannot be automated. It's a bold position, and I understand the logic. But the companies I've seen get the best results take a more nuanced approach: they define specifically which decisions AI makes autonomously, which require human review, and which remain entirely human.
This isn't governance for governance's sake. It's engineering. The insurance brokerage I worked with achieved 67% autonomous case resolution, but the 33% that required human handling was designed with as much care as the automated portion. The handoff included full context, the agent's reasoning, and the information it had retrieved. That's what made the whole system trustworthy enough to scale.
The Shopify paradox
Shopify is doing something interesting that illustrates the tension well. On one hand, Lutke's memo says prove AI can't do it before we hire a human. On the other hand, Shopify is hiring 1,000 interns specifically for "AI-native thinking."
That's not a contradiction. It's the right strategic move. They're reducing headcount for tasks AI genuinely handles well, while simultaneously investing in developing people who can work with AI effectively. They understand that the tool is only as good as the human directing it, and that developing that human capability requires deliberate investment.
Most organisations are doing only the first half: cutting costs. Very few are doing the second half: building capability. And the second half is where the long-term competitive advantage lives.
What I'd ask any engineering leader
If you're running an engineering organisation and you've deployed AI tools, three questions will tell you whether you're adopting well or just adopting:
Are you measuring outcomes or adoption? If your primary metric is "percentage of developers using AI tools," you're measuring the wrong thing. If you can point to specific improvements in cycle time, defect rate, or delivery throughput, and attribute them to AI, you're on the right track.
Have you changed the workflow or just added a tool? If developers are working the same way they did before, with AI suggestions layered on top, you're in the autocomplete trap. The real gains come from redesigning how work is structured, delegated, and reviewed.
Are you developing AI fluency or just providing AI access? The gap between developers who are slower with AI tools and developers who are 10x more productive is not about the tools. It's about skill, method, and workflow design. That requires investment: in training, in mentorship, in giving people time to develop new ways of working.
The good news is that the organisations doing this well are pulling ahead fast. The bad news is that most organisations haven't started.
If you want to figure out where your AI investment is actually delivering and where it's theatre, get in touch.
Related: Most AI transformations are performance art · Why 80% of AI projects fail to deliver ROI · How I approach AI transformation