At Google Cloud Next 2026, Sundar Pichai announced that 75% of Google's new code is now AI-generated and approved by engineers, up from 50% last autumn. The number was repeated by every AI newsletter, every industry analyst, and a quarter of the LinkedIn posts I saw that week.

Here is the inconvenient question almost nobody asked. So what?

What does that number actually tell you about Google's engineering output? Does it ship more product? Does the product have fewer defects? Is the customer experience better? Did the engineering organisation get cheaper to operate, or more expensive? Is morale up or down? Are good engineers staying or leaving?

The 75% number does not answer any of those questions. It is a measurement of inputs, presented as if it were a measurement of outcomes. And it is about to become the most reported, least useful metric in enterprise engineering.

Why this metric is irresistible to executives

I understand the appeal. AI-generated code percentage is a single number that goes up over time, sounds impressive on an earnings call, and feels like progress. CFOs and boards have been waiting for someone to give them a quantified answer to "is the AI investment working?" Engineering leaders, sensing the pressure, are happy to provide one.

The trouble is that this number measures the activity, not the value. It is the engineering equivalent of measuring sales effectiveness by counting how many emails the team sent. There is a correlation, on average, between effort and outcome. There is not a useful one between AI-generated lines and shipped value.

I have already seen this metric inverted in real organisations. A team I worked with reported that 80% of their merged code was AI-generated, and used it to justify a larger AI tooling budget. When I looked at their actual delivery metrics, cycle time had increased. Defect escape rate had doubled. The team was generating more code, merging more PRs, and producing more bugs, all of which were now arriving faster. The percentage was up. The product was getting worse.

What's actually changing in the SDLC

The interesting shift over the last year is not how much code is being written. It is how engineering work itself is being restructured. Coding agents are taking over routine implementation, but they are also moving into requirements, design, code review, testing, migration, deployment, and incident response. The whole software development lifecycle is being rewired, not just the typing.

OpenAI's enterprise update said Codex hit 3 million weekly active users and that customers including GitHub, Nextdoor, Notion, and Wonderful are building multi-agent systems that execute engineering work end-to-end. Microsoft reported roughly 140,000 organisations using GitHub Copilot, with enterprise subscribers nearly tripling year on year. Anthropic positioned Claude Code for Enterprise as a coding agent that writes, debugs, refactors, creates tests, opens PRs, and works across terminal, IDE, Slack, and web, with enterprise controls including permissions, OpenTelemetry monitoring, token visibility, SSO, SCIM, and audit. Salesforce went further, exposing 60+ MCP tools and 30+ coding skills through Headless 360, claiming up to 40% cycle time reduction through their DevOps Center MCP.

If you read those announcements carefully, the language has shifted. Vendors are no longer selling "AI that writes code faster." They are selling auditable agents that execute engineering work with controls. The metric the vendors themselves are now optimising for is not "lines generated." It is "engineering work completed safely."

That is the metric the board should be asking for.

The five questions to bring to your next board meeting

If you sit on a board, run a PE-backed business, or chair an audit committee where engineering reports up, these are the five questions to ask your CTO about AI in engineering. They cost nothing to ask. They reveal a great deal.

What is your release frequency, and how has it changed? How often does code reach production? Daily, weekly, monthly? Has it accelerated since AI tools were rolled out, or stayed flat? The companies that get real value from AI in engineering ship more often, not just produce more code.

What is your defect escape rate? What percentage of bugs reach production before being caught? If this number is rising in step with AI-generated code, you are paying for velocity in customer pain. This is the number to watch most carefully.

What is your PR review latency? When code is written, how long does it sit waiting for a human to review and merge it? AI-generated code can pile up faster than reviewers can handle it. The cycle time of code reaching production depends on review, not on writing.

What is the cost per accepted change? Add up the cost of AI tools, developer time, review time, testing, and incident remediation, then divide by the number of changes that successfully reach production. This is the most honest unit economics you can put on engineering AI. It should be going down. In poorly managed deployments, it goes up.

What is the mean time to recovery? When something breaks, how fast does the team detect it, diagnose it, and recover? AI tools can both help and hurt this. Help, by suggesting fixes faster. Hurt, by introducing changes the team did not write and does not fully understand.

These five questions are not exhaustive. They are the minimum set. None of them includes "what percentage of code is AI-generated," because that number does not appear in any of them.
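To make the five questions concrete, here is a minimal sketch of how the numbers might be computed from a quarter's delivery data. The field names, data shape, and sample figures are illustrative assumptions, not a standard schema; adapt them to whatever your delivery tooling actually exports.

```python
from dataclasses import dataclass

@dataclass
class QuarterlyDeliveryData:
    # Illustrative fields only; map these to your own tooling's exports.
    releases: int                  # deployments to production in the period
    days: int                      # length of the period in days
    bugs_escaped: int              # defects first found in production
    bugs_total: int                # all defects found, pre- and post-release
    review_hours_total: float      # summed PR open-to-merge time, in hours
    prs_merged: int                # PRs merged in the period
    total_cost: float              # AI tools + dev + review + test + incident cost
    changes_shipped: int           # changes that successfully reached production
    recovery_minutes: list[float]  # time to recover from each incident

def board_metrics(d: QuarterlyDeliveryData) -> dict[str, float]:
    """The five board-level numbers, as simple ratios over the period."""
    return {
        "release_frequency_per_week": 7 * d.releases / d.days,
        "defect_escape_rate_pct": 100 * d.bugs_escaped / d.bugs_total,
        "pr_review_latency_hours": d.review_hours_total / d.prs_merged,
        "cost_per_accepted_change": d.total_cost / d.changes_shipped,
        "mttr_minutes": sum(d.recovery_minutes) / len(d.recovery_minutes),
    }

# Worked example with made-up figures for one quarter:
q = QuarterlyDeliveryData(
    releases=26, days=91, bugs_escaped=12, bugs_total=80,
    review_hours_total=1800, prs_merged=450,
    total_cost=600_000, changes_shipped=400,
    recovery_minutes=[30, 45, 120],
)
print(board_metrics(q))
# Two releases a week, 15% escape rate, 4-hour review latency,
# £1,500 per accepted change, 65-minute MTTR.
```

Tracked quarter over quarter, the direction of each ratio matters more than its absolute value; note that "% of code AI-generated" appears nowhere in the calculation.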

The migration metric is the one to lead with

If I were a CTO going to a board with one number to demonstrate engineering value from AI, it would not be code-generation percentage. It would be migration throughput.

Legacy system migration is a real cost in most established businesses. It is also the workload where AI agents are showing the largest, most measurable gains. Google reported that a complex internal migration was completed six times faster with agents and engineers working together. Anthropic's Claude Code Enterprise materials emphasise migration as one of the highest-value workloads. I have seen the same in my own work. Migration is the workload where the new tools produce order-of-magnitude improvements, not single-digit ones.

If your engineering organisation can clear a 2-year migration backlog in 4 months, that is a board-relevant outcome. The percentage of code written by AI in the process is a footnote.

What I'd tell your CTO

If I were sitting across the table from your CTO this quarter, my advice would be:

Stop reporting AI-generated code percentages upward. It will land well in the first board meeting and badly in the third, when someone realises the number is rising and the product is not getting noticeably better.

Pick one engineering value stream and rebuild it with AI in the middle. Migration is the obvious choice. Security remediation is another (Google reported 90%+ mitigation time reduction in some workflows). PR review is a third. Pick one, redesign it, measure throughput and quality.

Make the changes in these metrics visible across the organisation. Cycle time, defect escape rate, review latency, cost per accepted change, MTTR. Put them on a dashboard. Show them at every engineering review. The team will optimise for what gets measured. Right now, most teams are not measuring the right things.

Spend the saved engineering time on the work that humans still do best. Hard problem decomposition, system architecture, code review at the level of taste, mentoring, judgment under uncertainty. The engineers who become valuable in 2027 are the ones who orchestrate well. I wrote about this last month in "The orchestrator, not the 10x engineer".

The board reframe

For boards reading this, the simple test is: when your CTO talks about AI in engineering, do they lead with effort metrics or outcome metrics? If the lead number is "% of code AI-generated," you are getting an activity report. If the lead number is "release frequency up, defect rate down, migration throughput tripled, cost per accepted change halved," you are getting a value report.

The companies winning the engineering productivity race in 2026 are not the ones with the highest AI-generated code percentage. They are the ones whose boards refuse to accept that number as the answer.

If you'd like to talk through what your engineering AI metrics should look like, or how to redesign one engineering value stream around agents, get in touch. I've spent the last decade building engineering organisations that ship, and the last three years building them with AI in the middle.


Related: Has AI made developers slower? The METR study, reframed · The orchestrator, not the 10x engineer · What two hours with Anthropic's agent team taught me about building AI
