Every client is different: different stage, different technology maturity, different organisational culture, different definition of success. But after four companies, dozens of advisory engagements with founders and small to mid-market teams, and production AI systems across insurance, defence, healthcare, and enterprise software, I've developed a thinking structure that works.

This is not a proprietary framework with a trademarked name. It's the distilled approach of someone who has built, shipped, and scaled AI systems, and who knows the difference between what demos well and what works in production.

The three-arc engagement

Every engagement follows three arcs. The length of each varies, but the sequence doesn't.

01 · Diagnostic · Weeks 1–3

Before I build anything, I need to understand what's actually happening, not what the slide deck says is happening.

  • Shadow AI audit. What AI tools are people actually using? Where is data flowing? What's sanctioned, what's unsanctioned, and what's the risk exposure?
  • Process mapping. Which processes are candidates for AI deployment? Where are the decision points? What's the current error rate, throughput, and cost per unit?
  • Data readiness. Is the data infrastructure ready for production AI? Not "do you have data," but can it be accessed, governed, and served to a production system at the reliability level required?
  • Team assessment. Does the organisation have the skills to sustain AI systems after I leave? Where are the gaps?

The diagnostic ends with a clear, honest document. Where you are, what's possible, what I'd prioritise. If the answer is "this isn't ready yet," I'll say so.

02 · Build and deploy · Weeks 3–8

This is where most advisory engagements fail, because most advisors stop at the recommendation. I don't.

  • Decision delegation architecture. Precisely which decisions the AI system makes autonomously, which require human review, and where the escalation boundaries sit. Governance built into the system, not bolted on after.
  • Tight-scope deployment. One process. One set of KPIs. Full governance from day one. The goal is a production system delivering measurable results within weeks, not a pilot that "shows promise" for months.
  • Kill criteria. Before deployment, we define when we'd stop. If the system doesn't hit its KPIs within the agreed timeframe, we kill it and redirect the investment. No sunk-cost persistence.
  • Measurement from day one. Defined KPIs tracked from go-live. Not vanity metrics. Cost per case, resolution rate, time-to-close, revenue impact. Numbers that appear on a P&L.
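
The kill criteria above can be written down as a check that runs against live KPI data. This is a minimal sketch, not a fixed implementation: the class `KpiTarget`, the function `kill_decision`, the KPI names, the thresholds, and the 60-day deadline are all illustrative assumptions that would be agreed with each client before go-live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiTarget:
    """One agreed KPI. Names and thresholds here are hypothetical."""
    name: str
    target: float
    higher_is_better: bool = True

    def met(self, observed: float) -> bool:
        # Some KPIs improve upwards (resolution rate), others downwards (cost).
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target


def kill_decision(targets, observed, days_live, deadline_days):
    """Return (verdict, missed_kpis), where verdict is 'keep', 'continue' or 'kill'.

    'keep'     : every KPI is at or beyond target.
    'continue' : some KPIs are missed but the agreed timeframe has not expired.
    'kill'     : some KPIs are missed and the agreed timeframe is over.
    """
    missed = [t.name for t in targets if not t.met(observed[t.name])]
    if not missed:
        return "keep", missed
    if days_live >= deadline_days:
        return "kill", missed
    return "continue", missed


# Illustrative targets and measurements only.
targets = [
    KpiTarget("resolution_rate", 0.80),
    KpiTarget("cost_per_case", 4.0, higher_is_better=False),
]
observed = {"resolution_rate": 0.72, "cost_per_case": 3.5}

print(kill_decision(targets, observed, days_live=30, deadline_days=60))
print(kill_decision(targets, observed, days_live=60, deadline_days=60))
```

The point of encoding it this way is that the stop condition is fixed before deployment and evaluated mechanically, which removes the room for sunk-cost arguments later.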

03 · Sustain and scale · Ongoing

A production system is only valuable if it stays in production and the organisation can evolve it without depending on me.

  • Knowledge transfer. The team understands the system architecture, the governance framework, and the monitoring approach. They can maintain and extend it.
  • Scaling roadmap. With one system in production and measured results, the 90-day action plan identifies the next two or three processes, prioritised by P&L impact.
  • Governance evolution. The framework grows with the AI deployment. New systems, new risk categories, new regulatory requirements, all incorporated systematically.

Decision delegation architecture

This is the core intellectual framework behind every AI deployment I build. It answers the question most organisations get wrong: what should the AI decide, and what should a human decide?

Most organisations draw this line based on comfort, not analysis. They keep humans in the loop for everything because it feels safer, which means the AI system adds latency without reducing cost, and the productivity gains never materialise.

The decision delegation architecture maps every decision in a process and classifies it into one of three tiers:

Fully automated

The AI decides and acts. Error rate is within acceptable bounds, consequences of errors are manageable, and the volume justifies automation.

High volume · Low error cost

AI-recommended, human-approved

The AI proposes, a human confirms. For decisions where the error cost is high but the AI's recommendation quality is strong enough to save significant analysis time.

Mixed risk · Speed and oversight

Human-only

The AI provides context but the human decides. For decisions requiring judgement, empathy, or contextual knowledge the system can't access.

High-stakes · Judgement-led

The boundaries between these tiers are defined by data: error rates, cost of errors, volume, and regulatory requirements. Not by instinct. And they evolve as the system improves and the organisation builds confidence.
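
To make the data-driven boundary concrete, here is a minimal sketch of how a single decision could be assigned to a tier. Everything in it is an assumption for illustration: the names `Decision`, `Thresholds`, and `classify`, the specific threshold values, and the rule of using expected monthly error cost (volume × error rate × cost per error) as the automation test. In a real engagement the thresholds come out of the diagnostic and are revisited as the system improves.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FULLY_AUTOMATED = "fully automated"
    AI_RECOMMENDED = "AI-recommended, human-approved"
    HUMAN_ONLY = "human-only"


@dataclass(frozen=True)
class Decision:
    name: str
    monthly_volume: int
    error_rate: float                 # measured fraction of wrong AI outputs, 0..1
    cost_per_error: float             # money lost per wrong decision
    regulator_requires_human: bool = False
    needs_judgement: bool = False     # empathy or context the system can't access


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; each client would set their own.
    max_monthly_error_cost: float = 1_000.0
    min_volume_for_automation: int = 500
    max_error_rate_for_recommendation: float = 0.10


def classify(d: Decision, t: Thresholds = Thresholds()) -> Tier:
    """Assign a decision to one of the three tiers from measured data."""
    if d.needs_judgement:
        return Tier.HUMAN_ONLY

    expected_monthly_error_cost = d.monthly_volume * d.error_rate * d.cost_per_error
    can_automate = (
        not d.regulator_requires_human
        and d.monthly_volume >= t.min_volume_for_automation
        and expected_monthly_error_cost <= t.max_monthly_error_cost
    )
    if can_automate:
        return Tier.FULLY_AUTOMATED

    # Not safe or not worthwhile to automate, but the recommendation
    # may still be good enough to save a human reviewer analysis time.
    if d.error_rate <= t.max_error_rate_for_recommendation:
        return Tier.AI_RECOMMENDED
    return Tier.HUMAN_ONLY


# Illustrative decisions from a hypothetical claims process.
address_change = Decision("address change", 10_000, 0.01, 5.0)
claim_payout = Decision("claim payout", 800, 0.03, 2_000.0)
complaint = Decision("vulnerable-customer complaint", 50, 0.02, 100.0, needs_judgement=True)

for d in (address_change, claim_payout, complaint):
    print(d.name, "->", classify(d).value)
```

Because the tier is a function of measured inputs, moving a decision from human-approved to fully automated becomes a matter of showing that the error rate or error cost has changed, not of winning an argument about comfort.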

The five factors that decide success

The three arcs are how an engagement runs. But there is a layer beneath all of it, more foundational than any process, methodology, or SDLC, that decides how far the work can take you. After enough projects I am convinced the real drivers of success in a technology business are not technical at all. They sit a couple of levels above execution.

People

The calibre of your people, their skills, values, and attitudes, sets the ceiling on what you can achieve. Get the best you can, and get the most from the ones you already have.

Leadership

The force that lets people perform at their best, or quietly stops them. It sets direction, builds the culture, and makes the hard calls. It is the one factor that can turn a sinking business around.

Culture

Real and tangible, not soft. The shared values and norms that tell everyone how to cooperate, take risks, and share ideas. Slow to build, easy to damage, worth protecting fiercely.

Problem-solving

How you make consequential decisions under real uncertainty, and tell the fast, reversible ones from the slow, irreversible ones. It shapes every other choice you make.

Creativity

The human spark behind a genuinely new idea. As AI generates and remixes options endlessly, the scarce skill shifts from producing ideas to discerning which ones matter and framing the right problem.

These are not separate levers. They work in concert: strengthen one and the others gain, weaken one and you weaken all. There are no magic bullets, and process fixes only ever treat the symptoms. If that sounds daunting, start small: improve the factor that looks most tractable, and let the gains compound.

Building in the agentic era

The way software gets built has changed faster in the last year than in the previous ten. By mid-2026, agentic coding tools (Claude Code, OpenAI's Codex, Cursor, GitHub's coding agent) are mainstream. Industry surveys put weekly AI use among engineers at roughly 95%, with more than half now handing real work to autonomous agents, and most teams running several tools at once. I build with these every day. But the hype obscures what actually matters, so here is the honest read I bring to client work.

The bottleneck moved, from typing to deciding and verifying. Generation is cheap now. The scarce, valuable work is defining intent, setting boundaries, reviewing, and proving the system did what you wanted, not just what you literally asked for. The engineer's job is shifting from author to orchestrator and reviewer.

  • The engineer's job. From writing the code to defining intent, setting boundaries, and reviewing output.
  • The question that matters. From "Did it do what I said?" to "Did it do what I actually wanted?"
  • Estimation. From human effort in story points to spec complexity, with agents as parallel workstreams.
  • Cycle length. From sprints measured in weeks to cycles in hours, on governed rails.

The new lifecycle models are variations on one theme. AI-DLC, spec-driven development, and "agentic delivery" all front-load structured intent and wrap non-deterministic agents in governance. The useful part, writing a clear specification before you let an agent run, is real, and I use it. The overreach, treating the spec as the only artefact a human ever touches, is unproven and looks a lot like waterfall in a new coat. I stay on the proven side of that line.

The heavy process becomes the bottleneck. When working software can appear in minutes, two-week sprints, effort-based estimation, and a calendar full of ceremonies are slower than the engineering they are meant to coordinate. So I strip back to the adaptive core of agile and run delivery a particular way: the people who make the decisions in the same room every day, sharing context and deciding on the spot, so the team behaves like one fast, cross-functional organism rather than a relay of handoffs. Delivery is daily. The longest planning horizon is a week. We course-correct every day, on the foundations the section above demands, not instead of them.

Speed exposes weakness. The most credible evidence, Google's DORA research, is blunt: AI lifts delivery throughput but damages stability unless you already have strong testing, version control, and platform quality. AI amplifies what is already there. A team on weak foundations simply ships its mistakes faster. This is why the control layer (scoped permissions, evaluations, an audit trail, and rollback) comes first. It is the decision delegation architecture above, applied to coding agents. And agents need bounded ownership for the same reason a stream-aligned team does: point one across fuzzy boundaries and it will blur them further.
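
As a rough illustration of what that control layer does, the sketch below gates an agent's proposed file changes through the four controls in order: scope check, evaluation, audit entry, rollback on failure. It is a toy model under stated assumptions: the names `AgentScope` and `ControlLayer`, the in-memory `files` dictionary standing in for a repository, and the `evaluate` callback standing in for a test suite are all hypothetical, and real agent tools expose their own permission and sandboxing mechanisms.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class AgentScope:
    """Bounded ownership: the path patterns one agent may modify."""
    agent: str
    writable_globs: tuple


@dataclass
class ControlLayer:
    scope: AgentScope
    audit_log: list = field(default_factory=list)

    def allowed(self, path: str) -> bool:
        return any(fnmatchcase(path, g) for g in self.scope.writable_globs)

    def apply(self, files: dict, changes: dict, evaluate) -> bool:
        """Apply `changes` (path -> new content) to `files` if permitted.

        Rejects anything outside the agent's scope, runs `evaluate` on the
        result, and restores the previous state if the evaluation fails.
        Every outcome is recorded in the audit log.
        """
        out_of_scope = [p for p in changes if not self.allowed(p)]
        if out_of_scope:
            self.audit_log.append((self.scope.agent, "rejected", tuple(out_of_scope)))
            return False

        snapshot = dict(files)          # cheap rollback point for this toy model
        files.update(changes)
        if evaluate(files):
            self.audit_log.append((self.scope.agent, "applied", tuple(changes)))
            return True

        files.clear()
        files.update(snapshot)          # roll back to the pre-change state
        self.audit_log.append((self.scope.agent, "rolled_back", tuple(changes)))
        return False


# Illustrative run with a hypothetical agent that owns only billing/.
files = {"billing/invoice.py": "v1", "auth/login.py": "v1"}
layer = ControlLayer(AgentScope("billing-agent", ("billing/*",)))

r1 = layer.apply(files, {"auth/login.py": "v2"}, evaluate=lambda f: True)
r2 = layer.apply(files, {"billing/invoice.py": "broken"}, evaluate=lambda f: False)
r3 = layer.apply(files, {"billing/invoice.py": "v2"}, evaluate=lambda f: True)
print(r1, r2, r3, [entry[1] for entry in layer.audit_log])
```

The ordering is the design choice that matters: the scope check runs before anything is written, and the evaluation runs before anything is kept, so a fast agent cannot outrun the controls.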

Smaller and faster, with one honest caveat. Work that needed a specialist team in 2023 can often be done by a small, AI-augmented team in 2026, and that leverage is real. But the caveat, from Gartner, matters: most organisations cutting headcount on the strength of AI are not yet seeing the return. "Agentic era" is too often cover for a layoff. Smaller and faster only pays when the foundations are in place.

And that is the thread back to everything above. The faster the mechanical work is automated, the more the outcome is decided by the human factors: judgement, the right problem, the right people, and the leadership to point them well. The tools will keep changing every few months. The layer underneath does not, and that is the layer I help clients get right.

What this is not

  • It's not a template. Two clients with the same industry and similar size will get different approaches, because their data maturity, team capability, and strategic priorities are different.
  • It's not a maturity model. I'm not going to assess you against a 5-level framework and sell you a roadmap to level 5. I'm going to build something that works and measure whether it delivers.
  • It's not vendor-neutral consulting. It's implementation. I work alongside your team, in your systems, with your data. The deliverable is a working system, not a recommendation to build one.

If you want to see how this would apply to your situation, let's talk. A 30-minute conversation is usually enough to tell whether there's a fit and what the first 90 days would look like.
