Context
Improbable is a UK-based technology company with a £3B+ valuation, known for building simulation and synthetic environment platforms used by NATO, the UK Ministry of Defence, and the Royal Navy. The engineering challenges at this scale (performance, reliability, security, multi-jurisdiction deployment) are among the hardest in the commercial technology sector.
Working in that environment as a platform engineer shaped the standards I have brought to every technology engagement since.
The engineering environment
Building for defence and national security clients is a different discipline from building for commercial SaaS:
Performance is not negotiable. Simulation platforms have to ingest geospatial data, run physics computations, and resolve agent interactions at real-time speeds across distributed systems. The architecture has to be correct from the start. There is no "fix it later when it becomes a bottleneck."
Reliability has a different meaning at this scale. When a system is used for operational planning, training, and live mission support across 30+ countries, the concept of "acceptable downtime" changes fundamentally. This shaped how I think about infrastructure design, observability, and incident response.
Security is a first-class architectural concern. Not a layer added on top, not a compliance checklist. Security considerations shape every architectural decision: data flows, service boundaries, authentication models, and audit logging (sketched briefly below). Working in a classified-adjacent environment makes that instinct permanent.
International deployment is its own engineering problem. A platform that operates across NATO member states has to handle regulatory variation, latency across geographic regions, data residency requirements, and the practical complexity of supporting teams in dozens of different operating contexts.
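To make the audit-logging point concrete, the sketch below shows the shape of "security designed in": a sensitive operation cannot be reached except through a wrapper that records who did what and whether it succeeded. It is a minimal illustration using only Python's standard library; the function names, record fields, and example dataset are assumptions for illustration, not the platform's actual code.

```python
import functools
import json
import logging
import time
import uuid

audit_log = logging.getLogger("audit")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def audited(action: str):
    """Decorator: every call to a sensitive operation emits a structured audit record."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, principal: str, **kwargs):
            record = {
                "audit_id": str(uuid.uuid4()),
                "action": action,
                "principal": principal,   # who performed the operation
                "timestamp": time.time(),
            }
            try:
                result = fn(*args, principal=principal, **kwargs)
                record["outcome"] = "success"
                return result
            except Exception as exc:
                record["outcome"] = f"error: {exc}"
                raise
            finally:
                # The record is written whether the call succeeds or fails.
                audit_log.info(json.dumps(record))
        return wrapper
    return decorator

@audited("dataset.read")
def read_dataset(dataset_id: str, *, principal: str) -> str:
    # Hypothetical data-access call; the point is it cannot be reached un-audited.
    return f"contents of {dataset_id}"

if __name__ == "__main__":
    read_dataset("exercise-telemetry-2024", principal="analyst@example.org")
```

The point is structural: the audit record is produced by the boundary the call passes through, not by a log line each feature team has to remember to add.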
What was built
Platform engineering at Improbable meant building the foundation that application teams built on top of: service infrastructure, data pipelines, deployment tooling, and the reliability engineering that kept it all running at scale. The specifics of that work have to stay general here, but the engineering principles the environment instilled are concrete and transferable.
What this teaches
Architecture decisions at day 1 determine what's possible at day 1,000. The organisations that get this right build something defensible. The ones that don't get it right spend the next several years trying to retrofit quality into a system that was never designed for it.
Observability is not a feature. It is a prerequisite for production. You cannot manage a system you cannot see. In the complex, distributed systems that most serious AI products become, the ability to understand what the system is doing, why, and where it's failing is what separates a working production system from an expensive proof of concept.
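As a small illustration of what built-in observability looks like at the code level, the sketch below attaches a correlation ID to every log line a request produces, so a single failing request can be followed across components. It is a deliberately simplified example using only Python's standard library; a production system would typically use a full tracing stack, and the component names and messages here are assumptions.

```python
import contextvars
import json
import logging
import uuid

# Correlation ID shared by everything that runs on behalf of one request.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar("request_id", default="-")

class StructuredFormatter(logging.Formatter):
    """Emit one JSON object per log line, always carrying the correlation ID."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "request_id": request_id.get(),
            "component": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(StructuredFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

retrieval_log = logging.getLogger("retrieval")
model_log = logging.getLogger("model")

def handle_request(query: str) -> str:
    request_id.set(str(uuid.uuid4()))  # one ID for the whole request path
    retrieval_log.info("fetched candidate documents")
    model_log.info("generation finished")
    return "answer"

if __name__ == "__main__":
    handle_request("example query")
```

Every line a request produces carries the same request_id, which is what keeps "where is it failing" answerable once the work is spread across services.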
The MLOps problem is real and underestimated. Getting a model to perform well in evaluation is one challenge. Getting it to perform consistently in production, across different input distributions, at scale, and with an audit trail, is an engineering challenge of a different order. Most AI projects underinvest here and pay for it later.
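One concrete slice of that problem is checking whether production inputs still resemble the data the model was evaluated on. The sketch below computes a population stability index for a single numeric feature; the bin count, the 0.2 threshold, and the synthetic data are illustrative assumptions rather than recommendations.

```python
import math
import random

def psi(reference: list[float], production: list[float], bins: int = 10) -> float:
    """Population stability index between a reference sample and a production sample.

    Bins are derived from the reference range; a small epsilon avoids log(0)
    when a bin is empty.
    """
    lo, hi = min(reference), max(reference)
    eps = 1e-6

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = 0 if hi == lo else int((v - lo) / (hi - lo) * bins)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    p_ref = proportions(reference)
    p_prod = proportions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(p_ref, p_prod))

if __name__ == "__main__":
    random.seed(0)
    reference = [random.gauss(0.0, 1.0) for _ in range(5000)]   # inputs the model was evaluated on
    production = [random.gauss(0.4, 1.2) for _ in range(5000)]  # shifted inputs seen in production
    score = psi(reference, production)
    # A common rule of thumb treats PSI above roughly 0.2 as a shift worth investigating.
    print(f"PSI = {score:.3f}", "-> investigate" if score > 0.2 else "-> stable")
```

Drift checks like this, together with logged predictions and versioned models, are the kind of unglamorous plumbing that decides whether evaluation performance survives contact with production.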
Vendor and platform lock-in has national-security-level consequences at scale. The decisions I made at Improbable about infrastructure abstraction and avoiding hard dependencies were informed by the requirement that the platform remain operable regardless of any single vendor's strategic direction. The same principle applies to any organisation choosing AI infrastructure. The landscape is moving fast, and the ability to adapt without rebuilding from scratch is a strategic advantage.
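A minimal sketch of that abstraction principle, assuming a text-generation use case. The interface and class names are hypothetical; the point is that application code depends on a narrow internal contract, and each vendor is wrapped behind it by an adapter.

```python
from typing import Protocol

class TextModel(Protocol):
    """The only surface the rest of the system is allowed to depend on."""
    def complete(self, prompt: str, max_tokens: int) -> str: ...

class LocalStubModel:
    """Stand-in implementation; a vendor adapter would satisfy the same Protocol."""
    def complete(self, prompt: str, max_tokens: int) -> str:
        return f"[stub completion for: {prompt[:40]}]"

def summarise_report(model: TextModel, report: str) -> str:
    # Application code calls the Protocol, never a vendor SDK directly.
    return model.complete(f"Summarise the following report:\n{report}", max_tokens=256)

if __name__ == "__main__":
    print(summarise_report(LocalStubModel(), "Quarterly incident review ..."))
```

Changing providers then means writing one new adapter that satisfies the same contract, not rewriting the application code that sits on top of it.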
Why defence-grade engineering matters
Most organisations are not building national security infrastructure. But the engineering discipline that comes from working in environments where failure has real consequences, where architecture decisions are made deliberately and their implications are thought through, is exactly what's missing when AI projects fail to deliver.
The 80% of AI deployments that don't deliver material business impact aren't failing because the models are bad. They're failing because the infrastructure, the data architecture, and the production engineering were not built with the same rigour as the model development.
I bring that rigour to every engagement.
Related: Agentic AI in 2026: what actually works in production · Technical Due Diligence