Programmable 2026 Presentation
Designed To Fail: Building Resilient Applications
An unarguable truth in software is that, at some point, failure is inevitable. Code breaks, servers crash, the power goes out – but building software that is designed to fail gracefully isn't always front-of-mind.
In this talk, we’ll walk through the process of designing a resilient, real-world system, exploring how to handle failure at every layer. From redundancy and circuit breakers to automated full or partial failover, we’ll cover practical techniques for building robust systems that keep working when things go wrong. As complexity grows, we’ll examine how architectural choices affect failure modes, and how to manage common tradeoffs around latency, data consistency, and degraded functionality.
We'll also explore what it takes to build reliability into team processes and mindset, from identifying critical requirements and dependencies to designing with user expectations in mind. Finally, we’ll highlight how observability, via health checks and OpenTelemetry, and automated stress-testing tools help teams make fast, informed decisions under pressure.
You will leave armed with practical tips, tools, and techniques to make sure your systems are designed to fail (gracefully) and, crucially, recover!