In this talk, Cail will dive into an incident on Octopus Deploy's internal build platform, where thousands of ephemeral test environment pods jammed up the test cluster for seemingly no reason at all. We'll talk about the incident analysis process, the specific contributing factors identified, and the ongoing challenges of providing internal tooling.