Kolton Andrus explores the evolution of chaos engineering and explains why it’s becoming the go-to approach for building resilient systems.
Kolton is an engineer on Netflix’s Edge Platform team, which is responsible for the reliability and performance of the company’s externally facing services. He designed and built Netflix’s next-generation failure injection service: "FIT":http://techblog.netflix.com/2014/10/fit-failure-injection-testing.html, which simplifies ad-hoc and automated failure testing at key inflection points within the Netflix service graph. Prior to Netflix, Kolton was an engineer on Amazon’s Retail Website Availability team, where he had the opportunity to design and build ‘Gremlin’, Amazon’s failure injection service. He also served as an engineer and manager of the Retail Website Latency team, tasked with improving customer-facing performance. He serves as a ‘Call Leader’, directly managing large-scale customer-facing incidents and taking responsibility for the diagnosis, decision making, and resolution of these events, a role he also had the privilege of holding at Amazon.