Chapter 3. Expecting failure: fault tolerance in CoreOS

This chapter covers

  • Monitoring and fault tolerance in CoreOS
  • Getting your first complex service running
  • Application architecture in the context of CoreOS

If you work in infrastructure or operations in any capacity, you understand the importance of monitoring systems. When the alarms go off, it's time to figure out what happened. You might also have taken a crack at automating some of the most common fixes, or mitigated failures with disaster-recovery failover switches, multicasting, or a variety of other ways to react to failure. You probably also understand that technology always finds a way to break. Hardware, software, connectivity, power grid: these ...
