ZooKeeper was designed not only to be a great building block for developers, but also to be friendly for operations people. As distributed systems get bigger, managing operations becomes harder and robust administration practices become more important. Our vision was that ZooKeeper would be a standard distributed system component that an operations team could learn and manage well. We have seen from previous examples that a ZooKeeper server is easy to start up, but there are many knobs and dials to keep in mind when running a ZooKeeper service. Our goal in this chapter is to get you familiar and comfortable with the management tools available for running ZooKeeper.
For a ZooKeeper service to function correctly, it must be configured correctly. The distributed computing foundation upon which ZooKeeper is based works only when the required operating conditions are met. For example, all ZooKeeper voting servers must have the same configuration. In our experience, improper or inconsistent configuration is the primary source of operational problems.
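To make the consistency requirement concrete, here is a minimal sketch of a check an operations team might run before starting an ensemble. It is not a tool shipped with ZooKeeper. It assumes each server's configuration file has already been collected as text, and it treats `tickTime`, `initLimit`, `syncLimit`, and the `server.N` entries as the settings that must match. That choice of keys is an assumption made for illustration, and the function names are hypothetical.

```python
def parse_config(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '='
            settings[key.strip()] = value.strip()
    return settings


def shared_view(settings, shared_keys=("tickTime", "initLimit", "syncLimit")):
    """Keep only the entries assumed to be identical on every voting server.

    Which keys belong here is an assumption of this sketch; adjust the
    tuple to match your own deployment.
    """
    return {
        key: value
        for key, value in settings.items()
        if key in shared_keys or key.startswith("server.")
    }


def find_mismatches(configs):
    """Return {key: {host: value}} for every shared key whose value differs.

    `configs` maps a host label to the text of that host's configuration
    file. A host that lacks the key is reported with the value None.
    """
    views = {host: shared_view(parse_config(text)) for host, text in configs.items()}
    all_keys = set().union(*views.values())
    mismatches = {}
    for key in sorted(all_keys):
        values = {host: view.get(key) for host, view in views.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches
```

An empty result means the compared settings agree on every host. Any key that appears in the result names a setting to reconcile before the servers are started, along with the value each host currently has.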
A simple example of one such problem occurred in the early days of ZooKeeper. A team of early users had written their application around ZooKeeper and tested it thoroughly. Even in those days, ZooKeeper was easy to work with and deploy, so the group pushed their ZooKeeper service and application into production without ever talking to us.
Shortly after the ...