Foreword
Cassandra was open-sourced by Facebook in July 2008. This original version of Cassandra was written primarily by an ex-employee from Amazon and one from Microsoft. It was strongly influenced by Dynamo, Amazon’s pioneering distributed key/value database. Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful “column family” data model.
I became involved in December of that year, when Rackspace asked me to build them a scalable database. This was good timing, because all of today’s important open source scalable databases were available for evaluation. Despite initially having only a single major use case, Cassandra’s underlying architecture was the strongest, and I directed my efforts toward improving the code and building a community.
Cassandra was accepted into the Apache Incubator, and by the time it graduated in March 2010, it had become a true open source success story, with committers from Rackspace, Digg, Twitter, and other companies that wouldn’t have written their own database from scratch, but together built something important.
Today’s Cassandra is much more than the early system that powered (and still powers) Facebook’s inbox search; it has become “the hands-down winner for transaction processing performance,” to quote Tony Bain, with a deserved reputation for reliability and performance at scale.
As Cassandra matured and began attracting more mainstream users, it became clear that there was a need for commercial ...