CAP theorem

As discussed previously, distributed data stores allow us to store huge volumes of data while providing the ability to horizontally scale as a single logical unit at all times. Inherent to many distributed data stores are the following features:

  • Consistency refers to the guarantee that every client has the same view of the data. In practice, this means that a read request to any node in the cluster should return the results of the most recent successful write request. Immediate consistency refers to the guarantee that the most recent successful write request should be immediately available to any client.
  • Availability refers to the guarantee that the system responds to every request made by a client, whether that request was successful ...

Get Machine Learning with Apache Spark Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.