Chapter 1. Architecture and Data Model
Apache Accumulo is a highly scalable, distributed, open source data store modeled after Google’s Bigtable design. Accumulo is built to store up to trillions of data elements and keeps them organized so that users can perform fast lookups. Accumulo supports flexible data schemas and scales horizontally across thousands of machines. Applications built on Accumulo can serve large numbers of users and process many requests per second, making Accumulo an ideal choice for terabyte- to petabyte-scale projects.
Recent Trends
Over the past few decades, several trends have driven the progress of data storage and processing systems. The first is that more data is being produced, at faster rates than ever before. The volume of available data is growing so fast that more data was produced in the past few years than in all previous years combined. In recent years a huge amount of data has been produced by people for human consumption, but this amount is dwarfed by the amount of data produced by machines. Automated systems and devices promise to generate an enormous amount of data in the coming years. Merely storing this data can be a challenge, let alone organizing and processing it.
The second trend is that the cost of storage has dropped dramatically. Hard drives now store multiple terabytes for roughly the same price that drives holding only gigabytes cost a decade ago. Although computer memory is also falling in price, making it possible ...