The Apache Hadoop Platform
If you are new to Hadoop, you probably have been overwhelmed by the sheer number of different utilities, processes, and configuration files in the Hadoop ecosystem. It’s very easy to lose sight of the core Hadoop platform and its significance as a unique, extremely powerful, and—believe it or not—uncomplicated distributed compute and storage platform. A lot of us tried out Hadoop for the first time in late 2007. Grid computing systems and database systems on commodity hardware were already becoming commonplace.
In those days we hadn’t yet heard the term Big Data, but building systems that could scale across ...

Get Oracle Big Data Handbook now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.