Summary

After working through this chapter, we can now explain why and when to use big data instead of a traditional relational database. We also understand the differences between batch processing, real-time processing, and stream processing. We went back in time and traced the history from databases and data warehouses to big data, and introduced some big data terms along the way. We also became familiar with the Hadoop ecosystem, especially Hive, including the Hive architecture and the advantages of using Hive. In the next chapter, we will practice setting up Hive and all the tools needed to get started using Hive from the command line.
