Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture by George J. Trujillo Jr., Justin Murray, Rommel Garcia, Steven Jones, Charles Kim

Chapter 6. Hadoop SQL Engines

Data is the new oil. No: Data is the new soil.

—David McCandless

One of the biggest decisions in the design of a Hadoop ecosystem is selecting the SQL engines that fit your use cases. For each type of application and project, you have to ask: should we use Hive on Tez, Impala, Spark SQL, Phoenix for HBase, or something else? The decision gets harder as each new release adds functionality that overlaps with other SQL engines. In this chapter, we discuss Hadoop SQL engines and two of the primary tools that use these engines, Hive and Pig.

Where SQL Was Born

In the early days of computing, everything was file based and only geeks could parse and process such data. With RDBMSs, SQL became the universal language of data ...
