This chapter continues our discussion of HiveQL, the Hive query language, focusing on the data manipulation language parts that are used to put data into tables and to extract data from tables to the filesystem.
This chapter uses
SELECT ... WHERE
clauses extensively when we discuss populating tables with data queried from
other tables. So, why aren’t we covering
WHERE clauses first, instead of waiting until the next chapter,
Since we just finished discussing how to create tables, we wanted to cover the next obvious topic: how to get data into these tables so you’ll have something to query! We assume you already understand the basics of SQL, so these clauses won’t be new to you. If they are, please refer to Chapter 6 for details.
Since Hive has no row-level insert, update, and delete operations, the only way to put data into an table is to use one of the “bulk” load operations. Or you can just write files in the correct directories by other means.
We saw an example of how to load data into a managed table in Partitioned, Managed Tables, which we repeat here with an
addition, the use of the
This command will first create the directory for the partition, if it doesn’t already exist, then copy the data to it.
If the target table is not partitioned, you ...