May 2017
Beginner to intermediate
596 pages
15h 2m
English
Sqoop provides command-line options for importing only selected columns into HDFS. This can be done either with the --columns option or with Sqoop's free-form query capability. Both variations are shown as follows:
bin/sqoop import --connect jdbc:postgresql://<DB_SERVER_ADDRESS>/<DB_NAME>?schema=<SCHEMA> --table customer --columns id,first_name,last_name -m 1 --username <DB_USER_NAME> --password <DB_PASSWORD> --as-avrodatafile --append
sqoop import --query 'SELECT c.*, a.* FROM customer c JOIN address a ON (c.id = a.id) WHERE $CONDITIONS' -m 1 --target-dir /user/foo/joinresults
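To make the relationship between the two variations more concrete, the following is a minimal sketch of how the column list from the --columns example could instead be written as a free-form query. The variable names and the target directory /user/foo/customer_names are illustrative assumptions, and the connection options are left out, as in the query example above. The script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: express a --columns import as an equivalent free-form query.
# TABLE, COLUMNS and TARGET_DIR are illustrative values (assumptions).
TABLE="customer"
COLUMNS="id,first_name,last_name"
TARGET_DIR="/user/foo/customer_names"

# Single quotes keep the shell from expanding $CONDITIONS; Sqoop itself
# substitutes this token when it generates the query for each mapper.
QUERY='SELECT '"$COLUMNS"' FROM '"$TABLE"' WHERE $CONDITIONS'

# Build the command line as text (dry run; nothing is executed).
# A free-form query import needs --target-dir because there is no
# table name from which Sqoop could derive a default directory.
CMD=$(printf "sqoop import --query '%s' -m 1 --target-dir %s" "$QUERY" "$TARGET_DIR")

echo "$CMD"
```

With a single mapper (-m 1) no split column is required. If the free-form query is run with more than one mapper, Sqoop also expects a --split-by column so that it can divide the work between them.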