The following configuration may need to be added to ${HADOOP_HOME}/etc/hadoop/core-site.xml so that Hue can impersonate the user creating the Parquet file:
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
Once the preceding configuration is added, restart the DFS service with the following command:
${HADOOP_HOME}/sbin/stop-dfs.sh && ${HADOOP_HOME}/sbin/start-dfs.sh
Import the customer records from the database into Hadoop raw storage using a Sqoop import job, which writes the data in Parquet format, with the following command:
${SQOOP_HOME}/bin/sqoop import --connect jdbc:postgresql://<DB_SERVER_ADDRESS>/sourcedb?schema=public ...
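For reference, a complete invocation might look like the following sketch. The table name (customers), target directory, username placeholder, and mapper count are illustrative assumptions, not values from the original example; adjust them to your environment:
${SQOOP_HOME}/bin/sqoop import \
  --connect "jdbc:postgresql://<DB_SERVER_ADDRESS>/sourcedb?schema=public" \
  --username <DB_USER> -P \
  --table customers \
  --as-parquetfile \
  --target-dir /data/raw/customers \
  --num-mappers 1
Here, --as-parquetfile tells Sqoop to write the imported records as Parquet files, and --target-dir controls where they land in HDFS; -P prompts for the database password instead of passing it on the command line.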