July 2017
Intermediate to advanced
796 pages
18h 55m
English
To run Spark jobs on Hadoop, the HDFS data directories and the YARN log directory must exist with the correct ownership and permissions. You can create them and set their owners with the following commands:
$ mkdir -p /var/data/hadoop/hdfs/nn
$ mkdir -p /var/data/hadoop/hdfs/snn
$ mkdir -p /var/data/hadoop/hdfs/dn
$ chown hdfs:hadoop /var/data/hadoop/hdfs -R
$ mkdir -p /var/log/hadoop/yarn
$ chown yarn:hadoop /var/log/hadoop/yarn -R
Next, create a logs directory inside the YARN installation directory, make it group-writable, and set the owner and group of the installation as follows:
$ cd /opt/yarn/hadoop-2.7.3
$ mkdir logs
$ chmod g+w logs
$ chown yarn:hadoop . -R
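Before starting any daemons, it can help to confirm that each directory exists and has the expected owner and group. The following is a minimal sketch, not part of Hadoop itself: check_dir is a hypothetical helper name, and it assumes GNU coreutils stat (the -c option), which is standard on most Linux distributions but differs on BSD and macOS.

```shell
#!/usr/bin/env bash
# Sketch: verify that a directory exists and has the expected owner:group.
# Assumption: GNU coreutils `stat` is available (uses the -c format option).

# check_dir is a hypothetical helper, not a Hadoop or YARN command.
# Usage: check_dir PATH EXPECTED_OWNER:EXPECTED_GROUP
check_dir() {
    local path="$1" expected="$2" actual

    # Fail early if the directory was never created.
    if [ ! -d "$path" ]; then
        echo "MISSING $path"
        return 1
    fi

    # %U is the owner's user name, %G is the owning group's name.
    actual="$(stat -c '%U:%G' "$path")"
    if [ "$actual" != "$expected" ]; then
        echo "WRONG_OWNER $path: $actual (expected $expected)"
        return 1
    fi

    echo "OK $path"
}
```

With the function loaded in your shell, a call such as check_dir /var/data/hadoop/hdfs/nn hdfs:hadoop or check_dir /var/log/hadoop/yarn yarn:hadoop prints one status line per directory and returns a non-zero exit code when the directory is missing or owned by someone else.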