May 2018
Beginner to intermediate
482 pages
11h 42m
English
EC is a key change in Hadoop 3.x promising a significant improvement in HDFS utilization efficiencies as compared to earlier versions where replication factor of 3 for instance caused immense wastage of precious cluster file system for all kinds of data no matter what the relative importance was to the tasks at hand.
EC can be setup using policies and assigning the policies to directories in HDFS. For this, HDFS provides an ec subcommand to perform administrative commands related to EC:
hdfs ec [generic options] [-setPolicy -path <path> [-policy <policyName>] [-replicate]] [-getPolicy -path <path>] [-unsetPolicy -path <path>] [-listPolicies] [-addPolicies -policyFile <file>] [-listCodecs] [-enablePolicy -policy <policyName>] ...
Read now
Unlock full access