Recycling deleted data from trash to HDFS
In this recipe, we are going to see how to recover deleted data from the trash to HDFS.
Getting ready
To perform this recipe, you should already have a running Hadoop cluster.
How to do it...
To recover accidentally deleted data from HDFS, we first need to enable the trash folder, which is not enabled by default in HDFS. This can be achieved by adding the following property to core-site.xml:
<property>
  <name>fs.trash.interval</name>
  <value>120</value>
</property>
Then, restart the HDFS daemons:
/usr/local/hadoop/sbin/stop-dfs.sh
/usr/local/hadoop/sbin/start-dfs.sh
This sets the retention period for deleted files to 120 minutes: a deleted file stays in the trash for that long before HDFS removes it permanently.
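After the restart, it can be worth confirming which value the cluster actually picked up. The following is a minimal sketch, not part of the original recipe: describe_trash_interval is a hypothetical helper name, and it only interprets a value passed to it. A value of 0, which is the default for fs.trash.interval, means the trash is disabled.

```shell
# Hypothetical helper: interpret an fs.trash.interval value (in minutes).
# A value of 0 (the Hadoop default) means the trash feature is disabled.
describe_trash_interval() {
  local minutes="$1"
  case "$minutes" in
    ''|*[!0-9]*)
      # Empty or non-numeric input: report it rather than guess.
      echo "unrecognized value: ${minutes}"
      ;;
    0)
      echo "trash disabled"
      ;;
    *)
      echo "trash enabled: deleted files kept for ${minutes} minutes"
      ;;
  esac
}

# Usage on a cluster node (assumes the hdfs client is on the PATH):
#   describe_trash_interval "$(hdfs getconf -confKey fs.trash.interval)"
```

The hdfs getconf -confKey call in the usage comment reads the value from the client-side configuration, so it should be run on a node that uses the edited core-site.xml.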
Now, let's try to delete a file from HDFS:
hadoop fs -rmr /LICENSE.txt
15/10/30 10:26:26 INFO ...
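With the trash enabled, a deleted file is moved rather than erased. By default, HDFS places it under the deleting user's home directory in .Trash/Current, keeping the original path underneath. The sketch below is an illustration under that assumption, not the recipe's own listing: trash_path and restore_command are hypothetical helper names, and the user name hadoop in the usage comments is likewise assumed.

```shell
# Hypothetical helper: build the trash location of a deleted HDFS path.
# Assumes the default layout: /user/<user>/.Trash/Current/<original path>.
trash_path() {
  local user="$1" original="$2"
  echo "/user/${user}/.Trash/Current${original}"
}

# Hypothetical helper: print (not run) the command that moves the file back.
restore_command() {
  local user="$1" original="$2"
  echo "hadoop fs -mv $(trash_path "$user" "$original") ${original}"
}

# Usage, assuming the file was deleted by a user named "hadoop":
#   hadoop fs -ls "$(trash_path hadoop /LICENSE.txt)"   # confirm it is in the trash
#   restore_command hadoop /LICENSE.txt                 # review the command first
```

Printing the restore command instead of executing it leaves room to check the path before anything is moved. The move has to happen within the retention period set by fs.trash.interval.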