In this recipe, we will look at how to recover deleted files from a Hadoop cluster. What if a user deletes a critical file with the -skipTrash option, which bypasses the trash directory? Can it be recovered?
This recipe is a best-effort approach to restoring files after deletion. When the delete command is executed, the Namenode updates its metadata in the edits file and then fires the invalidate command to remove the blocks. If the cluster is very busy, the invalidation might take time, and we may be able to recover the files. On an idle cluster, however, the Namenode fires the invalidate command immediately in response to the next Datanode heartbeat, and because the Datanode has no pending operations, it deletes the blocks right away.
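To make the timing argument concrete, here is a toy model of the invalidation queue. It is only a sketch and not Hadoop code: it assumes the Namenode keeps a simple first-in, first-out list of blocks to invalidate for a single Datanode and hands out at most a fixed number of them in each heartbeat reply (loosely analogous to the dfs.block.invalidate.limit setting). The function name and its parameters are hypothetical and chosen for illustration.

```python
from collections import deque


def heartbeats_until_invalidated(backlog, new_blocks, limit_per_heartbeat):
    """Count heartbeats until every newly deleted block has been handed out.

    Toy model (an assumption, not the real Namenode logic): blocks waiting
    for invalidation sit in one FIFO queue, and each heartbeat reply carries
    at most `limit_per_heartbeat` of them to the Datanode.
    """
    if limit_per_heartbeat <= 0:
        raise ValueError("limit_per_heartbeat must be positive")

    # Older pending invalidations come first, then the blocks we just deleted.
    queue = deque(["old"] * backlog + ["new"] * new_blocks)

    heartbeats = 0
    while "new" in queue:
        heartbeats += 1
        # One heartbeat reply drains up to the per-heartbeat limit.
        for _ in range(min(limit_per_heartbeat, len(queue))):
            queue.popleft()
    return heartbeats


# Idle cluster: nothing queued ahead of us, so the blocks go out at once.
idle = heartbeats_until_invalidated(backlog=0, new_blocks=10, limit_per_heartbeat=1000)

# Busy cluster: 5000 older blocks are queued ahead of the 10 new ones.
busy = heartbeats_until_invalidated(backlog=5000, new_blocks=10, limit_per_heartbeat=1000)

print(idle, busy)  # prints "1 6"
```

In this model, the idle cluster loses the blocks on the very first heartbeat, while the busy cluster leaves a window of several heartbeats. That window is the only opportunity this recipe has to work with, which is why the recovery is best effort.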