Linux Shell Scripting Cookbook by Sarath Lakshman

Finding and deleting duplicate files

Duplicate files are copies of the same file. In some circumstances, we need to remove the duplicates and keep a single copy. Identifying duplicate files by their content is an interesting task that can be done with a combination of shell utilities. This recipe deals with finding duplicate files and performing operations based on the result.

Getting ready

Duplicate files are files with different names but the same data. We can identify them by comparing file content. A checksum is calculated from a file's contents, and files with exactly the same content produce the same checksum value, so we can use checksums to find and remove duplicate files.
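The checksum idea can be sketched roughly as follows. This is a minimal illustration, not necessarily the recipe's own script: the function name `find_duplicates` is hypothetical, `md5sum` is an assumed choice of checksum tool, and the `uniq` options used are GNU coreutils extensions.

```shell
#!/bin/bash
# Hypothetical helper (not from the book): print groups of files under a
# directory that share the same MD5 checksum.
# Assumptions: GNU coreutils (md5sum, uniq -w / --all-repeated) and file
# names that contain no newlines or backslashes.
find_duplicates() {
    local dir="${1:-.}"
    # md5sum prints "<32-hex-digest>  <path>" for each file.
    # sort puts identical digests on adjacent lines.
    # uniq -w 32 compares only the digest; --all-repeated=separate prints
    # every line whose digest occurs more than once, one blank line
    # between groups.
    find "$dir" -type f -exec md5sum {} + \
        | sort \
        | uniq -w 32 --all-repeated=separate
}
```

For example, `find_duplicates ~/Downloads` would print each group of same-checksum files, with groups separated by a blank line. Matching checksums strongly suggest identical content; a byte-by-byte comparison with `cmp` can confirm it before anything is deleted.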

How to do it... ...
