Sharding is the process of partitioning data across different locations. It lets us split a large dataset across several servers, each holding a smaller subset of the data. This is comparable to a RAID 0 configuration, where, for example, you may need a 10 TB volume and build it by combining two 5 TB disks.
Doing so is a form of horizontal scaling, but it comes at a cost. Each shard holds a unique portion of the data, not a copy, so losing one shard means losing part of the dataset. Sharding adds complexity to your deployment, but it is sometimes unavoidable when the dataset is very large.
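To make the idea concrete, here is a minimal sketch in Python of how records might be assigned to shards. It assumes a simple hash-based placement rule, which is only one of several possible strategies. The shard names and the `shard_for` and `distribute` helpers are hypothetical and are not taken from any particular database.

```python
from __future__ import annotations

import hashlib

# Hypothetical shard names; a real deployment would use server addresses.
SHARDS = ["shard-a", "shard-b"]


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Return the single shard that owns `key`, chosen by hashing the key."""
    # Use a stable hash: the built-in hash() is salted per process for strings,
    # so it would route the same key differently after a restart.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def distribute(
    records: dict[str, str], shards: list[str] = SHARDS
) -> dict[str, dict[str, str]]:
    """Split `records` into one dictionary per shard (no record is duplicated)."""
    placement: dict[str, dict[str, str]] = {name: {} for name in shards}
    for key, value in records.items():
        placement[shard_for(key, shards)][key] = value
    return placement


# Illustrative data: 1,000 made-up user records.
records = {f"user:{i}": f"profile {i}" for i in range(1000)}
placement = distribute(records)

for name, subset in placement.items():
    print(name, len(subset))
```

Every record ends up in exactly one shard, which mirrors the RAID 0 analogy: the shards together form the whole dataset, and none of them is a backup of another. Note that with this naive modulo rule, changing the number of shards reassigns most keys, which is part of the operational complexity that sharding brings.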