We created the index with one replica. This means Elasticsearch makes one copy (a replica) of each shard and places that replica on a different data node from the shard it was copied from. Each shard therefore exists as two copies: the primary shard (the original) and the replica shard (the copy of the primary). During a high volume of search activity, Elasticsearch can serve query results from either the primary shards or the replica shards, which sit on different data nodes. This is how Elasticsearch increases query throughput: different search queries can be handled by different data nodes.
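The placement and routing idea above can be illustrated with a small Python simulation. It is only a sketch under stated assumptions: the node names, the `place_copies` and `route_searches` helpers, and the plain round-robin routing are all hypothetical, and real Elasticsearch uses its own allocation and copy-selection logic. The only real Elasticsearch name here is the `number_of_replicas` index setting.

```python
import itertools
from collections import Counter

# Settings body for an index with one replica, as described in the text.
# "number_of_replicas" is a real index setting; everything else below is a toy model.
index_settings = {"settings": {"index": {"number_of_replicas": 1}}}


def place_copies(nodes, num_primaries, num_replicas):
    """Assign every copy of a shard to a different node (hypothetical placement rule)."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least one node per shard copy")
    placement = {}
    for shard in range(num_primaries):
        copies = []
        for copy in range(num_replicas + 1):
            # Offsetting by the copy number keeps copies of one shard on distinct nodes.
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            copies.append((role, node))
        placement[shard] = copies
    return placement


def route_searches(placement, shard, num_queries):
    """Alternate searches across all copies of a shard and count hits per node."""
    copies = itertools.cycle(placement[shard])
    return Counter(next(copies)[1] for _ in range(num_queries))


nodes = ["node-1", "node-2"]  # assumed two-data-node cluster
num_replicas = index_settings["settings"]["index"]["number_of_replicas"]
placement = place_copies(nodes, num_primaries=1, num_replicas=num_replicas)
load = route_searches(placement, shard=0, num_queries=10)

print(placement[0])  # [('primary', 'node-1'), ('replica', 'node-2')]
print(dict(load))    # {'node-1': 5, 'node-2': 5}
```

With one replica, the ten simulated searches are split evenly between the two nodes, whereas with zero replicas every search for that shard would land on the single node holding the primary.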
In summary, both primary shards and replica shards provide horizontal scalability and throughput. Replication scales out your search volume/throughput ...