As the saying goes, you get what you pay for. When it comes to cloud providers, in general, the more you are willing to pay, the more resources you can have at your command. For a small price, you can provision some modest instances with a small amount of storage and use them for proof-of-concept work, small websites, or simple server hosting. On the other hand, if you have money to spend, you can employ the full range of compute and storage offerings from your cloud provider, which enable you to field entire enterprise-scale infrastructures—for a corresponding enterprise-scale price.
Fortunately, Hadoop was designed from the start not to require enterprise hardware, and it can run on a handful of instances, at least to start with. Even in cloud deployments, you do not need the most powerful resources to architect a powerful cluster. You can build a decent cluster at a decent price.
Regardless of the scale of your clusters, there’s no need to waste money. By taking a careful look at the menu of selections for instances and storage, and by building out your network well, you can be sure that you are getting the most bang for your buck.
One of the first decisions to confront when designing a cluster running in the cloud is which instance type or types to use. Some instance types are too underpowered for most cluster roles, while some are overpowered except for very large-scale deployments. Even ...