Learning AWS - Second Edition by Amit Shah, Aurobindo Sarkar

Using Amazon EC2 Spot and Auto Scaling

Using spot instances is a great way to save money compared to using on-demand instances. However, be more careful in an SLA-driven environment (where you cannot withstand any failures). In most cases, the odds of a failure are low, and Hadoop itself can handle several node failures, so even if some nodes are taken away the job may still complete. To avoid affecting HDFS, run the spot capacity as task nodes, which do not store data. Even so, failures can happen: you could lose many nodes at once, and Spark may not be able to recompute a DataFrame. Having the logic to just kill the cluster and create a new one on-demand for that one job is more expensive but over a period of time, the savings still make it ...
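The kill-and-recreate logic described above could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `create_cluster`, `run_job`, `terminate_cluster`, and `ClusterLostError` are hypothetical names supplied by the caller, and no real AWS API is invoked here.

```python
from typing import Callable


class ClusterLostError(Exception):
    """Hypothetical error raised by run_job when too many nodes are lost."""


def run_with_on_demand_fallback(
    create_cluster: Callable[[str], str],
    run_job: Callable[[str], str],
    terminate_cluster: Callable[[str], None],
) -> str:
    """Try the job on a spot cluster first; on loss, rerun it on-demand.

    All three callables are assumptions made for this sketch:
    create_cluster(market) returns a cluster id, run_job(cluster_id)
    returns a result or raises ClusterLostError, and
    terminate_cluster(cluster_id) shuts the cluster down.
    """
    cluster_id = create_cluster("spot")
    try:
        return run_job(cluster_id)
    except ClusterLostError:
        # Spot capacity was reclaimed; fall through to the on-demand retry.
        pass
    finally:
        # Always kill the spot cluster, whether the job succeeded or not.
        terminate_cluster(cluster_id)

    # Second attempt for this one job on more expensive on-demand capacity.
    cluster_id = create_cluster("on-demand")
    try:
        return run_job(cluster_id)
    finally:
        terminate_cluster(cluster_id)
```

In a real deployment the three callables would wrap whatever cluster provisioning mechanism is in use. Only the job that hit a failure pays the on-demand price, while every other run keeps the spot discount.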