O'Reilly logo

Apache Spark for Data Science Cookbook by Padma Priya Chitturi

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Anaconda for cluster management

Anaconda for cluster management provides resource management tools which allow users to easily create, provision and manage bare-metal or cloud-based clusters. It enables the management of conda environments on clusters and provides integration, configuration and setup management of Hadoop services. This can be installed alongside enterprise Hadoop distributions such as Cloudera CDH or Hortonworks HDP and this is used to manage conda packages and environments across a cluster.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Python comes pre-installed. python --version gives the version of Python installed. If the version is 2.6.x, upgrade it to Python 2.7 as ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required