Apache Spark Deep Learning Cookbook

Book description

A solution-based guide to putting your deep learning models into production with the power of Apache Spark

Key Features
  • Discover practical recipes for distributed deep learning with Apache Spark
  • Learn to use libraries such as Keras and TensorFlow
  • Solve problems in order to train your deep learning models on Apache Spark

With deep learning gaining rapid mainstream adoption across modern industries, organizations are looking for ways to unite popular big data tools with highly efficient deep learning libraries. Combining the two helps deep learning models train with greater efficiency and speed.

With the help of the Apache Spark Deep Learning Cookbook, you'll work through specific recipes to generate outcomes for deep learning algorithms without getting bogged down in theory. From setting up Apache Spark for deep learning to implementing different types of neural networks, this book tackles both common and less common problems that arise when performing deep learning in a distributed environment. In addition, you'll get access to deep learning code within Spark that can be reused to solve similar problems or tweaked to solve slightly different ones. You will also learn how to stream and cluster your data with Spark. Once you have come to grips with the basics, you'll explore how to implement and deploy deep learning models, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), in Spark using popular libraries such as TensorFlow and Keras.

By the end of the book, you'll have the expertise to train and deploy efficient deep learning models on Apache Spark.

What you will learn
  • Set up a fully functional Spark environment
  • Understand practical machine learning and deep learning concepts
  • Apply built-in machine learning libraries within Spark
  • Explore libraries that are compatible with TensorFlow and Keras
  • Explore NLP models such as Word2vec and TF-IDF on Spark
  • Organize dataframes for deep learning evaluation
  • Apply training and testing techniques to assess model accuracy
  • Access readily available code that may be reusable
Who this book is for

If you're looking for a practical resource for efficiently implementing distributed deep learning models with Apache Spark, then the Apache Spark Deep Learning Cookbook is for you. Knowledge of core machine learning concepts and a basic understanding of the Apache Spark framework are required to get the best out of this book. Some programming knowledge in Python is a plus.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Apache Spark Deep Learning Cookbook
  3. Packt Upsell
    1. Why subscribe?
    2. PacktPub.com
  4. Foreword
  5. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Conventions used
    4. Sections
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Get in touch
      1. Reviews
  7. Setting Up Spark for Deep Learning Development
    1. Introduction
    2. Downloading an Ubuntu Desktop image
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Installing and configuring Ubuntu with VMware Fusion on macOS
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Installing and configuring Ubuntu with Oracle VirtualBox on Windows
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Installing and configuring Ubuntu Desktop for Google Cloud Platform
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Installing and configuring Spark and prerequisites on Ubuntu Desktop
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Integrating Jupyter notebooks with Spark
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    8. Starting and configuring a Spark cluster
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    9. Stopping a Spark cluster
      1. How to do it...
      2. How it works...
      3. There's more...
  8. Creating a Neural Network in Spark
    1. Introduction
    2. Creating a dataframe in PySpark
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Manipulating columns in a PySpark dataframe
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
    4. Converting a PySpark dataframe to an array
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Visualizing an array in a scatterplot
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Setting up weights and biases for input into the neural network
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Normalizing the input data for the neural network
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    8. Validating array for optimal neural network performance
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    9. Setting up the activation function with sigmoid
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    10. Creating the sigmoid derivative function
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    11. Calculating the cost function in a neural network
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    12. Predicting gender based on height and weight
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    13. Visualizing prediction scores
      1. Getting ready
      2. How to do it...
      3. How it works...
  9. Pain Points of Convolutional Neural Networks
    1. Introduction
    2. Pain Point #1: Importing MNIST images
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Pain Point #2: Visualizing MNIST images
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Pain Point #3: Exporting MNIST images as files
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Pain Point #4: Augmenting MNIST images
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Pain Point #5: Utilizing alternate sources for trained images
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Pain Point #6: Prioritizing high-level libraries for CNNs
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  10. Pain Points of Recurrent Neural Networks
    1. Introduction
    2. Introduction to feedforward networks
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Sequential workings of RNNs
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Pain point #1 – The vanishing gradient problem
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Pain point #2 – The exploding gradient problem
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Sequential working of LSTMs
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  11. Predicting Fire Department Calls with Spark ML
    1. Introduction
    2. Downloading the San Francisco fire department calls dataset
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Identifying the target variable of the logistic regression model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Preparing feature variables for the logistic regression model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Applying the logistic regression model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Evaluating the accuracy of the logistic regression model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  12. Using LSTMs in Generative Networks
    1. Introduction
    2. Downloading novels/books that will be used as input text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Preparing and cleansing data
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    4. Tokenizing sentences
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
    5. Training and saving the LSTM model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Generating similar text using the model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  13. Natural Language Processing with TF-IDF
    1. Introduction
    2. Downloading the therapy bot session text dataset
      1. Getting ready
      2. How it works...
      3. How to do it...
      4. There's more...
    3. Analyzing the therapy bot session dataset
      1. Getting ready
      2. How to do it...
      3. How it works...
    4. Visualizing word counts in the dataset
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    5. Calculating sentiment analysis of text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    6. Removing stop words from the text
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    7. Training the TF-IDF model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    8. Evaluating TF-IDF model performance
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    9. Comparing model performance to a baseline score
      1. How to do it...
      2. How it works...
      3. See also
  14. Real Estate Value Prediction Using XGBoost
    1. Downloading the King County House sales dataset
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    2. Performing exploratory analysis and visualization
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Plotting correlation between price and other features
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Predicting the price of a house
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  15. Predicting Apple Stock Market Cost with LSTM
    1. Downloading stock market data for Apple
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    2. Exploring and visualizing stock market data for Apple
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Preparing stock data for model performance
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Building the LSTM model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    5. Evaluating the model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  16. Face Recognition Using Deep Convolutional Networks
    1. Introduction
    2. Downloading and loading the MIT-CBCL dataset into memory
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Plotting and visualizing images from the directory
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Preprocessing images
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Model building, training, and analysis
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  17. Creating and Visualizing Word Vectors Using Word2Vec
    1. Introduction
    2. Acquiring data
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Importing the necessary libraries
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Preparing the data
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Building and training the model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Visualizing further
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    7. Analyzing further
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  18. Creating a Movie Recommendation Engine with Keras
    1. Introduction
    2. Downloading MovieLens datasets
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Manipulating and merging the MovieLens datasets
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Exploring the MovieLens datasets
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Preparing dataset for the deep learning pipeline
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Applying the deep learning model with Keras
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Evaluating the recommendation engine's accuracy
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  19. Image Classification with TensorFlow on Spark
    1. Introduction
    2. Downloading 30 images each of Messi and Ronaldo
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Configuring PySpark installation with deep learning packages
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Loading images on to PySpark dataframes
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Understanding transfer learning
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Creating a pipeline for image classification training
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    7. Evaluating model performance
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    8. Fine-tuning model parameters
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  20. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Apache Spark Deep Learning Cookbook
  • Author(s): Ahmed Sherif, Amrith Ravindra
  • Release date: July 2018
  • Publisher(s): Packt Publishing
  • ISBN: 9781788474221