Hands-On Deep Learning with Apache Spark

Book description

Speed up the design and implementation of deep learning solutions using Apache Spark

Key Features

  • Explore the world of distributed deep learning with Apache Spark
  • Train neural networks with deep learning libraries such as BigDL and TensorFlow
  • Develop Spark deep learning applications to intelligently handle large and complex datasets

Book Description

Deep learning is a subset of machine learning in which multi-layered neural networks learn from large, complex datasets. Hands-On Deep Learning with Apache Spark addresses both the technical and analytical complexity of deep learning and the speed at which deep learning solutions can be implemented on Apache Spark.

The book starts with the fundamentals of Apache Spark and deep learning. You will set up Spark for deep learning, learn the principles of distributed modeling, and understand different types of neural networks. You will then implement deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, on Spark.

As you progress through the book, you will gain hands-on experience of what it takes to understand the complex datasets you are dealing with. Along the way, you will use popular deep learning frameworks, such as TensorFlow, Deeplearning4j, and Keras, to train your distributed models.

By the end of this book, you'll have gained experience implementing your models for a variety of use cases.

What you will learn

  • Understand the basics of deep learning
  • Set up Apache Spark for deep learning
  • Understand the principles of distributed modeling and different types of neural networks
  • Obtain an understanding of deep learning algorithms
  • Discover textual analysis and deep learning with Spark
  • Use popular deep learning frameworks, such as Deeplearning4j, TensorFlow, and Keras
  • Explore popular deep learning algorithms

Who this book is for

If you are a Scala developer, data scientist, or data analyst who wants to learn how to use Spark for implementing efficient deep learning models, Hands-On Deep Learning with Apache Spark is for you. Knowledge of the core machine learning concepts and some exposure to Spark will be helpful.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On Deep Learning with Apache Spark
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Contributors
    1. About the author
    2. About the reviewer
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Get in touch
      1. Reviews
  6. The Apache Spark Ecosystem
    1. Apache Spark fundamentals
    2. Getting Spark
    3. RDD programming
    4. Spark SQL, Datasets, and DataFrames
    5. Spark Streaming
    6. Cluster mode using different managers
      1. Standalone mode
      2. Mesos cluster mode
      3. YARN cluster mode
        1. Submitting Spark applications on YARN
      4. Kubernetes cluster mode
    7. Summary
  7. Deep Learning Basics
    1. Introducing DL
    2. DNNs overview
      1. CNNs
      2. RNNs
    3. Practical applications of DL
    4. Summary
  8. Extract, Transform, Load
    1. Training data ingestion through Spark
      1. The DeepLearning4j framework
      2. Data ingestion through DataVec and transformation through Spark
    2. Training data ingestion from a database with Spark
      1. Data ingestion from a relational database
      2. Data ingestion from a NoSQL database
    3. Data ingestion from S3
    4. Raw data transformation with Spark
    5. Summary
  9. Streaming
    1. Streaming data with Apache Spark
    2. Streaming data with Kafka and Spark
      1. Apache Kafka
      2. Spark Streaming and Kafka
    3. Streaming data with DL4J and Spark
    4. Summary
  10. Convolutional Neural Networks
    1. Convolutional layers
    2. Pooling layers
    3. Fully connected layers
    4. Weights
    5. GoogleNet Inception V3 model
    6. Hands-on CNN with Spark
    7. Summary
  11. Recurrent Neural Networks
    1. LSTM
      1. Backpropagation Through Time (BPTT)
      2. RNN issues
    2. Use cases
    3. Hands-on RNNs with Spark
      1. RNNs with DL4J
      2. RNNs with DL4J and Spark
      3. Loading multiple CSVs for RNN data pipelines
    4. Summary
  12. Training Neural Networks with Spark
    1. Distributed network training with Spark and DeepLearning4j
      1. CNN distributed training with Spark and DL4J
      2. RNN distributed training with Spark and DL4J
      3. Performance considerations
    2. Hyperparameter optimization
      1. The Arbiter UI
    3. Summary
  13. Monitoring and Debugging Neural Network Training
    1. Monitoring and debugging neural networks during their training phases
      1. The DL4J training UI
      2. The DL4J training UI and Spark
      3. Using visualization to tune a network
    2. Summary
  14. Interpreting Neural Network Output
    1. Evaluation techniques with DL4J
      1. Evaluation for classification
      2. Evaluation for classification – Spark example
    2. Other types of evaluation
    3. Summary
  15. Deploying on a Distributed System
    1. Setup of a distributed environment with DeepLearning4j
      1. Memory management
      2. CPU and GPU setup
      3. Building a job to be submitted to Spark for training
    2. Spark distributed training architecture details
      1. Model parallelism and data parallelism
      2. Parameter averaging
      3. Asynchronous stochastic gradient sharing
    3. Importing Python models into the JVM with DL4J
    4. Alternatives to DL4J for the Scala programming language
      1. BigDL
      2. DeepLearning.scala
    5. Summary
  16. NLP Basics
    1. NLP
      1. Tokenizers
      2. Sentence segmentation
      3. POS tagging
      4. Named entity extraction (NER)
      5. Chunking
      6. Parsing
    2. Hands-on NLP with Spark
      1. Hands-on NLP with Spark and Stanford CoreNLP
      2. Hands-on NLP with Spark NLP
    3. Summary
  17. Textual Analysis and Deep Learning
    1. Hands-on NLP with DL4J
    2. Hands-on NLP with TensorFlow
    3. Hands-on NLP with Keras and a TensorFlow backend
    4. Hands-on NLP with Keras model import into DL4J
    5. Summary
  18. Convolution
    1. Convolution
    2. Object recognition strategies
    3. Convolution applied to image recognition
      1. Keras implementation
      2. DL4J implementation
    4. Summary
  19. Image Classification
    1. Implementing an end-to-end image classification web application
      1. Picking up a proper Keras model
      2. Importing and testing the model in DL4J
      3. Re-training the model in Apache Spark
      4. Implementing the web application
      5. Implementing a web service
    2. Summary
  20. What's Next for Deep Learning?
    1. What to expect next for deep learning and AI
    2. Topics to watch for
    3. Is Spark ready for RL?
    4. DeepLearning4J future support for GANs
    5. Summary
  21. Appendix A: Functional Programming in Scala
    1. Functional programming (FP)
      1. Purity
      2. Recursion
  22. Appendix B: Image Data Preparation for Spark
    1. Image preprocessing
      1. Strategies
      2. Training
  23. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Hands-On Deep Learning with Apache Spark
  • Author(s): Guglielmo Iozzia
  • Release date: January 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781788994613