Book description
Combine advanced analytics, including Machine Learning, Deep Learning Neural Networks and Natural Language Processing, with modern scalable technologies such as Apache Spark to derive actionable insights from Big Data in real time.
Key Features
- Make a hands-on start in the fields of Big Data, Distributed Technologies and Machine Learning
- Learn how to design, develop and interpret the results of common Machine Learning algorithms
- Uncover hidden patterns in your data in order to derive real actionable insights and business value
Book Description
Every person and every organization in the world manages data, whether they realize it or not. Data is used to describe the world around us and can be used for almost any purpose, from analyzing consumer habits to fighting disease and serious organized crime. Ultimately, we manage data in order to derive value from it, and many organizations around the world have traditionally invested in technology to help process their data faster and more efficiently.
But we now live in an interconnected world driven by mass data creation and consumption, where data is no longer rows and columns restricted to a spreadsheet but an organic and evolving asset in its own right. With this realization come major challenges for organizations: how do we manage the sheer volume of data being created every second (think not only spreadsheets and databases, but also social media posts, images, videos, music, blogs and so on)? And once we can manage all of this data, how do we derive real value from it?
The focus of Machine Learning with Apache Spark is to help us answer these questions in a hands-on manner. We introduce the latest scalable technologies to help us manage and process big data. We then introduce advanced analytical algorithms applied to real-world use cases in order to uncover patterns, derive actionable insights, and learn from this big data.
What you will learn
- Understand how Spark fits in the context of the big data ecosystem
- Understand how to deploy and configure a local development environment using Apache Spark
- Understand how to design supervised and unsupervised learning models
- Build models for NLP, deep learning, and cognitive services using Spark ML libraries
- Design real-time machine learning pipelines in Apache Spark
- Become familiar with advanced techniques for processing a large volume of data by applying machine learning algorithms
Who this book is for
This book is aimed at Business Analysts, Data Analysts and Data Scientists who want a hands-on start with modern Big Data technologies combined with Advanced Analytics.
Table of contents
- Title Page
- Copyright and Credits
- Dedication
- About Packt
- Contributors
- Preface
- The Big Data Ecosystem
- A brief history of data
- Big data ecosystem
- Horizontal scaling
- Distributed systems
- Artificial intelligence and machine learning
- Cloud computing platforms
- Data insights platform
- Summary
- Setting Up a Local Development Environment
- Artificial Intelligence and Machine Learning
- Supervised Learning Using Apache Spark
- Unsupervised Learning Using Apache Spark
- Natural Language Processing Using Apache Spark
- Deep Learning Using Apache Spark
- Real-Time Machine Learning Using Apache Spark
- Other Books You May Enjoy
Product information
- Title: Machine Learning with Apache Spark Quick Start Guide
- Author(s):
- Release date: December 2018
- Publisher(s): Packt Publishing
- ISBN: 9781789346565