Video description
Machine-learning expert Mikio Braun moves budding data scientists into the world of big data with this overview of how to do complex data analysis at scale. You'll learn the general concepts behind machine learning, compare small-scale and large-scale data analysis algorithms, and review the basics of the architectures used in large-scale distributed processing. You'll then explore the use of Spark programming for data flow systems and the many uses of approximation. Braun also outlines the computing costs of evaluation, feature extraction, and model selection in big data analysis. The video closes with a discussion of the relationship between the amount of available data and the complexity of the learning problem.
- Review machine learning concepts such as fitting a model to data
- Learn core concepts behind large-scale algorithms like stochastic gradient descent
- Review the architectures used in Hadoop-based systems and data flow systems
- Explore resilient distributed dataset structures, vectors, and matrices using Spark
- Review Spark’s machine learning libraries and how to run basic machine learning tasks
- Understand the use of approximation in optimization and compressing feature spaces
- Learn what makes data “complex”
Mikio Braun is a data science researcher, a start-up entrepreneur, and the creator and ongoing developer of jblas, the open source library for fast linear algebra in Java. He has a Ph.D. in computer science and works at Zalando.
Table of contents
- Part 1: Introduction
- Introduction to Scalable Machine Learning 00:11:13
- Some Machine Learning Background 00:12:29
- Algorithms for Large Scale Learning 00:20:10
- Part 2: Hadoop And Friends
- Part 3: Programming for Data Flow Systems
- How Programming for Data Flow Differs 00:16:11
- Basic Spark 00:19:13
- Working with Vectors and Matrices in Spark 00:34:53
- A Brief Tour of Spark ML 00:29:40
- Part 4: Beyond Parallelization
- Approximation is the Key 00:15:34
- Part 5: Practical Big Data
- Practical Big Data 00:06:57
- Size vs. Complexity 00:05:06
- Summary 00:02:53
Product information
- Title: Scalable Machine Learning
- Author(s): Mikio Braun
- Release date: December 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491939437