March 2019
Beginner to intermediate
182 pages
4h 6m
English
In this section, we will be answering the following questions:
MLlib is the machine learning library that ships with Spark. A recent development lets Spark's data-processing capabilities feed directly into machine learning capabilities that are native to Spark. This means that we can use Spark not only to ingest, collect, and transform data, but also to analyze it and build machine learning models on the PySpark platform, which gives us a more seamless, deployable solution.
Summary statistics are a very simple concept. We are familiar with average, or standard deviation, or the ...