Big Data Simplified

by Sayan Goswami, Amit Kumar Das, Sourabh Mukherjee
June 2019
Content level: Beginner to intermediate
360 pages
10h 55m
English
Pearson Education India
Content preview from Big Data Simplified
Introducing Spark andKafka | 139
Let us take a look at the competition. On one side, you have MATLAB and R, which have the
benefit of being fairly easy to use but are less scalable. On the other side, there are Mahout
and GraphLab, which are more scalable but harder to use.
ML pipelines were officially introduced into the Spark package as an attempt to simplify
machine learning by following its typical flow: loading data, extracting features, training
a model and testing the trained model. Throughout the pipeline, a standard interface allows
tuning, testing and early failure detection.
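The flow described above (load data, extract features, train, test, all behind one interface) can be sketched in plain Python. This is only a minimal illustration of the idea and not Spark's own API: the classes WordSplitter, WordCountClassifier and SimplePipeline, and the four training messages, are hypothetical and made up for this example.

```python
import math
from collections import Counter

# All classes below are hypothetical stand-ins, not part of Spark.

class WordSplitter:
    """Feature extraction stage: adds a 'words' list to every row."""
    def transform(self, rows):
        return [dict(row, words=row["text"].lower().split()) for row in rows]

class WordCountModel:
    """Trained model: labels rows with a smoothed word-count score."""
    def __init__(self, word_counts, label_counts):
        self.word_counts = word_counts
        self.label_counts = label_counts
        self.vocab_size = len({w for c in word_counts.values() for w in c})

    def _score(self, label, words):
        counts = self.word_counts[label]
        total = sum(counts.values()) + self.vocab_size
        score = math.log(self.label_counts[label] / sum(self.label_counts.values()))
        for word in words:
            score += math.log((counts[word] + 1) / total)  # add-one smoothing
        return score

    def transform(self, rows):
        labels = list(self.label_counts)
        return [dict(row, prediction=max(labels, key=lambda l: self._score(l, row["words"])))
                for row in rows]

class WordCountClassifier:
    """Training stage: fit() counts words per label and returns a model."""
    def fit(self, rows):
        word_counts, label_counts = {}, Counter()
        for row in rows:
            label_counts[row["label"]] += 1
            word_counts.setdefault(row["label"], Counter()).update(row["words"])
        return WordCountModel(word_counts, label_counts)

class SimplePipeline:
    """Chains stages that share one interface: fit() and/or transform()."""
    def __init__(self, stages):
        # Early failure detection: reject a malformed stage before any data is read.
        for stage in stages:
            if not (hasattr(stage, "fit") or hasattr(stage, "transform")):
                raise TypeError(f"{type(stage).__name__} has neither fit nor transform")
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)   # training stage becomes a model
            rows = stage.transform(rows)  # output feeds the next stage
            fitted.append(stage)
        return SimplePipeline(fitted)

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows

# Load data (made-up messages), train, then test on held-out rows.
train = [
    {"text": "win cash prize now", "label": "spam"},
    {"text": "cheap cash offer", "label": "spam"},
    {"text": "meeting at noon", "label": "ham"},
    {"text": "lunch at noon tomorrow", "label": "ham"},
]
test = [
    {"text": "cash prize offer", "label": "spam"},
    {"text": "meeting tomorrow", "label": "ham"},
]

model = SimplePipeline([WordSplitter(), WordCountClassifier()]).fit(train)
predictions = model.transform(test)
accuracy = sum(r["prediction"] == r["label"] for r in predictions) / len(predictions)
print(accuracy)
```

Because every stage exposes the same methods, swapping the feature extractor or the classifier only changes the list passed to SimplePipeline, which is the kind of convenience a standard pipeline interface is meant to provide.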
The ML algorithms help with spam filtering, fraud detection and even recommendation
analysis. An abundance of use cases is also at the ...

Publisher Resources

ISBN: 9789353941505