Book description
Acquire practical skills in Big Data Analytics and explore data science with Apache Mahout
In Detail
In the past few years, the generation of data and our capability to store and process it have grown exponentially. There is a need for scalable analytics frameworks and for people with the right skills to extract the information they need from this Big Data. Apache Mahout is one of the first and most prominent Big Data machine learning platforms. It implements machine learning algorithms on top of distributed processing platforms such as Hadoop and Spark.
Starting with the basics of Mahout and machine learning, you will explore prominent algorithms and their implementation in Mahout. You will learn about Mahout's building blocks, address feature extraction, feature reduction, and the curse of dimensionality, and delve into classification use cases with the random forest and Naïve Bayes classifiers as well as item-based and user-based recommendation. You will then work with clustering in Mahout using the K-means algorithm and use Mahout without MapReduce. Finish with a flourish by exploring end-to-end use cases on customer analytics and text analytics to gain practical, real-life know-how of analytics projects.
What You Will Learn
- Configure Mahout on Linux systems and set up the development environment
- Become familiar with the Mahout command line utilities and Java APIs
- Understand the core concepts of machine learning and the classes that implement them
- Integrate Apache Mahout with newer platforms such as Apache Spark
- Solve classification, clustering, and recommendation problems with Mahout
- Explore frequent pattern mining and topic modeling, two important application areas of machine learning
- Understand feature extraction, reduction, and the curse of dimensionality
Table of contents
- Learning Apache Mahout
  - Table of Contents
  - Learning Apache Mahout
  - Credits
  - About the Author
  - About the Reviewers
  - www.PacktPub.com
  - Preface
- 1. Introduction to Mahout
- 2. Core Concepts in Machine Learning
  - Supervised learning
  - Unsupervised learning
  - Recommender system
  - Model efficacy
  - Summary
- 3. Feature Engineering
- 4. Classification with Mahout
- 5. Frequent Pattern Mining and Topic Modeling
- 6. Recommendation with Mahout
- 7. Clustering with Mahout
- 8. New Paradigm in Mahout
- 9. Case Study – Churn Analytics and Customer Segmentation
- 10. Case Study – Text Analytics
- Index
Product information
- Title: Learning Apache Mahout
- Author(s):
- Release date: March 2015
- Publisher(s): Packt Publishing
- ISBN: 9781783555215