Pentaho for Big Data Analytics by Feris Thia, Manoj R Patil

Chapter 3. Churning Big Data with Pentaho

This chapter provides a basic understanding of the Big Data ecosystem and walks through an example of analyzing data stored on the Hadoop framework using Pentaho. By the end of this chapter, you will have learned how to translate diverse data sets into meaningful data sets using Hadoop/Hive.

In this chapter we will cover the following topics:

  • Overview of Big Data and Hadoop
  • Hadoop architecture
  • Big Data capabilities of Pentaho Data Integration (PDI)
  • Working with PDI and Hortonworks Data Platform, a Hadoop distribution
  • Loading data from HDFS to Hive using PDI
  • Querying data using Hive's SQL-like language

An overview of Big Data and Hadoop

Today, Big Data (http://en.wikipedia.org/wiki/Big_data) and Hadoop (http://hadoop.apache.org) have almost ...
