
Pentaho for Big Data Analytics by Feris Thia, Manoj R Patil


Preparing data

Pentaho Data Integration (PDI) is well suited to preparing data because of its rich set of data connectors. We do not discuss PDI further here, since it was covered in the latter part of Chapter 3, Churning Big Data with Pentaho.

Preparing BI Server to work with Hive

Before you proceed to the following examples, complete the steps listed in Appendix B, Hadoop Setup. Note that all the remaining examples assume that the Hortonworks Sandbox VM is configured with the IP address 192.168.1.122.

The following steps will help you prepare BI Server to work with Hive:

  1. Copy the pentaho-hadoop-hive-jdbc-shim-1.3-SNAPSHOT.jar and pentaho-hadoop-shims-api-1.3-SNAPSHOT.jar files into the [BISERVER]/administration-console/jdbc and [BISERVER]/biserver-ce/tomcat/lib ...
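
The copy in step 1 can also be scripted. The following is a minimal Python sketch, not part of the book's procedure. It uses the JAR names and the two target folders exactly as given in step 1. The function name copy_shim_jars and its parameters source_dir and biserver_root are hypothetical, since the text does not say where the JAR files are obtained or where BI Server is installed.

```python
import shutil
from pathlib import Path

# Shim JAR names as given in step 1.
SHIM_JARS = [
    "pentaho-hadoop-hive-jdbc-shim-1.3-SNAPSHOT.jar",
    "pentaho-hadoop-shims-api-1.3-SNAPSHOT.jar",
]

# Target folders from step 1, relative to the BI Server root ([BISERVER]).
TARGET_DIRS = [
    "administration-console/jdbc",
    "biserver-ce/tomcat/lib",
]


def copy_shim_jars(source_dir, biserver_root):
    """Copy each shim JAR from source_dir into both target folders.

    source_dir and biserver_root are assumptions made for this sketch;
    adjust them to match your own installation.
    Returns the list of destination files that were written.
    """
    source_dir = Path(source_dir)
    biserver_root = Path(biserver_root)
    copied = []
    for rel in TARGET_DIRS:
        target = biserver_root / rel
        # Fail early instead of silently creating an unexpected folder.
        if not target.is_dir():
            raise FileNotFoundError(f"Target folder not found: {target}")
        for jar in SHIM_JARS:
            src = source_dir / jar
            if not src.is_file():
                raise FileNotFoundError(f"Shim JAR not found: {src}")
            # copy2 preserves file metadata and returns the new file path.
            copied.append(Path(shutil.copy2(src, target)))
    return copied
```

A call such as copy_shim_jars("/path/to/shim-jars", "/path/to/biserver") would place both files in both folders, where the two paths are placeholders for your own locations. The sketch raises an error when a folder or JAR is missing rather than creating it, so a mistyped BI Server path is caught before anything is copied.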
