Pentaho Data Integration (PDI) is well suited to preparing data, thanks to its rich set of data connectors. We will not discuss PDI further here, as it was already covered in the latter part of Chapter 3, Churning Big Data with Pentaho.
Before you proceed to the following examples, complete the steps listed in Appendix B, Hadoop Setup. Note that all the remaining examples assume that the Hortonworks Sandbox VM is configured with the IP address 192.168.1.122.
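Because the remaining examples depend on the Sandbox VM being reachable at that address, a quick connectivity check before continuing can save time. The following is a minimal Python sketch and is not part of the original setup. The function name hive_port_reachable is hypothetical, and port 10000 is assumed here as the usual HiveServer default, so adjust it if your Sandbox is configured differently.

```python
import socket


def hive_port_reachable(host, port=10000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    The default port of 10000 is an assumption (the usual HiveServer
    default); pass a different value if your setup differs.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `hive_port_reachable("192.168.1.122")` returns True only if something accepts TCP connections on that port. It does not confirm that Hive itself is answering queries, so treat it as a first sanity check only.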
The following steps will help you prepare BI Server to work with Hive:
pentaho-hadoop-shims-api-1.3-SNAPSHOT.jar files into the