Putting a data file into HDFS
The previous example showed how PDI interacts with Hive using a SQL-like expression.
Now let's work with HDFS, the Hadoop framework's filesystem, by copying a CSV text file into an HDFS folder. Follow these steps:
- Download a compressed CSV sample file from http://goo.gl/EdJwk5.
- Create a new job from Spoon.
- Add the following steps to the workspace and create a flow between them:
  - From the General grouping, choose START.
  - From the Big Data grouping, choose Hadoop Copy Files.
- Double-click on Hadoop Copy Files. The step's editor dialog will appear.
- Click on the Browse button next to the File/Folder textbox. The Open File dialog appears; choose ...
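For comparison, the same kind of copy can be sketched outside Spoon with the standard `hdfs dfs` command-line client. This is only a rough equivalent under stated assumptions, not what the Hadoop Copy Files step does internally: it assumes the `hdfs` client is installed and configured for your cluster, and the function names, the file name `sample.csv`, and the folder `/user/demo/input` are hypothetical placeholders.

```python
import subprocess


def build_hdfs_copy_commands(local_file, hdfs_dir):
    """Return the hdfs CLI invocations that create hdfs_dir and copy local_file into it.

    Both arguments are placeholders supplied by the caller; nothing here is
    taken from the PDI job itself.
    """
    return [
        # Create the target folder (and any missing parents) if it does not exist.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into the folder, overwriting an existing copy.
        ["hdfs", "dfs", "-put", "-f", local_file, hdfs_dir],
    ]


def copy_to_hdfs(local_file, hdfs_dir):
    """Run the commands in order; raises CalledProcessError if one fails."""
    for command in build_hdfs_copy_commands(local_file, hdfs_dir):
        subprocess.run(command, check=True)
```

Calling `copy_to_hdfs("sample.csv", "/user/demo/input")` on a machine with a working Hadoop client would run both commands. Keeping the command construction in its own function makes it easy to inspect what will be executed before touching the cluster.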