October 2015
Intermediate to advanced
288 pages
7h 35m
English
This chapter covers loading a data file in Cascalog.
It assumes you have Leiningen set up.
Hadoop is a batch processing system, so it must load data before it can process it. This chapter shows how to apply that idea by loading data from a file.
So far we’ve been working with a data structure defined in memory. Now we’ll work with one that is defined in a file.
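As a rough preview of that shift, the sketch below contrasts an in-memory generator with a file-backed one. It is not the chapter's own listing. It assumes a Cascalog 2.x dependency on the classpath, and the namespace name, the sample tuples, the `data/sample.txt` path, and the `print-lines` helper are all hypothetical names chosen for illustration.

```clojure
(ns cascalog-load-file.sketch
  (:require [cascalog.api :refer :all]))

;; In-memory generator: a plain vector of 1-tuples (hypothetical sample data).
(def in-memory-lines
  [["first line"]
   ["second line"]])

;; File-backed generator: a text-line tap over a hypothetical path.
;; Each line of the file becomes one 1-tuple.
(def file-lines
  (hfs-textline "data/sample.txt"))

(defn print-lines
  "Runs a query that writes every ?line produced by the generator to stdout."
  [generator]
  (?<- (stdout) [?line] (generator ?line)))

;; Usage, once a project with Cascalog is set up and data/sample.txt exists:
;; (print-lines in-memory-lines)
;; (print-lines file-lines)
```

The query itself is the same in both calls; only the source of the tuples changes, which is the point of moving from an in-memory structure to a file.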
1. Create a new Leiningen project cascalog-load-file in your projects directory, and change to that directory:
lein new app cascalog-load-file
cd cascalog-load-file
2. Put the following in your project.clj file: ...