Time for action – using the Distributed Cache to improve location output
Let's now use the Distributed Cache to share a list of U.S. state names and abbreviations across the cluster:
- Create a datafile called states.txt on the local filesystem. It should contain one state per line, with the abbreviation and the full name separated by a tab. Alternatively, retrieve the file from this book's homepage. The file should start like the following:
AL	Alabama
AK	Alaska
AZ	Arizona
AR	Arkansas
CA	California
…
- Place the file on HDFS:
$ hadoop fs -put states.txt states.txt
- Copy the previous UFOLocation.java file to UFOLocation2.java and add the following import statements:
import java.io.*;
import java.net.*;
import java.util.*;
import org.apache.hadoop.fs.Path;
...
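The states.txt format described in the first step is simple enough to parse with the java.io and java.util classes imported above. The following is only a minimal sketch of that parsing and is not the book's code: the class name StateNameLookup and its methods load and expand are hypothetical, and the example reads from an in-memory string rather than from a cached file. How the job registers the file with the Distributed Cache and opens it inside a task is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not from the book): turns tab-separated
// "abbreviation<TAB>full name" lines into a lookup map.
class StateNameLookup {

    // Reads every line from the source and keeps the well-formed ones.
    static Map<String, String> load(Reader source) throws IOException {
        Map<String, String> names = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\t");
                // Skip blank or malformed lines instead of failing.
                if (fields.length == 2) {
                    names.put(fields[0].trim(), fields[1].trim());
                }
            }
        }
        return names;
    }

    // Returns the full name when known, otherwise the input unchanged.
    static String expand(Map<String, String> names, String abbreviation) {
        return names.getOrDefault(abbreviation, abbreviation);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the first lines of states.txt.
        String sample = "AL\tAlabama\nAK\tAlaska\nAZ\tArizona\n";
        Map<String, String> names = load(new StringReader(sample));
        System.out.println(expand(names, "AK")); // prints "Alaska"
        System.out.println(expand(names, "ZZ")); // prints "ZZ" (unknown code)
    }
}
```

Returning the abbreviation unchanged for unknown codes is a design choice made for this sketch: it keeps a record with an unrecognized location in the output rather than dropping it.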