Time for action – writing network traffic onto HDFS
This discussion of Flume in a book about Hadoop hasn't actually used Hadoop at all so far. Let's remedy that by writing data onto HDFS via Flume.
- Create the following file as agent4.conf within the Flume working directory:

    agent4.sources = netsource
    agent4.sinks = hdfssink
    agent4.channels = memorychannel

    agent4.sources.netsource.type = netcat
    agent4.sources.netsource.bind = localhost
    agent4.sources.netsource.port = 3000

    agent4.sinks.hdfssink.type = hdfs
    agent4.sinks.hdfssink.hdfs.path = /flume
    agent4.sinks.hdfssink.hdfs.filePrefix = log
    agent4.sinks.hdfssink.hdfs.rollInterval = 0
    agent4.sinks.hdfssink.hdfs.rollCount = 3
    agent4.sinks.hdfssink.hdfs.fileType = DataStream

    agent4.channels.memorychannel.type ...
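The excerpt above is cut off at the channel definition. A minimal completion, assuming the same in-memory channel used by the earlier agents and illustrative capacity values, would look something like this:

    agent4.channels.memorychannel.type = memory
    agent4.channels.memorychannel.capacity = 1000
    agent4.channels.memorychannel.transactionCapacity = 100

    agent4.sources.netsource.channels = memorychannel
    agent4.sinks.hdfssink.channel = memorychannel

With a complete configuration in place, the agent is typically started and exercised along these lines; the exact invocation and paths are a sketch rather than the book's verbatim commands:

    # start the agent defined in agent4.conf
    flume-ng agent --conf conf --conf-file agent4.conf --name agent4

    # in another terminal, send a few lines of text to the netcat source
    telnet localhost 3000

    # then list the files Flume has rolled onto HDFS
    hadoop fs -ls /flume

Because rollCount is set to 3 and rollInterval to 0, the HDFS sink closes the current file and starts a new one after every three events rather than on a timer, which is why several small log-prefixed files appear under /flume.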