Splitting the Apache log line

We now create a new topology that reads the data from Kafka using the KafkaSpout spout. In this section, we write an ApacheLogSplitter bolt that contains the logic to extract the IP address, status code, referrer, bytes sent, and other fields from an Apache log line. Because this is a new topology, we must first create a new project:

  1. Create a new Maven project with groupId as com.stormadvance and artifactId as logprocessing.
  2. Add the following dependencies in the pom.xml file:
     <dependency>
       <groupId>org.apache.storm</groupId>
       <artifactId>storm-core</artifactId>
       <version>1.0.2</version>
       <scope>provided</scope>
     </dependency>
     <!-- Utilities -->
     <dependency>
       <groupId>commons-collections</groupId>
       <artifactId>commons-collections</artifactId>
       ...
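
To make the splitting step described at the start of this section more concrete, here is a minimal sketch of the kind of parsing an ApacheLogSplitter bolt could perform. It is not the book's implementation: the class name ApacheLogLineParser, the field names in the returned map, and the assumption that log lines follow the Apache Combined Log Format are all illustrative. It deliberately uses only the Java standard library so that the parsing logic can be tried without a Storm cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not from the book): splits one Apache access-log line
// into named fields. Assumes the Combined Log Format:
//   ip ident user [time] "request" status bytes "referrer" "user-agent"
class ApacheLogLineParser {

    private static final Pattern COMBINED_LOG = Pattern.compile(
            "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)"
            + " \"([^\"]*)\" \"([^\"]*)\"$");

    // Returns the extracted fields, or an empty map if the line does not match.
    // The map keys are illustrative names chosen for this sketch.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = COMBINED_LOG.matcher(line.trim());
        if (!m.matches()) {
            return fields;
        }
        fields.put("ip", m.group(1));
        fields.put("dateTime", m.group(4));
        fields.put("request", m.group(5));
        fields.put("status", m.group(6));
        fields.put("bytesSent", m.group(7)); // may be "-" when no body was sent
        fields.put("referrer", m.group(8));
        fields.put("userAgent", m.group(9));
        return fields;
    }

    public static void main(String[] args) {
        String sample = "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "
                + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326 "
                + "\"http://www.example.com/start.html\" "
                + "\"Mozilla/4.08 [en] (Win98; I ;Nav)\"";
        // Print each extracted field on its own line.
        parse(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Inside a bolt, the same parse call would typically run in the execute method on the log line received from the spout, with the extracted values emitted as a tuple. Returning an empty map for lines that do not match keeps malformed input from failing the whole topology, although a real bolt might prefer to log or count such lines.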
