The first step in setting up the environment for our big data use case is to establish a Kafka node. Kafka behaves, in essence, like a FIFO queue (ordering is guaranteed within each partition), so we will use the simplest single-node (single-broker) setup. Kafka organizes data using topics, producers, consumers, and brokers.
The important Kafka terminologies are as follows:
- A broker is a single Kafka server; in a single-node setup, broker and node are synonymous.
- A producer is a process that writes data to the message queue.
- A consumer is a process that reads data from the message queue.
- A topic is the named queue that producers write to and consumers read from.
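To make these roles concrete, here is a minimal in-memory sketch of the model described above. This is a toy illustration, not Kafka itself: the `ToyBroker` class and its methods are hypothetical names chosen for this example, standing in for a real broker's produce/consume behavior.

```python
from collections import defaultdict, deque

class ToyBroker:
    """Toy stand-in for a single Kafka broker: each topic is a FIFO queue."""

    def __init__(self):
        # Map each topic name to its own queue of messages.
        self.topics = defaultdict(deque)

    def produce(self, topic, message):
        # A producer appends messages to the tail of the topic's queue.
        self.topics[topic].append(message)

    def consume(self, topic):
        # A consumer removes messages from the head, preserving write order.
        return self.topics[topic].popleft()

broker = ToyBroker()
for event in ["login", "click", "logout"]:
    broker.produce("user-events", event)

print([broker.consume("user-events") for _ in range(3)])
# → ['login', 'click', 'logout']
```

The key property this illustrates is that, within a single queue, messages come out in the same order they went in, which is exactly the per-partition ordering guarantee a real Kafka broker provides.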
A Kafka topic is further subdivided into a number of partitions. We can spread a topic's data across multiple brokers (nodes), both when we write to the topic and also when we read ...