September 2018
Intermediate to advanced
412 pages
11h 12m
English
Spark Streaming is a micro-batch engine: it collects incoming data into small batches and processes each batch as a unit. In the following example, it is configured with a batch interval of one second. If data points arrive every 100 ms, roughly 10 points accumulate in each batch by the time it is processed.
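To make the batching arithmetic concrete, here is a minimal sketch in plain Python (no Spark) of how events are grouped into fixed one-second windows; the function name and structure are illustrative, not part of the Spark API:

```python
# Sketch of micro-batching: group event timestamps (in ms) into
# consecutive fixed-length batch windows.

def micro_batches(timestamps_ms, batch_interval_ms=1000):
    """Group event timestamps (ms) into consecutive batch windows."""
    batches = {}
    for t in timestamps_ms:
        window = t // batch_interval_ms  # index of the window this event falls in
        batches.setdefault(window, []).append(t)
    return batches

# Events arriving every 100 ms for one second -> one batch of 10 points
events = list(range(0, 1000, 100))
batches = micro_batches(events)
print(len(batches[0]))  # -> 10
```

With a 100 ms arrival rate and a 1000 ms interval, each window collects ten events, matching the text above.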
Create a streaming context using the Spark API:
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="IIoTSparkStreamingApp")
ssc = StreamingContext(sc, 1)  # batch interval of 1 second
Configure the Kafka broker address and the topic to subscribe to for the input stream:
brokers = "localhost:9092"
topic = "test"
Create a direct stream that pulls the input data from Kafka:
from pyspark.streaming.kafka import KafkaUtils

message = KafkaUtils.createDirectStream(ssc, [topic], {"metadata.broker.list": brokers})
Transform ...
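Each record in the direct stream arrives as a (key, value) pair, so a typical first transform extracts and parses the value. The sketch below simulates one such batch in plain Python; the JSON payload shape and field names are hypothetical, not from the source:

```python
import json

# A hypothetical micro-batch of Kafka records as (key, value) pairs,
# the same shape createDirectStream produces.
sample_batch = [
    (None, '{"device": "pump-1", "temp": 71.5}'),
    (None, '{"device": "pump-2", "temp": 69.0}'),
]

# The equivalent DStream transform would be:
#   readings = message.map(lambda kv: json.loads(kv[1]))
readings = [json.loads(value) for _key, value in sample_batch]
print(readings[0]["device"])  # -> pump-1
```

The same `map` over the DStream runs once per batch interval, producing a stream of parsed dictionaries ready for further transformations.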