Appendix O. Generating streaming data

This appendix explains how to use the streaming data generator introduced in chapter 10.

O.1 Need for generating streaming data

By definition, streaming data is not something you can find sitting in a static file. It arrives continuously, either as files dropped into a directory or over a network connection. To experiment with streaming, you therefore need to generate the data yourself.

To illustrate streaming data, I built a small generator API, which you need to call in your application.

Lab: Examples from this appendix are available on GitHub at https://github.com/jgperrin/net.jgp.books.spark.ch10.

O.2 A simple stream

The following application generates up to 10 records at a time, at intervals of between 2.5 and 7.5 seconds, for a total of 60 seconds. Each record simulates a person with ...
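
To make the timing described above concrete, here is a minimal sketch in plain Java. It is not the generator API from chapter 10: the class name SimpleStreamSketch, the helper methods, the sample names, the three-field record layout, and the output directory are all assumptions chosen for illustration. It also assumes that "up to 10 records" means between 1 and 10 records per batch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a streaming data generator; not the book's API. */
public class SimpleStreamSketch {
  // Made-up sample values, used only for illustration.
  static final String[] FIRST_NAMES = {"Ada", "Grace", "Linus", "Alan"};
  static final String[] LAST_NAMES = {"Lovelace", "Hopper", "Torvalds", "Turing"};

  /** Builds between 1 and maxRecords CSV lines, each simulating a person. */
  static List<String> nextBatch(Random rnd, int maxRecords) {
    int count = 1 + rnd.nextInt(maxRecords);
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      String first = FIRST_NAMES[rnd.nextInt(FIRST_NAMES.length)];
      String last = LAST_NAMES[rnd.nextInt(LAST_NAMES.length)];
      int age = 18 + rnd.nextInt(70); // assumed field: an age from 18 to 87
      lines.add(first + "," + last + "," + age);
    }
    return lines;
  }

  /** Picks a wait time in milliseconds, from minMs up to (not including) maxMs. */
  static long nextWaitMs(Random rnd, long minMs, long maxMs) {
    return minMs + (long) (rnd.nextDouble() * (maxMs - minMs));
  }

  /** Writes one batch as a new text file in the target directory. */
  static Path writeBatch(Path dir, List<String> lines, int batchNo)
      throws IOException {
    Files.createDirectories(dir);
    Path file = dir.resolve("batch_" + batchNo + ".txt");
    return Files.write(file, lines, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws Exception {
    Random rnd = new Random();
    // Assumed output location; point your streaming reader at the same path.
    Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "streaming", "in");
    long end = System.currentTimeMillis() + 60_000; // run for 60 seconds
    int batchNo = 0;
    while (System.currentTimeMillis() < end) {
      writeBatch(dir, nextBatch(rnd, 10), batchNo++);
      Thread.sleep(nextWaitMs(rnd, 2_500, 7_500)); // 2.5 s to 7.5 s pause
    }
  }
}
```

Running main drops one new file per batch into the directory, which is the "file dropped into a directory" style of streaming mentioned in section O.1. Keeping batch creation, wait time, and file writing in separate methods makes each part easy to change, for example to send records over a network socket instead.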
