When we break down the requirements of transactional or operational fast data applications, we see four different activities that need to occur in a real-time, event-oriented fashion. As data originates, it is analyzed for context, presented to applications that have business-impacting side effects, and then captured in long-term storage. We describe this flow as ingest, analyze, decide, and export.
You have to be able to scale to the ingest rates of very fast incoming feeds of data—perhaps log data or sensor data, perhaps interaction data that’s being generated by a large SaaS platform or maybe real-time metering data from a smart grid network. You need to be able to process hundreds of thousands or sometimes even millions of events per second in an event-oriented streaming and operational fashion before that data is recorded forever into a big data warehouse for future exploration and analytics.
You might want to check whether the event triggers a policy execution or perhaps qualifies a user for an up-sell or offer campaign. These are all transactions that need to occur against the event feed in real time. In order to make these decisions, you need to be able to combine analytics derived from the big data repository with the context provided by real-time analytics computed over the incoming stream of data.
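To make the idea of a per-event decision more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the `HISTORICAL_SEGMENT` lookup stands in for analytics derived from the big data repository, the `RealTimeContext` class stands in for real-time analytics over the incoming stream, and the window length, threshold, user names, and the `decide` function are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical batch-derived analytics, e.g. customer segments computed
# offline in the data warehouse and loaded into memory.
HISTORICAL_SEGMENT = {"alice": "high_value", "bob": "standard"}

WINDOW_SECONDS = 60    # assumed sliding-window length
UPSELL_THRESHOLD = 3   # assumed number of recent events that triggers an offer


class RealTimeContext:
    """Tracks per-user event timestamps inside a sliding time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = defaultdict(deque)

    def record(self, user, ts):
        """Record one event and return the user's event count in the window."""
        timestamps = self.events[user]
        timestamps.append(ts)
        # Evict timestamps that have fallen out of the window.
        while timestamps and timestamps[0] <= ts - self.window_seconds:
            timestamps.popleft()
        return len(timestamps)


def decide(event, context):
    """Combine stream context with historical analytics for one event."""
    recent = context.record(event["user"], event["ts"])
    segment = HISTORICAL_SEGMENT.get(event["user"], "unknown")
    if segment == "high_value" and recent >= UPSELL_THRESHOLD:
        return "offer_upsell"
    return "no_action"


context = RealTimeContext(WINDOW_SECONDS)
stream = [
    {"user": "alice", "ts": 0},
    {"user": "bob", "ts": 5},
    {"user": "alice", "ts": 10},
    {"user": "bob", "ts": 12},
    {"user": "bob", "ts": 15},
    {"user": "alice", "ts": 20},
    {"user": "alice", "ts": 100},
]
decisions = [decide(event, context) for event in stream]
```

In this sketch the decision is made once per event, as the event arrives, rather than in a later batch pass. A production system would also need the read of the context, the decision, and its side effects to execute as a single transaction, which this single-threaded example does not attempt to show.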
As this data is received, you need to be able to make decisions against it: to support applications that process these events ...