This chapter illustrates a typical web analytics problem, one that is often solved with a Hadoop batch job. Unlike a Hadoop implementation, a Storm-based solution shows results that are refreshed in real time.
Our example has three main components (see Figure 6-1):
A Node.js web application, to test the system
A Redis server, to persist the data
A Storm topology, for real-time distributed data processing
Figure 6-1. Architecture overview
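To make the division of labor concrete, here is a minimal sketch of how the three components could cooperate. It uses plain in-memory Python objects, not the real Node.js, Redis, or Storm APIs. The names `FakeStore`, `web_app_visit`, and `topology_step` are hypothetical, and so is the assumption that the web application hands events to the topology through a queue kept in the data store.

```python
from collections import Counter, deque


class FakeStore:
    """Hypothetical stand-in for the Redis server: pending events plus persisted counts."""

    def __init__(self):
        self.queue = deque()     # events waiting to be processed
        self.counts = Counter()  # per-product view counts (the "persisted" result)


def web_app_visit(store, user, product):
    """Stand-in for the web application: record that a user viewed a product page."""
    store.queue.append((user, product))


def topology_step(store):
    """Stand-in for the topology: consume one event and update the counts.

    Returns False when there is nothing left to process.
    """
    if not store.queue:
        return False
    _user, product = store.queue.popleft()
    store.counts[product] += 1
    return True


store = FakeStore()
for user, product in [("alice", "tv"), ("bob", "tv"), ("alice", "radio")]:
    web_app_visit(store, user, product)
    topology_step(store)
    # The counts are current after every single event, not at the end of a batch.
    print(dict(store.counts))
```

The point of the sketch is the contrast with a batch job: the statistics are updated as each event arrives, so a reader of the store always sees fresh results.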
If you want to go through this chapter while playing with the example, you should first read Appendix C.
We have mocked up a simple e-commerce website with three pages: a home page, a product page, and a product statistics page. The application is built with the Express framework, and it uses Socket.io to push updates to the browser. Its purpose is to let you play with the cluster and see the results. Because it is not the focus of this book, we describe only its pages and go no further.
Figure 6-2. Home page
The Product Page shows information related to a specific ...