This chapter shows how a Markov model (in its simplest form, a Markov chain) can be used to predict the “next smart email marketing date” for customers based on their transaction history. Given a sequence of random variables (such as a customer’s successive purchase dates), the Markov model assumes that the distribution of each variable depends only on the state immediately preceding it (the previous purchase date), not on the full history. For this “smarter email marketing” problem, two distinct solutions are provided:
A MapReduce/Hadoop solution using the classic
A Spark solution using a directed acyclic graph (an arbitrary set of transformations and actions)
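To make the Markov property described above concrete, here is a minimal Java sketch, independent of both solutions, that estimates a first-order Markov chain from one observed sequence of states. The class name MarkovChainSketch, its method names, and the state labels "A", "B", and "C" are hypothetical and chosen only for illustration; they are not the classes or states used later in this chapter.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: names and state labels are assumptions, not the chapter's code.
class MarkovChainSketch {

    // counts.get(from).get(to) = number of observed transitions from -> to
    static Map<String, Map<String, Integer>> countTransitions(List<String> states) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (int i = 0; i + 1 < states.size(); i++) {
            counts.computeIfAbsent(states.get(i), k -> new TreeMap<>())
                  .merge(states.get(i + 1), 1, Integer::sum);
        }
        return counts;
    }

    // Estimated P(next = to | current = from); 0.0 if 'from' was never followed by anything
    static double probability(Map<String, Map<String, Integer>> counts, String from, String to) {
        Map<String, Integer> row = counts.get(from);
        if (row == null) {
            return 0.0;
        }
        int total = 0;
        for (int c : row.values()) {
            total += c;
        }
        return row.getOrDefault(to, 0) / (double) total;
    }

    // Most likely next state given only the current state (the Markov property);
    // ties go to the lexicographically smallest state; null if 'current' has no outgoing transitions
    static String mostLikelyNext(Map<String, Map<String, Integer>> counts, String current) {
        Map<String, Integer> row = counts.get(current);
        if (row == null) {
            return null;
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : row.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Observed transitions: A->B, B->A, A->B, B->A, A->C
        List<String> history = Arrays.asList("A", "B", "A", "B", "A", "C");
        Map<String, Map<String, Integer>> counts = countTransitions(history);
        System.out.println("P(B | A) = " + probability(counts, "A", "B"));
        System.out.println("Most likely state after A: " + mostLikelyNext(counts, "A"));
    }
}
```

The prediction uses only the current state and the estimated transition counts, never the earlier history, which is exactly the dependence assumption stated above.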
In writing this chapter, I was inspired by Pranab Ghosh’s blog post “Smarter Email Marketing with Markov Model”. For the implementation of the MapReduce phases in this chapter, I developed brand-new modular Java classes to demonstrate the core techniques (such as sorting reducer values with MapReduce’s secondary sort technique, defining a custom partitioner class, and defining a grouping comparator class). Given a customer’s transaction history, represented as a series of (purchase-date, amount-purchased) pairs, our goal here is to use MapReduce and the Markov model to predict the next effective date on which to send a marketing email to that customer. This is a kind of machine learning algorithm. Typically, machine learning–based ...