In this episode of the O’Reilly Data Show, O’Reilly’s online managing editor Jenn Webb speaks with Natalino Busa on the topic of predictive analytics, the challenges of feature engineering, and a new class of techniques that is enabling features to emerge from patterns within the data. They also discuss the relationship between predictive techniques and high-quality microservices, and how machine learning is being used to improve financial services.
Below are some highlights from their conversation:
Evolution of feature engineering
Predictive analytics has been evolving over the past 50 years, from a more traditional statistical methodology based on very well-known techniques, into a form that embeds new algorithms, new techniques, and new methods. In general, we see that predictive analytics is based on things like extracting and crafting features. In a way, features are the essence of what today we experience as predictive analytics, because predictive analytics is, in fact, nothing other than defining a number of variables that describe a given problem, a given data set.
… In the last 10 years, most of the so-called predictive analytics problem was centered around the idea of extracting and defining good features. Those features were most of the time handcrafted. … However, what we see today is that handcrafting, or coming up with a set of good features, is still a very hard problem. Not all data science teams have the background, the capacity, or the knowledge to cover the wide set of transformations that yields a good set of features. The other problem is that feature engineering, in some cases, is very hard to do simply by looking at the data.
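To make "handcrafted features" concrete, here is a minimal sketch in Python. The transaction records, the customer IDs, and the four chosen variables are all assumptions made for illustration; they are not taken from the conversation. The point is only that a person decides, by hand, which variables should describe each customer.

```python
from datetime import date
from statistics import mean

# Hypothetical raw records: (customer_id, transaction date, amount).
transactions = [
    ("c1", date(2016, 1, 4), 20.0),
    ("c1", date(2016, 1, 9), 35.0),
    ("c1", date(2016, 1, 10), 5.0),
    ("c2", date(2016, 1, 5), 100.0),
]


def handcrafted_features(rows):
    """Turn one customer's raw transactions into a few hand-picked variables."""
    amounts = [amount for _, _, amount in rows]
    # weekday() returns 5 for Saturday and 6 for Sunday.
    on_weekend = [day.weekday() >= 5 for _, day, _ in rows]
    return {
        "n_transactions": len(rows),
        "mean_amount": mean(amounts),
        "max_amount": max(amounts),
        "weekend_share": sum(on_weekend) / len(rows),
    }


# Group the raw rows per customer, then describe each customer by its features.
by_customer = {}
for row in transactions:
    by_customer.setdefault(row[0], []).append(row)

features = {cid: handcrafted_features(rows) for cid, rows in by_customer.items()}
```

Every entry in the returned dictionary encodes a guess about what matters (spending level, weekend behavior). Choosing those guesses well is the part described above as hard to do simply by looking at the data.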
Microservices that serve a purpose
Microservices, when done properly, obey the idea that they provide a solution to one problem, and one problem only. When you have this sort of separation of concerns and this sort of unity within a microservice, it serves a good purpose. The purpose is the idea of exposing a technique, an algorithm, by an interface.
… Now, what we see today…are more data set-oriented sorts of APIs, that are augmented by a new layer of APIs, which doesn't actually deal with the raw data, but deals with insights. … This is microservices applied as a way of applying predictive techniques, into an engineering platform…to expose higher level services, such as recommendations or, for instance, anomaly detection. … For instance, product-to-product comparisons or product-to-product recommendations, and so forth.
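The idea of a service that solves one problem and returns insights rather than raw data can be sketched in a few lines of Python. Everything here is hypothetical: the `AnomalyService` class, the `AnomalyInsight` record, and the choice of a simple z-score rule are assumptions for illustration, not a technique named in the conversation. The class is an in-process stand-in for what would normally sit behind a network API.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class AnomalyInsight:
    """What the caller receives: an insight, not the raw data set."""
    index: int
    value: float
    z_score: float


class AnomalyService:
    """Hypothetical single-purpose service: it answers one question only."""

    def __init__(self, threshold: float = 2.0):
        self.threshold = threshold

    def detect(self, values):
        """Return the points lying more than `threshold` std devs from the mean."""
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            return []  # all values identical, so nothing stands out
        return [
            AnomalyInsight(i, v, (v - mu) / sigma)
            for i, v in enumerate(values)
            if abs(v - mu) / sigma > self.threshold
        ]


service = AnomalyService(threshold=2.0)
insights = service.detect([10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 95.0])
```

The caller never sees how the detection works. Swapping the z-score rule for a learned model would leave the `detect` interface unchanged, which is the separation the passage above describes.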
Applications in financial services
There are a number of financial APIs and services that are based on machine learning. Some are meant to speed up the customer journey and the financial service scrutiny process. Processes which would require days are brought down to minutes. This is possible because the models and the risk calculations involved are based on big data and machine learning algorithms, rather than only on advisor or expert resources.
Others are meant to simplify and streamline our lives, for instance by providing a better overview of how and when we spend. These predictive techniques can potentially relieve us of the task of remembering when a payment is due, and provide an indication of the 'free to spend' money each month. ING, for example (a company where I worked in the past), has recently released a new feature in their mobile app for predicting recurring payments. These are just a few examples of machine learning applied to financial services. I am sure that we will see more of these data-driven tools in finance in the coming months.
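As a rough illustration of what predicting a recurring payment could involve, the sketch below flags a payee as recurring when the gaps between its payments are all roughly one month, then projects the next due date. This is a deliberately naive heuristic with made-up data; the payee names, the gap bounds, and the `predict_next_due` function are assumptions, and nothing here reflects how ING's feature actually works.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical payment history: (payee, payment date).
payments = [
    ("rent", date(2016, 1, 1)),
    ("rent", date(2016, 2, 1)),
    ("rent", date(2016, 3, 1)),
    ("cafe", date(2016, 1, 3)),
    ("cafe", date(2016, 1, 5)),
    ("cafe", date(2016, 2, 20)),
]


def predict_next_due(dates, min_gap=27, max_gap=32):
    """Return the projected next payment date, or None if not roughly monthly."""
    dates = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    # Require at least two gaps, each within the assumed monthly range.
    if len(gaps) < 2 or not all(min_gap <= gap <= max_gap for gap in gaps):
        return None
    return dates[-1] + timedelta(days=round(median(gaps)))


by_payee = {}
for payee, day in payments:
    by_payee.setdefault(payee, []).append(day)

predictions = {payee: predict_next_due(days) for payee, days in by_payee.items()}
```

A production system would presumably also look at amounts, payee name variations, and irregular schedules. The sketch only shows the basic shape of the problem: learn a rhythm from past transactions, then project it forward.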