Chapter 9. Feature Stores
As we previously mentioned, the ETL process pulls data from operational data stores (which power the applications that serve the business) and feeds that data into the analytical data plane. The analytical plane is used to build statistical models that produce insights, which the business then uses to make critical decisions. These decisions are fed back to the operational plane to improve and optimize performance and ultimately increase revenue. One of the principles of a data product in a data mesh is to provide high-quality, trustworthy data to analytical teams. Quality and trustworthiness help build confidence in analytical outcomes.
Features, or columns, are measurable pieces of data like height, width, age, weight, amount, and price that can be used for analysis. Feature engineering is the process of extracting and preparing data for analytical processing and storing it in a feature store. The feature store then serves this prepared analytical data to data scientists. Before the advent of feature stores, data scientists and engineers often worked together in a disorganized, ad hoc way when building insights. Data was frequently hard to locate and unclean. Its freshness was unknown, its source was often questionable, and its compliance with data governance was unclear. As a result, insights derived from this data were less trustworthy, less certain, and hard to reproduce. These are just a few of the issues that manifest when working within a monolithic data lake ...
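To make the idea of feature engineering concrete, here is a minimal Python sketch of the flow described above: raw records are cleaned, turned into features, and written to a store that also records where each row came from and when it was last updated. Everything in it is an assumption made for illustration. `FeatureStore`, `engineer_features`, the field names, the sample records, and the `crm_db.customers` source label are hypothetical, and the in-memory dictionary merely stands in for a real feature store product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FeatureStore:
    """Hypothetical in-memory stand-in for a feature store."""
    rows: dict = field(default_factory=dict)

    def put(self, entity_id, features, source):
        # Keep provenance and freshness next to the feature values.
        self.rows[entity_id] = {
            "features": features,
            "source": source,
            "updated_at": datetime.now(timezone.utc),
        }

    def get(self, entity_id):
        return self.rows[entity_id]["features"]


def engineer_features(record):
    """Return cleaned features, or None if the raw record is unusable."""
    try:
        height_m = float(record["height_cm"]) / 100.0
        weight_kg = float(record["weight_kg"])
    except (KeyError, TypeError, ValueError):
        return None  # missing or malformed input
    if height_m <= 0 or weight_kg <= 0:
        return None  # physically impossible values
    return {
        "height_m": height_m,
        "weight_kg": weight_kg,
        # Derived feature computed from the two cleaned inputs.
        "bmi": round(weight_kg / height_m ** 2, 1),
    }


# Illustrative raw rows, as they might arrive from an operational store.
raw_records = [
    {"id": "c1", "height_cm": "180", "weight_kg": 81},
    {"id": "c2", "height_cm": "175", "weight_kg": None},  # unclean row
]

store = FeatureStore()
for record in raw_records:
    features = engineer_features(record)
    if features is not None:
        store.put(record["id"], features, source="crm_db.customers")

print(store.get("c1"))
```

Only the clean record reaches the store, and each stored row carries its source and update time, which is one simple way to address the freshness and provenance questions raised above.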