Agile Data Mastering
Introduction
Organizations across all industries are attempting to capitalize on the promise of Big Data by using their information assets as a source of competitive advantage. In doing so, they are investing heavily in areas such as analytic tools and new storage capabilities. However, they often neglect the data management layer of the equation: it’s not simply about finding an optimal way to store or analyze the data; it’s also vital to prepare and manage the data for consumption. After all, if the data is inaccurate or incomplete, it can undermine the confidence of the people who rely on it and lead to poor decision-making. Whether organizations are processing customer records, supplier records, or information about other entities, they are dealing with datasets that contain errors and duplicates. Typically, organizations spend a lot of time on manual data cleaning and vetting to create master records, each a single, trusted view of an organizational entity such as a customer or supplier, and this is often the area where the most help is needed.
Machine learning can be immensely powerful in the creation of the master data record. This report will describe the importance of the master record to an organization, discuss the different methods for creating a master data record, and articulate the significant benefits of applying machine learning to the data mastering process, ultimately creating more complete and accurate master records in a fraction of the time ...