Architecting Data and Machine Learning Platforms
by Marco Tranquillin, Valliappa Lakshmanan, Firat Tekiner
Chapter 4. A Migration Framework
Unless you are at a startup, it is rare that you will build a data platform from scratch. Instead, you will stand up a new data platform by migrating things into it from legacy systems. In this chapter, let’s examine the process of migration: all the things that you should do when making your journey to a new data platform. We will first present a conceptual model and a framework you can follow when modernizing the data platform. Then, we will review how an organization can estimate the overall cost of the solution. We will discuss how to ensure that security and data governance are in place even while the migration is going on. Finally, we’ll discuss schema, data, and pipeline migration. You’ll also learn about options for dealing with regional capacity, networking, and data transfer constraints.
Modernize Data Workflows
Before you start creating a migration plan, you should have a comprehensive vision of why you are doing it and what you are migrating toward.
Holistic View
A data modernization transformation should be considered holistically. From a bird’s-eye perspective, we can identify three main pillars:
- Business outcomes
  Focus on the workflows that you are modernizing and identify the business outcomes those workflows drive. This is critical to identify where the gaps are and where the opportunities sit. Before making any technology decisions, limit the migration to use cases that align with the business objectives identified ...