Chapter 4. Getting Started with External Data Today
As the COVID-19 pandemic grew worse, the rapid changes in the global economy and seismic shifts in consumer behaviors rendered thousands of datasets and analytical models useless overnight. Organizations worked tirelessly to find and source more relevant public and third-party datasets.
For example, a retail organization leveraged external data sources to enhance its workforce-availability analysis and contingency planning. It analyzed epidemiological model predictions and location-specific information—such as whether employees would likely commute to work via city buses, passenger trains, or subways for each zip code where it operates—in conjunction with its internal workforce data on employee segments.
Regional managers were now empowered to more accurately anticipate when and where they’d need to adjust their workforce plans (for example, hire or move associates) and institute contingency measures, such as shortening store hours.
Some people may read this and think, “Sure, that’s fine during COVID, but COVID is an extreme anomaly. Do we still need so much access to external data once the pandemic ends?” The reality is that when you are trying to solve almost any analytical problem, there is significant value in adding more layers of external data sources.
For example, if you’re trying to predict sales volume for a store, crucial factors such as the store’s road being closed for two to three months, or a competing shop opening next door, might be missed by analyzing only the internal data.
Other organizations might put too much emphasis on seasonality and historical performance, which can lead to botched targets or inventory predictions that run too lean or too bloated. Poor forecasts can have dire consequences. Failing to properly identify competition, new retail channels, or a decrease in new stores for distribution can not only hurt your sales but also leave you with more inventory and less income than you expected, or with the embarrassment of out-of-stock items.
In this chapter, you’ll see how to get the ball rolling with the right external data team and corresponding mindset. Next, you’ll delve into Explorium, an automated external data management platform, and learn how its all-in-one design gives data scientists, analysts, and business leaders access to the relevant external data signals needed to drive decision making. Finally, you’ll see how real customers such as Melio and GlassesUSA.com are using Explorium’s data enrichments to help them scale up rapidly and prioritize the right potential customers.
Kick Off Your External Data Strategy by Choosing the Right Team
If you want your organization to leverage external data effectively at scale, you’ll need the right team, the right mindset and strategy, and the best tools. Technology can solve problems and streamline pipelines, but without the right team working together and executing a careful strategy, most projects are likely to fail or never get off the ground in the first place. Everything in business starts with having the right people, so putting together a committed data-sourcing team is paramount.
A recent McKinsey article suggests setting the foundation with a dedicated data scout. A data scout or data strategist must identify operational, cost, and growth improvements that could be powered by external data by partnering with the data-analytics team and business functions. A good data scout should also build enthusiasm for the opportunities that external data offers, pinpoint the best use cases, find the best data sources, and demonstrate the value generation of said data. The data scout should be the quarterback or field general of your external data team—so please choose wisely.
McKinsey recommends other roles that should be drawn from across functions, including:
- Purchasing experts
- Data reviewers
- Architects and DevOps engineers
- Data engineers
- Data scientists and analysts
Purchasing experts steer the ship through murky legal waters and may also be able to connect technology vendors and external data providers, while data reviewers maintain proper compliance with data privacy and other rules and regulations.
Data architects and DevOps engineers, meanwhile, cultivate the relevant infrastructure to support and streamline the use of external data, make sure it’s integrated with internal data sources, and manage access to data.
Data engineers must cooperate with both the data science teams and the line-of-business stakeholders by assisting in the evaluation of the external data and preparing it for the data science and analytics teams. Finally, data scientists and analysts apply the external data to their analysis and use cases and quantify the resulting benefits and improvements in model performance.
Start with the Low-Hanging Fruit, but Beware of Shiny-New-Toy Syndrome
Now that you have the right team in place, it’s time to get started. Teams are often quite enthusiastic, but their zeal can lead them into three common pitfalls:
- Trying to leverage dozens of external data sources without a plan or identified use cases
- Starting with the most difficult use cases
- Diving straight into the implementation of the latest technology without fully considering its suitability for your business, people, or society (what is known as “shiny-new-toy syndrome”)
It’s easy to be extremely ambitious and enthusiastic when starting out with external data, and often teams fall into the trap of attempting to leverage a plethora of external data sources at one time without a definitive plan or earmarked use cases.
This common misstep is understandable but may cost you valuable time, money, and resources. Other teams may try to bite off more than they can chew and tackle the most difficult use cases first. Simply reviewing and prioritizing the lowest-effort use cases with the highest positive impact from external data is a more prudent and effective approach.
Racking up a few early wins will instill confidence and earn your team credibility. Your empowered team can then develop a successful pattern that will help them take on more challenging use cases with a higher impact. These more challenging cases will naturally require more effort, but your team will have already been conditioned after picking the low-hanging fruit several times.
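The prioritization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UseCase` type, the 1-to-5 scoring scales, and the example backlog are all hypothetical and do not come from McKinsey or Explorium. The sketch ranks use cases by their impact-to-effort ratio so that the low-hanging fruit comes first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A candidate external-data use case (hypothetical structure)."""
    name: str
    impact: int  # expected positive impact, 1 (low) to 5 (high)
    effort: int  # expected effort, 1 (low) to 5 (high)


def prioritize(use_cases):
    """Order use cases by impact-to-effort ratio, highest first.

    Ties are broken by lower effort, then by name, so the order is stable.
    """
    return sorted(
        use_cases,
        key=lambda uc: (-(uc.impact / uc.effort), uc.effort, uc.name),
    )


# Illustrative backlog; the scores are made up for the example.
backlog = [
    UseCase("Lead scoring with firmographic data", impact=4, effort=2),
    UseCase("Store-level demand forecast", impact=5, effort=5),
    UseCase("Ad audience bucketing", impact=3, effort=1),
]

ranked = prioritize(backlog)
for uc in ranked:
    print(f"{uc.name}: impact/effort = {uc.impact / uc.effort:.2f}")
```

A ratio is only one possible ranking rule. A team might instead weight impact more heavily or set a hard cap on effort for its first few projects.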
Another common trap is shiny-new-toy syndrome. Your team should first identify a clear business problem—augmenting decision making, meeting customer needs better, predicting supply and demand, etc.—and then use the technology to solve the problem.
Start with your North Star and then work backward. It can be fun to buy a new tool to fix up your house, but you don’t want to start drilling holes into the walls haphazardly. Get a good builder, identify the problem, get the best tools, and get to work.
Now that you know who needs to be part of the team and have defined the business value and the right use cases, it’s time to move on to choosing the right tool: the best external data platform.
Discover Relevant Datasets and Add Significant Value
Melio’s mission is straightforward: keep small businesses in business by providing a smart, simple B2B payments solution tailor-made for their needs, such as conveniently paying bills online via bank transfer, debit card, or credit card. The company has been growing rapidly since its launch in May 2019. Monthly active users (MAU) grew by over 2,000% in 2020, and the platform soon faced the challenge of processing billions of dollars in payment volume. The Melio team, however, lacked the capacity to analyze the surge in inbound leads and to identify and authenticate the small and medium-sized businesses (SMBs) that were relevant business prospects.
Melio’s marketing team knew they needed a better system to prioritize inbound leads and to validate their eligibility and fit for Melio’s products and services. They needed a fast, automated process to identify and prioritize the right business segments and take their mission to scale.
Melio tapped Explorium to support its ambition for hypergrowth by implementing Explorium’s external data platform to discover relevant datasets that added significant value. In the end, the partnership resulted in models that broadened lead scoring criteria by using a combination of internal data and external enrichment tools that made a significant impact.
The Melio marketing team is now able to analyze their marketing funnel and identify relevant leads to better focus marketing and sales resources on the most relevant, high-value segments. Melio has been able to achieve the following tangible outcomes:
- Data-driven decisions are based on model outputs, leading to a 15% increase in conversion rates.
- Considerable savings are delivered in the time previously allotted to funnel analysis, which enabled its operations team to achieve a three-fold improvement in efficiency.
- More core tasks are addressed with the same resources.
Explorium’s all-in-one platform helped Melio by automatically matching external data with internal enterprise data to uncover thousands of signals to improve ML models and business outcomes.
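To make the idea of matching external data with internal data concrete, here is a minimal sketch of a key-based enrichment join. It does not represent Explorium’s API or matching logic, which the text does not describe. The field names (`domain`, `employee_count`, `industry`), the `enrich` and `normalize_domain` helpers, and the sample records are all assumptions made for illustration.

```python
def normalize_domain(value):
    """Lowercase a domain and drop a leading 'www.' so keys line up."""
    value = value.strip().lower()
    if value.startswith("www."):
        value = value[4:]
    return value


def enrich(leads, external_records):
    """Left-join internal leads with external records on company domain.

    Leads without an external match are kept, with the added fields
    set to None, so no internal data is lost.
    """
    by_domain = {normalize_domain(r["domain"]): r for r in external_records}
    enriched = []
    for lead in leads:
        match = by_domain.get(normalize_domain(lead["domain"]), {})
        enriched.append({
            **lead,
            "employee_count": match.get("employee_count"),
            "industry": match.get("industry"),
        })
    return enriched


# Hypothetical internal leads and external firmographic records.
leads = [
    {"lead_id": 1, "domain": "WWW.Acme-Bakery.example"},
    {"lead_id": 2, "domain": "unknown-shop.example"},
]
external = [
    {"domain": "acme-bakery.example", "employee_count": 12,
     "industry": "Food service"},
]

enriched = enrich(leads, external)
for row in enriched:
    print(row)
```

Real-world matching is usually harder than an exact key lookup, since company names and addresses rarely agree across sources. The added columns are the kind of signals a lead-scoring model could then consume as features.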
Other organizations, meanwhile, need help optimizing their digital campaigns. When GlassesUSA.com, the fastest-growing, leading online eyewear retailer, was looking to enhance its Facebook retargeting campaign by pinpointing users with a high likelihood to buy eyewear, it implemented an in-house predictive model that it built using the Explorium External Data platform.
GlassesUSA.com built and trained an effective predict-to-buy model with Explorium’s automatic machine learning capabilities that assigned a score to website visitors—predicting their likelihood to purchase. Users were bracketed into groups according to their likelihood to buy, and then custom audiences were created on Facebook for each bucket.
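The bucketing step can be illustrated with a short sketch. It assumes that each visitor already has a purchase-likelihood score between 0 and 1 from some model. The thresholds, bucket labels, and visitor IDs below are hypothetical and are not GlassesUSA.com’s actual configuration.

```python
from collections import defaultdict

# Hypothetical cut-offs, checked from highest to lowest.
BUCKETS = [(0.7, "high"), (0.4, "medium"), (0.0, "low")]


def bucket_for(score):
    """Return the label of the first bucket whose threshold the score meets."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    for threshold, label in BUCKETS:
        if score >= threshold:
            return label


def build_audiences(scores):
    """Group visitor IDs by likelihood bucket, one audience per bucket."""
    audiences = defaultdict(list)
    for visitor_id, score in scores.items():
        audiences[bucket_for(score)].append(visitor_id)
    return dict(audiences)


# Illustrative model output: visitor ID -> predicted likelihood to buy.
scores = {"v1": 0.91, "v2": 0.15, "v3": 0.55, "v4": 0.70}

audiences = build_audiences(scores)
for label, visitors in audiences.items():
    print(label, visitors)
```

Each resulting list could then be uploaded as a separate custom audience, with bids or creative adjusted per bucket.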
After implementing the predict-to-buy model, GlassesUSA.com saw significant, tangible improvements in the performance of its Facebook dynamic ads campaigns, including:
- A 10% improvement in ROAS (return on ad spend)
- A 10% increase in conversion rates reflected in an increase in the “add-to-cart” ratio
- Enhanced operation and marketing efficiency
Conclusion
From COVID anomalies to footfall traffic to standard marketing, promotion, and advertising campaigns, organizations that can create better infrastructure to collect, store, analyze, and leverage external data—and successfully integrate it into their operations with their internal data—can outperform other companies by unlocking improvements in growth, productivity, and risk management.
Whether you’re looking to mitigate risk, predict customer lifetime value, forecast demand, optimize ad spend, or create custom experiences, Explorium can help get the wheels turning by:
- Helping to define your goals
- Enriching your data with augmented data discovery
- Getting your optimal feature set
- Selecting and deploying your model
- Generating valuable insights
Computer scientist Peter Norvig has been quoted many times as saying, “More data beats clever algorithms, but better data beats more data.”