What Facebook learned when it opened its data to every employee
Executive reading: Why you need to democratize data.
Data executives might spend most of their time on technical work and vendor management, but their work ultimately comes down to the task of building an effective data culture. Reorienting a company around data-driven decision-making takes more than just software tools; it also involves training your employees to understand data essentials, establishing processes that safeguard data and clarify its ownership, working with line-of-business managers to set expectations and goals, and generally striking the right balance between risk-taking and caution.
Since the first Strata conference six years ago, O’Reilly has identified data as an important driver of value in every industry. In this series, we’ll revisit advice to data executives from a handful of authorities on building data culture inside large organizations.
In 2015, Hilary Mason and DJ Patil wrote Data Driven: Creating a Data Culture to introduce executives to the value of data and to the essential steps that managers need to take in order to exploit that value. In this excerpt from their report, Mason and Patil describe the importance of democratizing data, which means making sure that employees who might need data have access to it and have the resources to interpret it:
The democratization of data is one of the most powerful ideas to come out of data science. Everyone in an organization should have access to as much data as legally possible.
While broad access to data has become more common in the sciences (for example, it is possible to access raw data from the National Weather Service or the National Institutes of Health), Facebook was one of the first companies to give its employees access to data at scale. Early on, Facebook realized that giving everyone access to data was a good thing. Employees didn’t have to put in a request, wait for prioritization, and receive data that might be out of date. This idea was radical because the prevailing belief was that employees wouldn’t know how to access the data, incorrect data would be used to make poor business decisions, and technical costs would become prohibitive. While there were certainly challenges, Facebook found that the benefits far outweighed the costs; it became a more agile company that could develop new products and respond to market changes quickly. Access to data became a critical part of Facebook’s success, and remains something it invests in aggressively.
All of the major web companies soon followed suit. Being able to access data through SQL became a mandatory skill for those in business functions at organizations like Google and LinkedIn. And the wave hasn’t stopped with consumer internet companies. Nonprofits are seeing real benefits from encouraging access to their data, so much so that many are opening their data to the public. They have realized that experts outside of the organization can make important discoveries that might otherwise have been missed. For example, the World Bank now makes its data open so groups of volunteers can come together to clean and interpret it. The World Bank has gotten so much value from this that it has gone one step further and created a special site dedicated to public data.
Governments have also begun to recognize the value of democratizing access to data, at both the local and national levels. The U.K. government has been a leader in open data efforts, and the U.S. government created the Open Government Initiative to take advantage of this movement. As the public and the government began to see the value of making data more open, governments began to catalog their data, provide training on how to use the data, and publish data in ways that are compatible with modern technologies. In New York City, access to data led to new Moneyball-like approaches that were more efficient, including finding “a five-fold return on the time of building inspectors looking for illegal apartments” and “an increase in the rate of detection for dangerous buildings that are highly likely to result in firefighter injury or death.” International governments have also followed suit to capitalize on the benefits of opening their data.
One challenge of democratization is helping people find the right data sets and ensuring that the data is clean. As we’ve said many times, 80% of a data scientist’s work is preparing the data, and users without a background in data analysis won’t be prepared to do the cleanup themselves. To help employees make the best use of data, a new role has emerged: the data steward. The steward’s mandate is to ensure consistency and quality of the data by investing in tooling and processes that make the cost of working with data scale logarithmically while the data itself scales exponentially.
To learn more from the people who have built data cultures at leading companies, join us at the Strata Business Summit in London on May 24-25, 2017. The Strata Business Summit’s multi-day lineup includes a series of comprehensive executive briefings, case studies from a wide variety of industries, and deep dives into the managerial and technical subjects that matter most to data executives.