A Guide to Improving Data Integrity and Adoption

Book description

For most companies, quality data is key to measuring success and planning for business goals. But achieving data accuracy and integrity can be a daunting task given the messy nature of data in the wild. How can you trust that source data is accurate? What data should be excluded as invalid? What steps can you take to ensure that all the data is transformed correctly? How do you know if your conclusions are accurate?

This report presents a case study from a large and critical data project at Spiceworks, the vibrant network, online community, and marketplace for IT professionals. Author Jessica Roper, a senior developer in Spiceworks’ data analytics division, demonstrates ways to think about data verification, processing, analysis, and automation. You’ll also get a guide to tools for determining whether the data you collect and use is reliable and accurate.

  • Understand what’s involved in vetting data for trustworthiness
  • Learn strategies and test cases for verifying raw data sources and working with transformations
  • Become familiar with the data at each layer and create tests between transformations to ensure consistency
  • Understand which edge cases to look for, and what trends and outliers to expect
  • Depend on data monitors to identify anomalies and system issues
  • Automate process and acceptance tests to monitor and ensure reliability
  • Work with other teams and groups to improve and validate data accuracy
  • Increase adoption by using data to measure success

Product information

  • Title: A Guide to Improving Data Integrity and Adoption
  • Author(s): Jessica Roper
  • Release date: December 2016
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491970515