Exclusive Preview: Chapter 2 of the Anomalo + O’Reilly Book on Automating Data Quality Monitoring at Scale

April 20, 2023

We’re thrilled to be working with O’Reilly on a book to help organizations discover new solutions for detecting and resolving data quality issues. 

After releasing Chapter 1 last month, we’re making Chapter 2 available for free today. Chapter 2, “Data Quality Monitoring Strategies and the Role of Automation,” takes a comprehensive look at the historical approaches to data quality and at how the rise of machine learning is leading to exciting new ways to monitor data at scale.

Click here to download Chapters 1 and 2 for free (http://anomalo.com/oreilly/automating-data-quality-monitoring-at-scale)

The full book will be published by O’Reilly Media later this year. We’ll be giving away more chapters in the coming months, so follow our blog and LinkedIn for updates!

About the book

Automating Data Quality Monitoring at Scale is based on everything we’ve learned from building Anomalo. Chapter 1, “The Data Quality Imperative,” sets the stage by explaining why a business should care about data quality today. In Chapter 2, we discuss why existing strategies don’t scale to large amounts of data, and propose a new approach. You’ll learn:

  • What data quality monitoring should accomplish, and why it must go beyond basic observability
  • The pros and cons of the three main historical approaches: 
    - Manual checks
    - Rule-based testing
    - Metrics monitoring
  • How machine learning can automate data quality monitoring while reducing false positives and alert fatigue
  • The drawbacks of relying exclusively on machine learning
  • How combining human expertise and automation yields a best-of-all-worlds approach

…and much more. 

In future chapters, we’ll also share: 

  • How to apply unsupervised machine learning models for detecting data issues
  • How to implement notifications while avoiding alert fatigue
  • How to integrate data quality monitoring with data catalogs, orchestration layers, and other systems
  • How to deploy, manage, and maintain your monitoring solution

Note that the contents of this preview will almost certainly change as we continue to craft the book and get feedback from early readers. If you have ideas to share or notice missing content, please let our editorial team know by reaching out to gobrien@oreilly.com.

Written By
The Anomalo Team