A global digital media platform serving hundreds of millions of users transformed their approach to data quality with Anomalo. By deploying comprehensive, automated monitoring across their Snowflake data infrastructure, the company uncovered a critical four-month data gap that had gone completely undetected by their existing tools, and began receiving proactive, analyst-grade insights their team had never thought to ask for. The result: engineering time shifted from reactive firefighting to strategic work, data trust improved across ten internal teams, and autonomous agents began surfacing business-valuable findings before anyone requested them.

Company Background
The company operates one of the world’s largest fan and content platforms, hosting communities and reference content for entertainment franchises across movies, TV, games, and more. With massive volumes of user-generated content and event data flowing through their systems daily, data reliability is critical across analytics, product, and business operations.
Their data infrastructure spans multiple data warehouses and query engines, including Snowflake, Athena, and Presto, with event streams from third-party platforms. Approximately ten data teams across the organization depend on the same underlying data being accurate and available.
The Challenge
- Invisible data gaps: The most significant risk wasn’t noisy alerts or misconfigured pipelines; it was silence. Data pipelines appeared to be functioning normally, with no errors surfaced by their orchestration tooling, yet actual data quality issues went completely undetected. The team had no reliable way to confirm that data was not just moving but also arriving and looking right.
- Manual overhead at scale: Managing data quality across a growing infrastructure required significant manual effort. Moving configurations between environments was tedious, time-consuming work, the kind that consumed full days without adding strategic value.
- Alert fatigue and signal loss: Without intelligent prioritization, teams struggled to separate meaningful issues from noise, making it harder to respond quickly when something genuinely required attention.
- Reactive posture by default: Without proactive monitoring in place, the team operated in reactive mode, discovering data quality issues only after they had already propagated into downstream analytics, dashboards, or business decisions.
The Critical Discovery
Shortly after extending Anomalo monitoring to a set of newly acquired tables following a company acquisition, the team made a discovery that validated their investment in data quality.
The acquired team’s pipelines had been fully instrumented and appeared healthy. Jobs ran on schedule. Orchestration tools showed no failures. No downstream alert had fired. No stakeholder had complained. Everything looked normal.
When Anomalo began watching the data itself, not just the pipeline execution, it found that for four months, not a single row of data had been arriving. The pipeline was executing. The data simply wasn’t there.
As the Director of Data & ML Engineering described it, the pipelines were working and showing all green lights in Airflow. It wasn’t until the team started looking at the data itself that the problem surfaced, and by then four months had passed with no data coming in at all.
This discovery made something concrete that had previously been theoretical: orchestration tools confirm that a job ran, but only content-level monitoring confirms that the data actually arrived and looks right. Green lights in Airflow say nothing about what came out the other end.
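To make the distinction concrete, here is a minimal Python sketch of a content-level volume check. It is not Anomalo's implementation; the function name `missing_days`, the `job_status` and `daily_row_counts` dictionaries, and all dates and counts are hypothetical, chosen only to show how every job can report success while no rows arrive.

```python
from datetime import date, timedelta


def missing_days(daily_row_counts, start, end):
    """Return every date in [start, end] on which zero rows arrived.

    daily_row_counts maps a date to the number of rows loaded that day;
    a date absent from the mapping is treated as zero rows.
    """
    gaps = []
    day = start
    while day <= end:
        if daily_row_counts.get(day, 0) == 0:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Hypothetical orchestration view: the job reports success every day.
job_status = {date(2024, 1, d): "success" for d in range(1, 6)}

# Hypothetical content view: rows stopped arriving after January 2.
daily_row_counts = {date(2024, 1, 1): 1200, date(2024, 1, 2): 1180}

gaps = missing_days(daily_row_counts, date(2024, 1, 1), date(2024, 1, 5))
all_green = all(status == "success" for status in job_status.values())
```

In this example `all_green` is true while `gaps` lists three consecutive empty days, which is the same shape of problem as the four-month gap: the execution signal and the content signal disagree, and only the second one reflects what landed in the table.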
The Solution
The company deployed Anomalo across their Snowflake data infrastructure, enabling automated, ML-driven monitoring across hundreds of tables without writing a single rule by hand.
- Automated monitoring at scale: Anomalo’s profiling and prediction engine learned what normal looked like for each monitored table, covering freshness, volume, schema consistency, and content-level data behavior, without any manual rule configuration.
- Infrastructure-as-code configuration management: What had previously required a full day of manual environment-by-environment work was reduced to minutes using Anomalo’s configuration tooling. Teams could replicate and manage monitoring setups efficiently as the data estate grew.
- Custom SQL checks for domain-specific logic: For tables where business rules required specialized validation, including revenue data, teams encoded that domain knowledge directly into Anomalo’s custom check framework, maintaining ownership of the logic closest to the data.
- Proactive alerting and escalation: Integration with PagerDuty and Slack enabled the team to route alerts by severity and ensure the right people were notified without overwhelming them with noise.
- The Data Insights Agent: After deploying Anomalo’s autonomous Data Insights Agent, the team began receiving findings that hadn’t been requested. Two stood out immediately. The first: two insights from unrelated tables that, when surfaced together, pointed to a monetization opportunity significant enough to route directly to the pricing team. The second: a slow-moving trend in which unidentified page views crept upward over several weeks and nearly doubled within a month, a change gradual enough that day-over-day anomaly detection would have missed it entirely. Both findings arrived pre-investigated, with context and timeline attached, so the receiving team could act rather than dig.
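The slow-moving page-view trend described above illustrates why day-over-day checks can miss gradual drift. The following Python sketch is an illustration only, not the Data Insights Agent's method: the series, the 2.3% daily growth rate, the 10% day-over-day threshold, and the function names are all assumptions made for the example.

```python
def day_over_day_alerts(values, max_jump=0.10):
    """Return the indexes where the relative change from the prior day exceeds max_jump."""
    return [
        i
        for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) / values[i - 1] > max_jump
    ]


def window_drift_ratio(values, window=7):
    """Compare the mean of the last `window` values with the mean of the first `window`."""
    baseline = sum(values[:window]) / window
    recent = sum(values[-window:]) / window
    return recent / baseline


# Hypothetical series: unidentified page views growing about 2.3% per day for 30 days.
page_views = [1000 * 1.023 ** day for day in range(30)]

daily_alerts = day_over_day_alerts(page_views)   # no single day jumps by more than 10%
drift = window_drift_ratio(page_views)           # last week versus first week
total_growth = page_views[-1] / page_views[0]    # growth across the whole month
```

No individual day crosses the day-over-day threshold, yet the series nearly doubles across the month and the last week sits well above the first. Comparing longer windows is one simple way to expose that kind of accumulation.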
Key Outcomes
- Exposed a critical silent data gap: A four-month data gap, invisible to every existing monitoring tool, was caught immediately once Anomalo began watching the content of the tables, not just the pipeline status.
- Eliminated manual configuration overhead: Configuration migrations that previously required a full day of manual work were reduced to minutes, freeing engineering time for higher-value work.
- Expanded data quality ownership across the organization: Monitoring was distributed across ten data teams, with each team taking responsibility for the tables within their domain. Enabling business-unit ownership allowed domain expertise to be applied closer to the data.
- Surfaced unprompted business value: The Data Insights Agent identified business-significant findings, including a monetization signal and a month-long accumulation trend, that no one asked it to find. Findings arrived in a form ready for action, not further investigation.
- Improved data trust and confidence: With comprehensive monitoring in place, teams across the organization gained confidence in the data they were working with, spending less time questioning whether data was available and more time using it.
What Made the Difference
The clearest lesson from this engagement was the distinction between monitoring pipeline execution and monitoring data itself. Orchestration tools are designed to confirm that a job ran, and they do that well. But those tools have no visibility into whether the data that job was supposed to move actually arrived, and whether it looks right when it gets there.
Anomalo’s content-level monitoring fills that gap. By profiling actual data values across billions of rows and learning what normal looks like for each table, the system catches the issues that traditional tooling is structurally blind to, not because those tools are misconfigured, but because they were never built to see that layer.
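As a rough illustration of what "learning what normal looks like" can mean at its simplest, the Python sketch below builds a baseline from historical daily row counts and flags observations that fall far outside it. This is a deliberately simplified stand-in, not a description of Anomalo's models; the function name `is_anomalous`, the three-standard-deviation threshold, and the sample history are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two points
    if sigma == 0:
        # A perfectly constant history: any different value is unexpected.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical daily row counts for one table over a week.
history = [1000, 1020, 980, 1010, 990, 1005, 995]

typical_day = is_anomalous(history, 1008)  # close to the usual volume
empty_day = is_anomalous(history, 0)       # a day on which nothing arrived
```

Here a day with 1,008 rows is treated as normal, while a day with zero rows is flagged even though no rule about row counts was written by hand. A production system would need to account for seasonality, trends, and many more signals than volume alone.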
The second lesson was about the shift from detection to delivery. Finding something is step one. Delivering it pre-investigated, with context, timeline, and causality already attached, is what makes action the next step instead of more analysis. When a finding arrives ready to route, the team receiving it can focus on the decision, not the diagnosis.