Outcome
$100k in operational cost savings
50% reduction in time spent writing pipeline tests
90% increase in data quality monitoring coverage
Learn how the SimpleRose team uses Bigeye to deliver reliable data for large-scale data operations.
Using data to solve complex business optimization problems
SimpleRose develops powerful optimization software that can assist a wide range of organizations, from manufacturers to airlines, in solving high-value logistics, scheduling, and production problems that result in hundreds of millions of dollars in improvements to the bottom line.
SimpleRose is a small firm that ingests and manipulates large volumes of data—reaching over one hundred terabytes—to both power their product and inform strategic business decision making. The data operation at SimpleRose includes bringing raw data into S3 from multiple internal and external sources and running ETL jobs to normalize and load the data into Redshift, with Talend providing additional orchestration and transformation.
Challenge
SimpleRose has a small, agile data team that maintains high-volume, business-critical data pipelines. Given the company’s reliance on data, the leadership team needed 100% assurance that the data was correct and complete. For SimpleRose Principal Data Architect Nick Heidke, this meant manually writing and maintaining an ever-expanding list of pipeline tests, an approach that would quickly become unscalable.
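To make the maintenance burden concrete, here is a minimal sketch of the kind of hand-written pipeline test described above. It is an illustration only, not SimpleRose's actual code: the function names, the `CheckResult` type, and the sample `order_id` and `amount` columns are all hypothetical, and rows are assumed to arrive as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass
class CheckResult:
    """Outcome of one hand-written pipeline test (hypothetical type)."""
    name: str
    passed: bool
    detail: str


def check_min_row_count(rows: Sequence[Mapping], minimum: int) -> CheckResult:
    """Completeness test: fail if a load produced fewer rows than expected."""
    count = len(rows)
    return CheckResult(
        name="min_row_count",
        passed=count >= minimum,
        detail=f"{count} rows, expected at least {minimum}",
    )


def check_null_rate(rows: Sequence[Mapping], column: str, max_rate: float) -> CheckResult:
    """Correctness test: fail if too many values in `column` are missing."""
    nulls = sum(1 for row in rows if row.get(column) is None)
    # An empty load has no nulls; the row-count test is what catches it.
    rate = nulls / len(rows) if rows else 0.0
    return CheckResult(
        name=f"null_rate[{column}]",
        passed=rate <= max_rate,
        detail=f"{rate:.1%} null, allowed {max_rate:.1%}",
    )


if __name__ == "__main__":
    # Hypothetical sample rows standing in for a freshly loaded table.
    sample = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 2, "amount": None},
    ]
    for result in (
        check_min_row_count(sample, minimum=1),
        check_null_rate(sample, "amount", max_rate=0.1),
    ):
        status = "PASS" if result.passed else "FAIL"
        print(f"{status} {result.name}: {result.detail}")
```

Every table and column needs its own thresholds chosen and kept up to date by hand, which is why this approach grows linearly with the warehouse and becomes hard for a small team to sustain.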
In addition, the team invested time and resources into creating a curated, single-source-of-truth data mart for business users and analysts. The data team worried that one ETL failure or bug could create inaccurate data, leading users to lose trust in the data mart and bypass it completely by pulling data directly from source databases. This would result in inconsistent data proliferation, governance issues, and a host of other problems. The SimpleRose team needed to find a fast, agile solution to these challenges so they could move on to other, higher-level business needs.
Solution
Nick knew a traditional data quality tool wouldn’t work for SimpleRose, given the large amount of rule creation required and the cost of purchasing and maintaining such a tool. While researching other options, he came across the concept of data observability. Nick had always assumed he would have to think through and define his own tests and was immediately intrigued by the idea of using AI/ML to profile data and automatically apply data quality and pipeline monitoring.
The SimpleRose team evaluated several data observability solutions and found that Bigeye had the right mix of capabilities, including Redshift support, out-of-the-box automatic data pipeline and quality checks, and the ability to easily create custom business-logic checks and granular monitoring. The team also appreciated how easy it was to get Bigeye up and running and the price point.
In just a few days, the SimpleRose team had deployed broad operational coverage of their data warehouse and applied numerous out-of-the-box data quality checks and alerts to their critical business tables. Nick and team were also able to start creating their own custom checks to monitor for specific business logic. If anything slipped through the cracks and caused a data issue, not only would the team find out about it before business users and fix it, they could also create new checks in Bigeye to ensure the issue didn’t occur again and harden the pipeline even further.
Results
For SimpleRose, Bigeye reduced the time spent on data pipeline monitoring by 50% while increasing pipeline and data quality monitoring coverage by over 90%, which works out to over $100,000 in operational savings per year. Even more important to Nick, the organization’s trust in and utilization of data has grown dramatically, allowing the business to make better, faster decisions and deliver a superior customer experience.