Thought leadership
-
January 23, 2023

Business, backed by data observability

Business runs better when it's backed by data observability. But in the interest of "show, don't tell," let's point out some fairly recent examples.

Kyle Kirwan
In 2014, Etihad Airways mistakenly sold thousands of plane tickets from New York to Dubai for only $300. These “mistake fares,” where an airline accidentally offers lower-priced tickets than intended, were caused by data errors ingested into Etihad’s pricing algorithm. This oversight caused a huge dilemma: should they honor the fares but take a significant financial hit, or disregard them and risk consumer outrage?

In 2020, the fintech startup Brex relied on Plaid to connect to its customers’ bank accounts and determine their creditworthiness. These connections were brittle and would often disconnect, leaving Brex with stale data. Brex’s algorithm reacted to stale or missing data by immediately dropping credit limits, which understandably led to unhappy customers. Brex’s data team eventually modified the underwriting algorithm to tolerate some stale data. They also built more context into the algorithm: for instance, if a company had $100 million in its bank account a month ago, it probably had not gone bankrupt since.
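To make the idea of "allowing for some stale data" concrete, here is a minimal Python sketch. It is not Brex's actual underwriting logic; the `BalanceSnapshot` type, the `credit_limit` function, and every threshold and ratio below are hypothetical values chosen for illustration. The point is only that a limit can be held steady during a grace period instead of dropping the moment a bank connection goes quiet.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BalanceSnapshot:
    """Hypothetical record of the last balance read from a bank connection."""
    balance_usd: float
    observed_at: datetime


def credit_limit(
    snapshot: Optional[BalanceSnapshot],
    previous_limit: float,
    now: datetime,
    fresh_window: timedelta = timedelta(days=1),   # assumed: data this recent counts as fresh
    grace_period: timedelta = timedelta(days=30),  # assumed: how long stale data is tolerated
    limit_ratio: float = 0.1,                      # assumed: limit as a fraction of balance
) -> float:
    """Return a credit limit that tolerates a temporarily broken connection."""
    if snapshot is None:
        # No balance has ever been observed, so there is nothing to underwrite against.
        return 0.0

    age = now - snapshot.observed_at
    if age <= fresh_window:
        # Fresh data: recompute the limit from the observed balance.
        return snapshot.balance_usd * limit_ratio
    if age <= grace_period:
        # Stale but recent: keep the previous limit rather than dropping it.
        return previous_limit
    # Too stale to trust: fall back to the conservative choice.
    return 0.0
```

Under these assumptions, a customer whose connection dropped ten days ago keeps their existing limit, while one whose data is three months old does not.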

Here's another example, from a company we've all heard of. In 2021, Zillow lost $550 million on its home-flipping program, Zillow Offers. Zillow Offers was powered by big-data analysis that told Zillow what to auto-offer for a house and how much to charge on the flip. Simple, right? Until it wasn’t. That year, Zillow realized that it had overpaid for thousands of houses. The whole program was underwater, and sales produced an average loss of $80,000 per house. The data sources for Zillow’s price forecasting were nowhere near as real-time and actionable as Zillow required.

Each of the scenarios above shows how bad data can lead to poor business decision-making, either explicitly through human judgment or implicitly through automated computer systems. When executives, employees, and microservices rely on data to make decisions, the cost of bad data is higher than ever.

The data observability differentiator

While bad data can seem like a white whale, there are concrete steps that improve your data quality and reduce the occurrence of data issues. After implementing simple testing, SQL checks, and other preliminary safeguards, it may be time to graduate to data observability.

What is data observability?

Data observability helps you monitor and understand the state of your data systems at all times. We can liken data observability to the dashboard on your car. It gives you a constant stream of information about how your system is functioning and whether any problems are being picked up.

Data observability platforms like Bigeye will provide some subset of:

  • Monitoring - Tracking data's volume, freshness, and quality
  • Anomaly detection - Detecting data points, events, and/or information that fall outside of a dataset’s normal behavior
  • Service Level Agreements (SLAs) - An agreement between a service provider and the customer that describes what will be delivered, who the point of contact is for end-user problems, and the metrics that will determine and measure the effectiveness of the project
  • Data lineage
  • Data governance
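As a rough illustration of the monitoring and anomaly detection items above, the following Python sketch checks freshness against a maximum age and flags an unusual daily row count with a simple z-score. This is a simplified stand-in, not how Bigeye or any particular platform implements detection; the function names, the threshold of 3.0, and the sample numbers are all assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def is_fresh(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness check: has the table been updated recently enough?"""
    return now - last_loaded_at <= max_age


def is_anomalous(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Volume check: flag `latest` if it is more than `threshold` sample
    standard deviations away from the mean of `history` (a basic z-score test)."""
    mu = mean(history)
    sigma = stdev(history)  # requires at least two historical values
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical daily row counts for one table.
daily_rows = [1000, 1020, 980, 1010, 990]
print(is_anomalous(daily_rows, 400))   # a sudden drop in volume
print(is_anomalous(daily_rows, 1005))  # an ordinary day
```

A fixed z-score threshold is easy to reason about but ignores seasonality and trends, which is one reason teams eventually look beyond hand-rolled checks.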

With these tools, organizations can answer questions such as:

  • Is customer data arriving on time?
  • Are there any duplicated transactions?
  • Is the decrease in average purchase size real or a data issue?
  • Will deleting a table from the data warehouse have any impact?
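The duplicated-transactions question, for example, can be answered with a basic SQL check. The sketch below uses Python's built-in `sqlite3` module against an in-memory database; the `transactions` table, its columns, and the sample rows are hypothetical, and a real warehouse would need its own connection and schema.

```python
import sqlite3


def find_duplicate_transactions(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (transaction_id, copies) for every ID that appears more than once."""
    return conn.execute(
        """
        SELECT transaction_id, COUNT(*) AS copies
        FROM transactions
        GROUP BY transaction_id
        HAVING COUNT(*) > 1
        ORDER BY transaction_id
        """
    ).fetchall()


# Build a small, made-up table to run the check against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (transaction_id TEXT, amount_usd REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("t1", 20.0), ("t2", 35.5), ("t2", 35.5), ("t3", 12.0)],
)

duplicates = find_duplicate_transactions(conn)
print(duplicates)  # [('t2', 2)]
```

A check like this covers one table and one failure mode; the value of an observability platform is running many such checks continuously without writing each by hand.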

On a higher level, they help organizations prevent data quality issues or at least mitigate their impact on the business.

6 ways that data observability can improve your organization's decision-making

Given that companies often blame bad decisions (or lack of decisions) on bad data, investing in data observability can pay dividends. It's not just data and engineering teams that benefit. Here are some specific ways it impacts a company's strategic decision-making across the board:

1. Data is fresh and complete

Organizations should feel confident that they are acting on up-to-date and complete data. They build trust through fuller, more accurate insight into what's happening within the org and in the market at large.

2. Executives rely on the data

When data is trustworthy and reliable, executives will actually use it to inform their decision-making, rather than relying on gut instinct. This is especially true for executives in more traditional industries. This can lead to more evidence-based decision-making, resulting in better outcomes for the company.

3. Engineering productivity improves

Data observability helps teams catch outages and other data-related issues before they spread. Data scientists and software engineers can focus on shipping new products and running new experiments, rather than being bogged down by data-related problems.

4. Marketers have a more accurate understanding of ROI on ad spend

With data observability in place, marketing teams get a clearer sense of how ad spend is performing. Over time they hone their ability to allocate resources properly and optimize their campaigns.

5. Finance teams get more accurate revenue projections

Finance teams can use data observability to make more accurate revenue projections, which can help to inform investment decisions and other financial planning.

6. Data scientists run sophisticated, accurate machine learning models

Companies can trust the data going into models and feeding automated decisions. As business decision-making increasingly moves from humans looking at dashboards to machine learning systems, the stakes for data quality increase.

Suppose that an e-commerce company uses an AI chatbot for customer support. A customer asks for a refund, and the chatbot checks its records and issues the refund. If those records are stale, they may not show that the customer has already received the refund. The company has now double-paid and incurred a financial loss.
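One way to guard an automated decision like this is to check how old the record is before acting on it. The following Python sketch is purely illustrative: the `RefundRecord` type, the `decide_refund` function, and the five-minute freshness limit are assumptions, not part of any real chatbot or payments API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RefundRecord:
    """Hypothetical view of an order as the chatbot sees it."""
    order_id: str
    refunded: bool
    synced_at: datetime  # when this record was last refreshed from the source system


def decide_refund(
    record: RefundRecord,
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),  # assumed freshness limit
) -> str:
    """Return "refund", "deny", or "escalate" for a refund request."""
    if now - record.synced_at > max_age:
        # The record may be missing a refund that was already issued,
        # so hand off to a human instead of paying automatically.
        return "escalate"
    if record.refunded:
        return "deny"
    return "refund"
```

With a check like this, stale data leads to a slower answer rather than a double payment.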

A final word

Data observability is, put simply, a smart business decision. When data is accurate, up-to-date, and available to those who need it, organizations make better decisions. Over time, a series of smart decisions turns into a competitive edge and long-term, decisive victories over your opponents in the market.

Resource                Monthly cost ($)  Number of resources  Time (months)  Total cost ($)
Software/Data engineer  $15,000           3                    12             $540,000
Data analyst            $12,000           2                    6              $144,000
Business analyst        $10,000           1                    3              $30,000
Data/product manager    $20,000           2                    6              $240,000
Total cost                                                                    $954,000

Role: Data engineers
Goals: Overall data flow. Data is fresh and operating at full volume. Jobs are always running, so data outages don't impact downstream systems.
Common needs:
  • Freshness + volume monitoring
  • Schema change detection
  • Lineage monitoring

Role: Data scientists
Goals: Specific datasets in great detail. Looking for outliers, duplication, and other (sometimes subtle) issues that could affect their analysis or machine learning models.
Common needs:
  • Freshness monitoring
  • Completeness monitoring
  • Duplicate detection
  • Outlier detection
  • Distribution shift detection
  • Dimensional slicing and dicing

Role: Analytics engineers
Goals: Rapidly testing the changes they’re making within the data model. Move fast and not break things, without spending hours writing tons of pipeline tests.
Common needs:
  • Lineage monitoring
  • ETL blue/green testing

Role: Business intelligence analysts
Goals: The business impact of data. Understand where they should spend their time digging in, and when they have a red herring caused by a data pipeline problem.
Common needs:
  • Integration with analytics tools
  • Anomaly detection
  • Custom business metrics
  • Dimensional slicing and dicing

Role: Other stakeholders
Goals: Data reliability. Customers and stakeholders don’t want data issues to bog them down, delay deadlines, or provide inaccurate information.
Common needs:
  • Integration with analytics tools
  • Reporting and insights
