A Cost/Benefit Analysis of Observability

Written by Ian Tinney

September 21, 2021

Businesses need to rationalise the rapid rollouts caused by the pandemic, but how can they do this while keeping costs under control?


Accelerated digital transformation has driven investment in infrastructure, developers, and updated tooling, enabling businesses to deploy applications faster across diverse environments. This has generated a massive increase in the volume of data flowing into logging analytics systems, such as security information and event management (SIEM) platforms and application performance management (APM) tools. Businesses now have to deal with that growth.

In the main, this accelerated digitalisation hasn’t been orderly or well planned. Companies are now grappling with decisions made quickly to cope with the pandemic, from cost overruns related to container deployments and logging analytics to the difficulty of understanding dynamic infrastructure landscapes. They’re having to absorb complexity across the technology landscape and re-establish and reinforce their governance and security measures in a bid to bring the rapid expansion under control. Yet further change may be met with resistance unless you can justify it.

Why complexity is increasing

Modern applications are composed of hundreds or thousands of services that are developed and tested independently by teams that may not communicate with each other. Software deployments may be automated, occurring several times a day across production environments. Each service may have its own database and data model, independently managed, which increases complexity further. Add in short-lived containers and dynamic scaling, and it’s easy to understand why the only time companies can test their applications is when they’re deployed in front of customers.

Traditionally, infrastructure and operations teams would deploy monitoring to gain visibility into their environments. The challenge is that monitoring hasn’t kept pace with this complex environment for three reasons:

  1. Exorbitant costs force teams to compromise on what they’re monitoring. Forced into decisions about which logs, metrics, and traces to keep in order to stay within budget, teams simply can’t store everything they need to observe their environment.
  2. Pre-built dashboards and alerts don’t reflect today’s infrastructure reality. Systems scale dynamically, and DevOps teams may deploy code across thousands of containers dozens of times each day. The static views offered by traditional monitoring systems don’t reflect this reality.
  3. Monitoring is a point solution targeting a single application or service. A failure in one service cascades to others, and unravelling those errors is well beyond the scope of monitoring applications.

Why monitoring alone isn’t enough

ITOps and SecOps teams must evolve from monitoring into Observability. Observability is the characteristic of software and systems allowing them to be “seen” and to answer questions about their behaviour. Unlike monitoring, which relies on static views of static resources, observable systems invite investigation by unlocking data from siloed log analytics applications.

Implementing Observability requires collecting and integrating data from complex systems, which is where the observability pipeline comes in. An observability pipeline decouples the sources of data from their destinations. This decoupling allows teams to enrich, redact, reduce and route data to the right place for the right audience. The observability pipeline gets you past what data to send and lets you focus on what you want to do with it.
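As a rough illustration of the enrich, redact, reduce and route stages described above, here is a minimal Python sketch. It is not how any particular product implements its pipeline: the event fields (`source`, `level`, `message`), the `env` tag, the destination names `siem` and `apm`, and every function name are assumptions made up for this example.

```python
import re
from typing import Dict, List, Optional

Event = Dict[str, str]

# Hypothetical pattern for spotting email addresses in log messages.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def enrich(event: Event) -> Event:
    """Add context the source did not send (assumed static tag here)."""
    return {**event, "env": "prod"}


def redact(event: Event) -> Event:
    """Mask email addresses before the event leaves the pipeline."""
    return {**event, "message": EMAIL.sub("[REDACTED]", event["message"])}


def drop_noise(event: Event) -> Optional[Event]:
    """Reduce volume by discarding DEBUG events (an assumed policy)."""
    return None if event.get("level") == "DEBUG" else event


def route(event: Event) -> str:
    """Pick a destination: auth events to the SIEM, the rest to APM."""
    return "siem" if event.get("source") == "auth" else "apm"


def run_pipeline(events: List[Event]) -> Dict[str, List[Event]]:
    """Pass each event through every stage and group it by destination."""
    destinations: Dict[str, List[Event]] = {"siem": [], "apm": []}
    for event in events:
        kept = drop_noise(redact(enrich(event)))
        if kept is None:
            continue  # event was reduced away
        destinations[route(kept)].append(kept)
    return destinations
```

Because the sources only ever hand events to `run_pipeline`, a destination can be added or swapped by changing `route` alone, which is the decoupling the paragraph above describes.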

Reaping the benefits

An observability pipeline makes debugging faster by allowing you to ask “what if” questions of the environment rather than relying on the pre-calculated views prevalent in monitoring solutions. Faster debugging and root cause analysis mean fewer customers experience errors in production, which in turn drives up sales.

Another benefit of an observability pipeline is rationalising infrastructure costs. Often, the team deploying infrastructure isn’t the team paying for it, resulting in over-provisioned infrastructure. Collecting performance data, even for transient infrastructure like containers, gives ITOps teams visibility into how many resources are actually being consumed and where optimisations are possible.

For example, reducing data before sending it on to its destination can significantly reduce the total cost of ownership (TCO) of your analysis tool. Even after taking into account the investment in the pipeline itself, the Cribl Logstream observability pipeline will generate savings of at least 10% on Splunk costs. Minimising data retention on expensive storage can also result in huge savings. We’ve seen one company save 93% of their storage costs in this way.
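The arithmetic behind a net saving of this kind can be sketched in a few lines of Python. This assumes, purely for illustration, that the analysis tool is billed per gigabyte ingested and that the pipeline has a flat daily cost. The function name, its parameters and the figures in the usage note are hypothetical and are not taken from any vendor's pricing.

```python
def net_saving_fraction(
    daily_gb: float,
    cost_per_gb: float,
    reduction: float,
    pipeline_cost_per_day: float,
) -> float:
    """Return the net saving as a fraction of the original daily spend.

    reduction is the share of data removed before ingestion (0.0 to 1.0).
    All inputs are assumptions supplied by the caller.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    before = daily_gb * cost_per_gb
    if before <= 0:
        raise ValueError("original daily spend must be positive")
    # Spend after reduction includes the cost of running the pipeline.
    after = daily_gb * (1.0 - reduction) * cost_per_gb + pipeline_cost_per_day
    return (before - after) / before
```

With made-up inputs of 1,000 GB a day at 1 currency unit per GB, a 30% reduction and a pipeline costing 100 units a day, the call `net_saving_fraction(1000, 1.0, 0.3, 100)` works out to a net saving of about 20% of the original spend. A negative result would mean the pipeline costs more than it saves.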

In fact, businesses implementing an end-to-end observability pipeline can lower infrastructure costs by 30% and resolve issues four times faster than competitors, improving customer satisfaction and increasing customer spend by 15%.

To understand how an observability pipeline could rationalise your infrastructure and data analysis and the probable cost savings, please contact us via email at info@4datasolutions.com or on 0330 128 9180 for a one-to-one consultation.
