Routing your data can allow you to be much more dexterous and ambitious in the cloud. Ian Tinney looks at why the way you route your data is so important.

Routing your data to where you need it has always been important, but it’s now becoming critical as cloud deployments become more complex and more fluid. Businesses need to be able to take advantage of new technologies but to do so, they must be able to access and move their data with ease. That means they need to avoid becoming locked into their cloud infrastructure provider and/or their data analytics platform provider.

Unlocking data

In the cloud, vendor lock-in is a real problem. It’s a term used to describe the carrot-and-stick behaviour of companies that encourage you to keep your data within their domain, sometimes by penalising you for getting your data out and sometimes by keeping you with them through a rich set of features.

What tends to happen is that either the supplier increases its prices over time or your usage simply grows, until the discomfort finally forces you to move your data elsewhere. By this point, you may well have missed out on more innovative technologies or have curbed your ambitions rather than rolling out a hybrid or multi-cloud deployment.

In ‘The state of public cloud in the enterprise 2020’ by Contino Research, 63% of those questioned were ‘somewhat’ or ‘very afraid’ of getting locked in, and this is driving increased interest in multi-cloud deployments (where more than one cloud provider is used), with 20% having adopted this model.

Avoiding vendor lock-in is, therefore, a must, but it requires you to think about routing your data before it reaches the destination, which is where the costs are incurred. Ideally, you want to leverage the agents and technologies that are already sending data, but intervene to route that data to multiple destinations as you wish. This avoids the data egress costs charged in the public cloud. If you do wish to switch to a different cloud provider, you will want to migrate one service at a time while keeping it running in its current location to ensure continuity and avoid disruption.

Similarly, if you’re using an expensive data analysis tool and want to move some or all of your data to a different tool, such as Splunk, Elastic, Snowflake or Kafka, you’ll also need to generate and route a copy of your data.

Regular routing

Of course, you also need to route data to specific locations on a day-to-day basis for business purposes, be that to a real-time system, a batch analytics store or object storage such as S3 or Azure Blob Storage. However, this can be complex and costly to achieve if it requires additional agents to be set up in the network. No one wants another agent sending an identical copy of data to a different platform. The best way to overcome this is to have an Event Stream Processor (ESP) tool with routing capabilities that can utilise existing agents and sources of data, and then route the data on their behalf.
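To make the idea concrete, here is a minimal sketch of filter-based routing in Python. It is not how LogStream or any particular ESP tool is implemented. The destination names, event fields and filters are all hypothetical, and in-memory lists stand in for real systems. The point is simply that one incoming stream from an existing agent can be copied to several destinations without a second agent.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]

# Hypothetical destinations: in-memory lists standing in for real systems.
destinations: Dict[str, List[Event]] = {
    "realtime": [],
    "batch_analytics": [],
    "object_storage": [],
}

# Each route pairs a filter with a destination name (all names are illustrative).
routes: List[Tuple[Callable[[Event], bool], str]] = [
    (lambda e: e.get("severity") == "error", "realtime"),
    (lambda e: e.get("source") == "web", "batch_analytics"),
    (lambda e: True, "object_storage"),  # keep a copy of everything
]


def route(event: Event) -> List[str]:
    """Copy the event to every destination whose filter matches it."""
    matched = []
    for predicate, name in routes:
        if predicate(event):
            destinations[name].append(dict(event))  # copy, so destinations stay independent
            matched.append(name)
    return matched


if __name__ == "__main__":
    print(route({"source": "web", "severity": "error", "message": "checkout failed"}))
    print(route({"source": "db", "severity": "info", "message": "vacuum complete"}))
```

Because the routes are evaluated in one place, adding a destination means adding one entry to the list rather than deploying another agent.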

Being able to route data makes real commercial sense, especially when it comes to data analysis. Storage is expensive in this environment, so you only want data to sit in the analytics platform when you intend to analyse it.

Sometimes you’ll want the same subset of data to be sent to multiple locations, and these may have their own formatting requirements. Perhaps you want to move data to two data analysis platforms, one of which requires data in a different format (e.g., JSON).  In this instance, you’ll want to route the selected data to both platforms simultaneously, changing the format en route for each platform.
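As a rough illustration of changing the format en route, the Python sketch below sends the same event to two platforms, each with its own formatter. The platform names and the key=value format are assumptions made for the example. Only JSON is mentioned above, and real platforms will have their own requirements.

```python
import json
from typing import Any, Callable, Dict

Event = Dict[str, Any]


def to_json(event: Event) -> str:
    """Serialise the event as a JSON object."""
    return json.dumps(event, sort_keys=True)


def to_key_value(event: Event) -> str:
    """Serialise the event as space-separated key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in sorted(event.items()))


# Hypothetical platforms, each paired with the format it expects.
FORMATTERS: Dict[str, Callable[[Event], str]] = {
    "platform_a": to_key_value,
    "platform_b": to_json,
}


def fan_out(event: Event) -> Dict[str, str]:
    """Return one formatted copy of the event per destination platform."""
    return {name: formatter(event) for name, formatter in FORMATTERS.items()}


if __name__ == "__main__":
    for platform, payload in fan_out({"user": "alice", "status": 200}).items():
        print(platform, payload)
```

Keeping the formatter next to the destination means the source never needs to know which format each platform wants.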

Routing for compliance

Compliance mandates can require you to keep data for a set number of years, in which case you’ll need to be able to pull up and interrogate data upon request. The storage requirements for a modern analysis tool are at odds with the storage requirements for meeting compliance-driven, long-term data retention because the first requires fast, expensive storage while the second requires slow, inexpensive storage.

To solve this disparity, you’ll need to be able to move the data between both types of storage. For example, you could send all data to your expensive, fast analysis tool while sending the same or slightly reduced data to your slow, long-term storage solution to keep it for years. But what if an auditor then asks you to supply data from three years ago? Then you’ll need to be able to search the object store, retrieve the relevant data, and add metadata (e.g., to divert it to a specific index) before routing it to your preferred data analysis tool.
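The retrieval step might look something like the following Python sketch. It assumes archived events carry an ISO 8601 timestamp and that the analysis tool reads a field called index. The field names, the sample events and the index name are all invented for illustration, and a list stands in for the object store.

```python
from datetime import datetime, timezone
from typing import Dict, List

Event = Dict[str, str]

# Illustrative sample data standing in for events held in long-term object storage.
ARCHIVE: List[Event] = [
    {"timestamp": "2020-11-30T23:59:00+00:00", "message": "backup complete"},
    {"timestamp": "2021-03-01T10:00:00+00:00", "message": "login ok"},
    {"timestamp": "2021-06-15T08:30:00+00:00", "message": "password changed"},
    {"timestamp": "2022-01-02T12:00:00+00:00", "message": "login failed"},
]


def replay(archive: List[Event], start: datetime, end: datetime, index: str) -> List[Event]:
    """Select events with start <= timestamp < end and tag each copy with a target index."""
    selected = []
    for event in archive:
        when = datetime.fromisoformat(event["timestamp"])
        if start <= when < end:
            # Add the metadata to a copy so the archived record is left untouched.
            selected.append({**event, "index": index})
    return selected


if __name__ == "__main__":
    audit_events = replay(
        ARCHIVE,
        start=datetime(2021, 1, 1, tzinfo=timezone.utc),
        end=datetime(2022, 1, 1, tzinfo=timezone.utc),
        index="audit_2021",  # hypothetical index name
    )
    for event in audit_events:
        print(event)
```

Only the events inside the requested window are sent on to the fast, expensive platform, and the rest stay in cheap storage.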

Freedom and flexibility

All of these tasks can be performed using an Event Stream Processor (ESP) tool such as Cribl’s LogStream. This solves the problem of vendor lock-in by allowing you to route your data to where you want, possibly with some transformation, without the need for additional agents. It solves the problems of platform migration and storage disparity by allowing you to create copies of your data which can be sent to multiple locations. And it enables you to meet your compliance obligations by archiving and then retrieving data for analysis.

Whatever your reasons for needing to route data to different destinations, being able to communicate with all of the leading data analysis products and technologies means that when things change, you can use LogStream to quickly take advantage of new and different technologies.

To explore how you can use LogStream to route your data, contact us today for a one-to-one consultation.