
Make better data accessible to more people with Celebrus and Teradata VantageCloud Lake

Author: Courtney Schwartzenburg

Data lakes were created to fill the gaps left by databases and warehouses. Instead of processed, structured data that’s been filtered for a specific purpose, data lakes can hold vast amounts of raw data that hasn’t been defined and is still in its original format.

Because data lakes can hold so much information, it’s critical to optimize storage based on how the data is used. To help make storing the data more affordable, storage and pricing come in tiers. Hot storage holds data you need to access frequently and quickly, which makes it the most expensive tier. Warm storage holds data you might need access to, say, for a monthly report. You still need the data, but it isn’t as time sensitive. Cold storage holds data you use infrequently, or don’t use at all but don’t want to delete, which makes it the cheapest tier.
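For readers who like to see the idea in code, here is a minimal Python sketch of how a tiering rule like this could work. The thresholds (7 and 31 days) and the function name `pick_tier` are assumptions made up for illustration; they are not Celebrus or Teradata settings, and real platforms apply their own policies.

```python
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
HOT_MAX_AGE_DAYS = 7     # accessed within the last week
WARM_MAX_AGE_DAYS = 31   # accessed within roughly the last month


def pick_tier(last_accessed: date, today: date) -> str:
    """Return 'hot', 'warm', or 'cold' based on days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm"
    return "cold"


today = date(2024, 3, 31)
print(pick_tier(date(2024, 3, 30), today))  # data touched yesterday
print(pick_tier(date(2024, 3, 5), today))   # data touched a few weeks ago
print(pick_tier(date(2023, 1, 15), today))  # data untouched for over a year
```

The point of the sketch is simply that the tier follows from how recently (and how often) the data is needed, not from what the data is.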

Why is it important to have tiers in data storage? 

Storing data in data lakes is incredibly expensive. A tiered system is all about cost optimization: it makes sure the fastest (and therefore most expensive) storage is reserved for the data you use most.

On top of the rising costs – thanks, inflation! – data lakes are notoriously hard to manage. There’s so much information going into data lakes, and so many queries and tasks that can be performed, that even skilled engineers struggle to manage them.

Why do some like it hot? 

Hot data is information you need on the fly. You use it daily or weekly, and when you need it, there’s no time to wait. Imagine you work for a retail clothing store, and you’re responsible for monitoring customers’ online shopping carts. You need an instant notification if someone has abandoned their cart so you can trigger a response. It’s vital to the company that you get the most up-to-date information, and you need it in an instant or you’re going to lose the sale. If your data is days (or even hours) old, you don’t have the information you need to do your job. Data can go from hot to cold real quick.

If you have the data in live-time, you can send an email to your customer offering a coupon, or simply reminding them they have items left in their cart. Providing a sense of urgency in the moment is crucial to the sale. Being present with your customer on their journey is essential to the customer experience – and you need the most up-to-date information to accomplish that.
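To make the abandoned-cart scenario concrete, here is a small Python sketch of the check itself. Everything in it is a hypothetical stand-in: the 30-minute cutoff, the `abandoned_carts` function, and the sample customers are invented for illustration and do not represent how Celebrus captures or delivers data.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: a cart with no activity for 30 minutes counts as abandoned.
ABANDON_AFTER = timedelta(minutes=30)


def abandoned_carts(last_activity: dict[str, datetime], now: datetime) -> list[str]:
    """Return customer IDs whose carts have been idle longer than the cutoff."""
    return sorted(
        customer
        for customer, seen in last_activity.items()
        if now - seen > ABANDON_AFTER
    )


now = datetime(2024, 3, 31, 12, 0)
carts = {
    "alice": datetime(2024, 3, 31, 11, 55),  # active 5 minutes ago
    "bob": datetime(2024, 3, 31, 11, 10),    # idle for 50 minutes
}
print(abandoned_carts(carts, now))  # customers who would get a reminder email
```

Notice that the check only works if `last_activity` is fresh. Feed it data that is two days old and every cart looks abandoned, long after the moment to act has passed.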

But wait, won’t most data capture solutions provide you with the data you need in real-time? Nope. Only Celebrus can provide comprehensive data as it’s happening in live-time. If you need to know when the customer has abandoned their cart, but your data capture solution can't get that to you for two days, what can you do? Nothing. You’ve lost the sale. It’s hard to stay on top of trends if your data doesn’t arrive until the trend is over. 

Then why do I need warm and cold data storage? 

Warm data is perfect for training machine learning models, which often require large data sets. Say you want to predict which customers are going to close their bank accounts in the next seven days. You can do that by understanding their behaviors on web and mobile devices, if you’re able to track them. With that information, you can take corrective action, like sending incentives or coupons, when you know you’re about to lose a customer. Of course, this only works if you have the best data.

Cold tiers are basically an archive. It’s not something you need to access often, but it’s not something you want to delete either. It's data that needs to be kept around for regulatory reasons or to reference once in a blue moon. You don’t want to pay top dollar for data you might only need to access once a year. 

So how do data lakes help you get better data to more people?  

Well, not all of them do.  

“Data is only as valuable as its ability to be synthesized for actionable, real-world insights that drive better outcomes,” said Hillary Ashton, Chief Product Officer at Teradata. 

Let’s say you work for a company that sells T-shirts. You’re trying to create a new product and you need to know how many blue shirts you sold yesterday. If you’re using a data capture solution like Celebrus, paired with Teradata VantageCloud Lake, you can. Together, Celebrus live-time data and Teradata’s new ClearScape Analytics offer the most in-database analytic functions anywhere in the market, along with AI and ModelOps. That makes it possible to go into Teradata VantageCloud Lake, ask specific and relevant questions, and get the answers you need in live-time.

Now let’s pretend someone from marketing is trying to create a campaign around the new product at the same company. Marketers can go into the same system and get valuable marketing signals just as easily – eliminating the need for shadow IT systems. 

Here’s the tricky part: your data capture solution and data lake must be compatible. Some data capture solutions can’t keep up with new technology. For example, not every data capture solution will be able to automatically pair with Teradata’s new VantageCloud Lake, but forward-thinking data capture solutions, like Celebrus, can.

Putting it all together – literally  

First, find the best live-time data available, because your data lake is only as good as the data that’s been captured. Make sure your data is as complete and up-to-date as possible. Celebrus offers the only first-party, live-time, complete data and identity solution available. It’s able to capture and segment data as it’s happening to deliver the most accurate data into Teradata VantageCloud Lake in milliseconds. Then, VantageCloud Lake can use ClearScape Analytics to leverage that data to meet all the analytic demands of any organization.

Unlike other data capture solutions, Celebrus collects high-volume data in live-time and acts on it in a way that’s impossible with other systems. Teradata then takes that information, organizes and analyzes it, and drives insight to create thousands of use cases from the data: insight that would otherwise be unavailable in time to be useful, if at all.