Unifying streaming and stored data

The next frontier for data processing is a new platform capable of delivering insights, actions, and value the instant data is born.

IT spending is rebounding and is expected to regain 2019 levels by next year, with digitization initiatives accelerated in the wake of COVID. Spending on enterprise software (including database, analytics, and business intelligence) will grow the fastest of all, Gartner says.

Deriving value from data, whether from insights or transactions, is at the root of improving business outcomes. That perennial importance is why the database and analytics market is valued at nearly $200 billion.

Digitization has added a new dimension to the pursuit of value from data: real time. A Forrester Consulting report found that more than 80% of executives believe in the need for real-time decision-making based on instantaneous insights into events and market conditions.

And yet… there is a huge gap between desire and ability. More than two-thirds of the executives Forrester spoke to said their organizations were not able to obtain real-time, data-driven insights and actions.

A deluge of data

Digitization is producing a massive amount of real-time data. It’s pouring in from servers, devices, sensors, and the Internet of Things, so much so that by some estimates more data will be generated in the next three years than in the last 30.

All new data is born in real time. In that moment it contains unique value relating to what just happened. However, that value is perishable: as time passes, the data loses its time-based relevance.

Executives want to find value by leveraging real-time data, but most are failing due to fresh data overload. A significant majority of execs (70%) surveyed for the Dell Technologies 2020 Digital Transformation Index said their organizations are creating more data than they can analyze or understand.

With this deluge of real-time data comes a new macro challenge: a new type of data silo. Real-time data requires different processing technologies than stored data because the two types of data differ in nature:

The unique value of real-time data perishes within moments.
Real-time data tends to be an atomic payload without deeper context.
The informational values are different: real-time data describes what just happened, while stored data describes history.

In other words, while real-time data contains time-critical information about an event that just occurred, it lacks rich context that can be found in records of stored data.

What good is it to know that a specific customer just viewed a retail item online if that event cannot be combined instantaneously with the context of that unique customer’s profile and history? When a financial market transaction just occurred, how can its financial risk be profiled without combining it with the performance history of those involved in the transaction? When event data from a manufacturing sensor shows an aberrant blip, how can the need for preventive action be assessed without knowing the recent maintenance history?
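The retail case above can be sketched in a few lines. This is only an illustration, not any vendor's API: the ViewEvent and CustomerProfile types, the PROFILES dictionary, and the enrich functions are all hypothetical names, and a plain Python dict stands in for whatever in-memory store holds the customer context.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ViewEvent:
    """A real-time event: an atomic payload with little context of its own."""
    customer_id: str
    item_id: str


@dataclass(frozen=True)
class CustomerProfile:
    """Stored context: what is already known about the customer."""
    customer_id: str
    tier: str
    recent_purchases: Tuple[str, ...]


# Hypothetical stored data, held in memory so each lookup avoids a
# round trip to an external system.
PROFILES: Dict[str, CustomerProfile] = {
    "c-1": CustomerProfile("c-1", "gold", ("tent", "stove")),
}


def enrich(event: ViewEvent, profiles: Dict[str, CustomerProfile]) -> dict:
    """Combine one fresh event with the stored profile, if there is one."""
    profile = profiles.get(event.customer_id)
    return {
        "customer_id": event.customer_id,
        "item_id": event.item_id,
        "tier": profile.tier if profile else None,
        "recent_purchases": list(profile.recent_purchases) if profile else [],
    }


def enrich_stream(
    events: Iterable[ViewEvent], profiles: Dict[str, CustomerProfile]
) -> Iterator[dict]:
    """Enrich events one at a time, as they arrive."""
    for event in events:
        yield enrich(event, profiles)
```

Feeding a ViewEvent for customer "c-1" through enrich_stream yields a record that carries both the item just viewed and the customer's tier and purchase history, which is the combination a downstream decision would need.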

The world of data has permanently changed. Real-time data is now the dominant force, while rich contextual stores remain. This shift presents powerful potential for creating valuable business outcomes, if it can be properly harnessed.

Not the database way

Databases sit between applications and historical data. They excel at performing transactions and queries on that stored data, but only for traditional applications. Both the functionality and the performance of databases were designed to address a previous era of expectations. Digitization has introduced a step-function change in performance requirements: microseconds now matter, and that is out of reach for database architectures.

Additionally, databases were not designed to process real-time data that originated at Point A and is being transferred to Point B. They must therefore plug into engines that can perform that type of processing. These interfaces produce significant latency, which is the sworn enemy of real-time data, the value of which perishes quickly with time. Even when multiple systems can be cobbled together, the result introduces cost and architectural complexity that must be supported and maintained.

To unify the processing of real-time and stored data, a new category of data processing platform is needed. This platform must work with existing databases and support applications that use both types of data.

This multi-function platform includes a streaming engine for ingestion, transformation, distribution, and synchronization of data. To meet the ultra-low latency requirements for data processing, this platform is necessarily based on in-memory technology. And to meet the dual requirements of scale and resilience, it must be a distributed architecture. With this combination, the platform can deliver sub-millisecond responses while performing millions of complex transactions per second.
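To make the pairing of in-memory state and distribution more concrete, here is a minimal single-process sketch. It is an assumption-laden illustration rather than a description of any real product: Partition and PartitionedProcessor are hypothetical names, a Python list stands in for the members of a cluster, and CRC32 is just one simple, stable way to route a key to a partition.

```python
import zlib
from collections import defaultdict
from typing import DefaultDict, List


class Partition:
    """Holds the running state for its share of the keys, entirely in memory."""

    def __init__(self) -> None:
        self.totals: DefaultDict[str, float] = defaultdict(float)

    def apply(self, key: str, amount: float) -> float:
        # Update the state in place and return the new running total.
        self.totals[key] += amount
        return self.totals[key]


class PartitionedProcessor:
    """Routes each event to the partition that owns its key."""

    def __init__(self, partition_count: int) -> None:
        if partition_count < 1:
            raise ValueError("partition_count must be at least 1")
        self.partitions: List[Partition] = [
            Partition() for _ in range(partition_count)
        ]

    def partition_for(self, key: str) -> int:
        # A stable hash sends the same key to the same partition every time,
        # so all state for that key lives in exactly one place.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def process(self, key: str, amount: float) -> float:
        return self.partitions[self.partition_for(key)].apply(key, amount)
```

Because every event for a given key lands on the same partition, each update touches only local memory, and adding partitions spreads the keys across more workers. A real distributed system would also replicate each partition for resilience, which this sketch leaves out.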

A new data processing platform

We are now producing more fresh data than enterprises can process, and deriving value from it requires merging it with rich context from databases. It’s time to augment IT architectures to include a new data processing platform designed for the real-time world, one that can deliver insights and actions at the speed demanded by real-time digital operations to capture value at every moment.

Kelly Herrell is CEO of Hazelcast, the maker of a streaming and memory-first application platform for fast, stateful, data-intensive workloads.

—

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
