Real-time data for the real world

Close the time gap between analysis and action to bring about the next wave of improvements in efficiency and reliability—and magic.

By Timothy McGovern
November 1, 2016
Wind farm. (source: Pixabay)

The number of devices that translate real-world events into a constant stream of zeros and ones continues to expand faster than ever. To account for this huge influx of new data, the technology used to capture, process, and analyze this information also needs to keep up. Embracing real time is the only option for companies seeking to elevate both the speed and the quality of information their business is analyzing. This article explores the importance of real time and the future of data analysis made possible by real-time technology in the form of both predictive analytics and machine learning, with a special focus on the world of alternative energy and the Industrial Internet of Things (IIoT).

The world of renewable energy is a perfect example of the possibilities—and challenges—of the 21st-century economy. The promise of renewables is not just staunching the flow of environmental costs of the developed world’s lifestyle, but spreading economic growth globally without incurring these costs in the first place. The goal of transforming an industry to be more efficient and responsive is thrown into high relief in the case of energy. However, across all industries, the ability to intake, understand, and respond to data offers the possibility of radical change for the better, specifically in those very realms of efficiency and responsiveness.

Up to now, the main benefits that data has brought to industries have been driven by analytical efforts separate from operational data pipelines. Business analysts examine data, find trends, propose tests, and deliver recommendations, which are then implemented by the operations side. But to continue gaining returns from data analysis, that analysis has to operate at finer and finer timescales. In a transportation firm, for example, data analytics may be able to recommend new truck routes given traffic conditions. In order to implement those recommendations, however, the traffic conditions need to be recognized, and that analysis propagated to drivers, in real time. As “drivers” comes to mean “other computers,” this need remains central.

The machine learning techniques that data scientists have developed over the past decades have immense power to categorize, predict, and detect anomalies. The first step to speeding up the time-to-delivered-insight is real-time data access and insights dashboards. The next step is to adapt machine learning techniques to real-time data—this means not only supplying incoming data for analysis, but also developing responsive algorithms that update on the basis of new data.
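
As a rough illustration of an algorithm that updates on the basis of new data, the sketch below keeps a running mean and variance (Welford's method) and scores each incoming reading against what it has seen so far. It is not drawn from any system described in this article: the names RunningStats and detect_anomalies, the threshold of 3 standard deviations, and the warm-up length are all assumptions chosen for the example.

```python
import math


class RunningStats:
    """Mean and sample variance maintained incrementally (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the statistics in O(1) time."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        """Sample standard deviation; 0.0 until two observations exist."""
        if self.n < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.n - 1))

    def zscore(self, x):
        """Distance of x from the current mean, in standard deviations."""
        s = self.std
        return 0.0 if s == 0 else (x - self.mean) / s


def detect_anomalies(stream, threshold=3.0, warmup=5):
    """Score each reading against the history so far, then learn from it.

    threshold and warmup are illustrative values, not recommendations.
    """
    stats = RunningStats()
    flagged = []
    for i, x in enumerate(stream):
        # Only score once enough history has accumulated.
        if stats.n >= warmup and abs(stats.zscore(x)) > threshold:
            flagged.append((i, x))
        # The model updates on every reading, so it tracks current conditions.
        stats.update(x)
    return flagged


readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 25.0, 10.1]
anomalies = detect_anomalies(readings)
```

With the sample readings above, only the jump to 25.0 at index 6 is flagged. Because the statistics are updated one reading at a time, nothing has to be recomputed over the full history when new data arrives.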

Real-time analytics in the renewable energy industry

As mentioned, the renewable energy industry offers a great example of using real-time analytics to deliver efficiency and responsiveness to everyday industry operations. As we replace fossil fuels with renewables (solar, wind, and hydro), and seek to use less energy overall, it is clear that the interplay of numerous machines, human analysts, and timescales necessitates real-time data pipelines. For much of the 20th century, the grid’s electrical needs could be satisfied by fossil-fuel-burning power plants. While daily and seasonal fluctuations could be predicted at a coarse level, plants would have to overproduce to ensure sufficient power to handle unexpected spikes in demand. Brownouts and surges were distressingly common, and of course, more fuel was burned than was needed.

As renewables joined the grid, supply became much more complicated. While it takes a half hour or so to spin up a natural-gas turbine, solar panels and wind turbines depend on the weather; hydroelectric plants can be brought online in minutes, but need to pay attention to the long-term seasonal cycles of rain and snowfall, irrigation needs, and even fish migration. Meanwhile, the daily cycles of human energy needs are divorced from (and sometimes in conflict with) natural energy supplies: home air conditioners switching on in summer evenings as families return home, just as the sun sets and the wind dies down.

Real-time analysis—and action on—data at the level of the individual energy source and the individual energy consumer is necessary to balance all these factors. Tracking supply and demand at this granular level enables the grid to balance supply and load by the second—a responsiveness that shows up in the consumer experience as reliability, with reduced or eliminated power outages and spikes.
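
To make the idea of balancing supply and load at a granular level concrete, here is a deliberately simplified sketch. It assumes a hypothetical one-second snapshot of per-source supply and per-consumer demand, plus a single fast-responding reserve (for instance, a hydroelectric plant) with a fixed capacity. The names Reading, imbalance_kw, and dispatch, and all of the figures, are invented for illustration; a real grid operator's logic is far more involved.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One snapshot of the grid (hypothetical structure for this sketch)."""
    timestamp: int    # seconds since some reference time
    supply_kw: dict   # energy source name -> kW currently produced
    demand_kw: dict   # consumer name -> kW currently drawn


def imbalance_kw(reading):
    """Positive means surplus supply; negative means a shortfall."""
    return sum(reading.supply_kw.values()) - sum(reading.demand_kw.values())


def dispatch(reading, reserve_capacity_kw):
    """Decide how much reserve to bring online for this snapshot.

    Returns (reserve_kw_used, unmet_kw). Both are 0.0 when supply
    already covers demand.
    """
    shortfall = -imbalance_kw(reading)
    if shortfall <= 0:
        return 0.0, 0.0
    used = min(shortfall, reserve_capacity_kw)
    return used, shortfall - used


snapshot = Reading(
    timestamp=0,
    supply_kw={"wind": 300.0, "solar": 120.0},
    demand_kw={"homes": 380.0, "factory": 100.0},
)
reserve_used, unmet = dispatch(snapshot, reserve_capacity_kw=50.0)
```

In this example, supply falls 60 kW short of demand, the reserve covers 50 kW, and 10 kW remains unmet. Running a decision like this on every incoming reading, rather than on hourly aggregates, is what second-by-second balancing amounts to in practice.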

On the production side, data analysis provides reliability, too. For example, one simulated Internet of Things application that tracks the status of nearly 200,000 wind turbines around the world shows how real-time data can empower businesses in the energy sector. This simulation provides not only real-time status dashboards but also, crucially, predictive analytics that give turbine operators insight into when machines are in danger of failing. This enables not only preventive maintenance but also predictive maintenance: servicing machines before they fail, and dispatching both personnel and material to where they’ll be needed.
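
One simple way to picture predictive maintenance is trend extrapolation: fit a line to a turbine's recent sensor readings and estimate when it would cross a level associated with failure. The sketch below does this with an ordinary least-squares fit over a sliding window. It is an assumption-laden illustration, not the method used by the simulation mentioned above: the TrendMonitor class, the vibration-style readings, the failure level, and the window size are all hypothetical.

```python
from collections import deque


class TrendMonitor:
    """Extrapolates a linear trend in recent readings toward a failure level."""

    def __init__(self, failure_level, window=10):
        self.failure_level = failure_level
        # Keep only the most recent `window` (time, value) samples.
        self.samples = deque(maxlen=window)

    def add(self, t_hours, value):
        self.samples.append((t_hours, value))

    def _fit(self):
        """Least-squares slope and intercept, or None if the fit is undefined."""
        n = len(self.samples)
        if n < 2:
            return None
        mean_t = sum(t for t, _ in self.samples) / n
        mean_v = sum(v for _, v in self.samples) / n
        sxx = sum((t - mean_t) ** 2 for t, _ in self.samples)
        if sxx == 0:
            return None  # all samples share one timestamp
        sxy = sum((t - mean_t) * (v - mean_v) for t, v in self.samples)
        slope = sxy / sxx
        return slope, mean_v - slope * mean_t

    def hours_until_failure(self):
        """Estimated hours from the latest sample until the trend line
        reaches failure_level; None if the readings are not rising."""
        fit = self._fit()
        if fit is None:
            return None
        slope, intercept = fit
        if slope <= 0:
            return None
        latest_t = self.samples[-1][0]
        crossing_t = (self.failure_level - intercept) / slope
        return max(0.0, crossing_t - latest_t)


monitor = TrendMonitor(failure_level=7.0)
for hour in range(5):
    # Hypothetical readings rising by 0.5 units per hour from 2.0.
    monitor.add(hour, 2.0 + 0.5 * hour)
estimate = monitor.hours_until_failure()
```

Here the readings rise steadily, so the trend line reaches the failure level about six hours after the latest sample. An estimate like this is what lets an operator schedule a crew and parts ahead of time rather than reacting to a breakdown.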

Where does real time go next? From ML to AI

It’s almost hidden in that last paragraph, but bringing analytics into real time gets you a reach into the future, a bit of everyday magic. O’Reilly’s Mike Loukides and Ben Lorica recently discussed “What is Artificial Intelligence?” and pointed out that technologies creep from being “robots” or “AI” into our everyday lives as they’re implemented and become familiar. The bottom line to this transfer of technology from fantasy to reality is that it works seamlessly and in the same time frame as the real world.

Real-time data analytics and machine learning provide the keys to deploying more and more services that offer “magical” abilities. By synthesizing not only vast amounts of information, but vast amounts of current information, real-time analytics places current observations in the context of historical data, and either acts directly or augments the judgment of human decision-makers.

In today’s data-driven world, it’s easy to get caught up in the latest machine learning techniques or get deeper and deeper into feature engineering. In moments of sober realism, we spend time (commiserating over) data wrangling. But even this focus on the “dirty work” of data science misses the underlying data pipelines. The engineering work of getting data from sensors and users to databases and algorithms, and in turn getting analyses and decisions back to users (or actors like robots) is often the most important work in enabling data-driven action, especially in real-time contexts.

This post is a collaboration between MemSQL and O’Reilly. See our statement of editorial independence.

Post topics: Data science