Scaling Data Science for the Industrial Internet of Things

Few aspects of computing are as much in demand as data science. It underlies cybersecurity and spam prevention, determines how we are treated as consumers by everyone from news sites to financial institutions, and is now part of everyday reality through the Internet of Things (IoT). The IoT places higher demands on data science because of the new heights to which it takes the familiar “V’s” of big data (volume, velocity, and variety). A single device may stream multiple messages per second, and this data must either be processed locally by sophisticated processors at the site of the device or be transmitted over a network to a hub, where the data joins similar data that originates at dozens, hundreds, or many thousands of other devices. Conventional techniques for extracting and testing algorithms must get smarter to keep pace with the phenomena they’re tracking.

A report by ABI Research on ThingWorx Analytics predicts that “by 2020, businesses will spend nearly 26% of the entire IoT solution cost on technologies and services that store, integrate, visualize and analyze IoT data, nearly twice of what is spent today” (p. 2). Currently, a lot of potentially useful data is lost. Newer devices can capture this “dark data” and expose it to analytics.

This report discusses some of the techniques used at ThingWorx and two of its partners—Glassbeam and National Instruments—to automate and speed up analytics on IoT projects. These ...
