How companies around the world apply machine learning

Strata Data London will introduce technologies and techniques; showcase use cases; and highlight the importance of ethics, privacy, and security.

By Ben Lorica
April 3, 2018
A record of the training of the ATS, by Anthony Gross (source: Imperial War Museum on Wikimedia Commons)

The growing role of data and machine learning cuts across domains and industries. Companies continue to use data to improve decision-making (business intelligence and analytics) and for automation (machine learning and AI). At the Strata Data Conference in London, we’ve assembled a program that introduces technologies and techniques, showcases use cases across many industries, and highlights the importance of ethics, privacy, and security.

  • We are bringing back the Strata Business Summit, and this year, we have two days of executive briefings.

  • Data Science and Machine Learning sessions will cover tools, techniques, and case studies. This year, we have many sessions on managing and deploying models to production, and applications of deep learning in enterprise applications.

  • This year’s sessions on Data Engineering and Architecture showcase streaming and real-time applications, along with the data platforms used at several leading companies.

Privacy and security

The enforcement date for the General Data Protection Regulation (GDPR), May 25, 2018, is the day after the conference ends, and for the past few months companies have been scrambling to learn this new set of regulations. We have a tutorial and sessions to help companies learn how to comply with GDPR. Implementing data security and privacy remains foundational, but one of the key changes advanced by GDPR, “privacy-by-design,” will require companies to reassess how they design and architect products.

Unlocking popular data types: Text, temporal data, and graphs

The need for scale-out and streaming infrastructure can often be traced back to the importance of text, temporal data, and graphs. After one sets up infrastructure for collecting, storing, and querying these data types, the next step is to uncover interesting patterns or to use them to make predictions. Over the past year, companies have been turning to machine learning, in many cases to deep learning, when faced with large amounts of text, graphs, or temporal data. On the infrastructure side, we have sessions from members of some of the leading stream processing and storage communities.

Data platforms

How do some of the best companies architect and develop data platforms that help accelerate innovation and digital transformation? In a series of sessions, companies will share their internal platforms for business intelligence and machine learning. These are battle-tested platforms used in production, some at extremely large scale.

Many of these data platforms encourage collaboration and sharing of data, features, and models. In addition, since we’re very much in an empirical era for machine learning, tools for running and reproducing experiments, and for exploring the space of algorithms, are essential. Security and privacy become even more critical in light of the upcoming enforcement date for GDPR.

Machine learning: From data preparation and integration, to model deployment and management

Media articles on machine learning overemphasize algorithms and models. The reality is that model building is just one aspect of building products that rely on machine learning. In the vast majority of cases, machine learning applications require training data (“labeled data is the new, new oil”). To that end, the first step is to bring together existing data sources and, when appropriate, enrich them with other data sets. In most cases, data needs to be refined and prepared before it’s ready for analytic applications.

While prototypes are easy to cook up, building production-grade applications requires serious engineering. Companies have come to realize that deploying, monitoring, and managing models in production requires different skills and a different mindset, so much so that a new breed of workers, machine learning engineers, was recently forecast to be the fastest-growing emerging job in 2018.

Industry- and domain-specific tools, methods, and use cases

The best way to understand the potential of data technologies and machine learning is to see how companies are using them in real-world applications. One of the great things about Strata Data London is that one gets to hear case studies from many different industries and countries. You will learn how a variety of companies from across the globe have designed their data infrastructures; how they are incorporating machine learning; and how they’re approaching data privacy, security, and ethics. This year, we have keynotes and sessions on important topics in public policy, health care, manufacturing, and many other sectors of the economy. Here are some examples:

The Strata Data Conference is happening in London from May 21-24, 2018—best price ends April 6.
