Key considerations for building an AI platform

The right AI solution is the one that fits the skill set of the users and solves the highest-priority problems for the business.

By Simon Chan
October 25, 2017

The promises of AI are great, but taking the steps to build and implement AI within an organization is challenging. As companies learn to build intelligent products in real production environments, engineering teams face the complexity of the machine learning development process, from data sourcing and cleaning to feature engineering, modeling, training, deployment, and production infrastructure. Core to addressing these challenges is building an effective AI platform strategy, just as Facebook did with FBLearner Flow and Uber did with Michelangelo. This is often easier said than done: building a platform bears complexities of its own, particularly since the definition of “platform” is broad and imprecise. In this post, I’ll walk through the key considerations for building an AI platform that is right for your business, and how to avoid common pitfalls.

Who will use the platform?

Machine learning platforms are often casually advertised as designed for both software engineers and data scientists. Most of them, however, fail to serve both roles well at the same time. Even worse, they don’t offer enough value to either side to be useful for real work. My experience building PredictionIO and contributing to Salesforce Einstein AI has helped me understand two distinct groups of practitioners with divergent sets of requirements.

First, there is the data scientist group. These users usually have a math and statistics background, and are heavy users of R, Python’s scientific packages, and data visualization tools. This group is responsible for analyzing data and tuning models for the best accuracy, so they’re concerned about whether the platform supports a specific class of algorithms, works well with the data analysis tools they already know, and integrates with the visualization tools they use. They also want to know what feature engineering techniques it supports, whether they can bring in their own pre-trained models, and so on.
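To make the contrast concrete, here is a minimal sketch of the data-scientist workflow, assuming a hypothetical platform SDK (the PlatformClient class and register_model call are illustrative, not a real API; the scikit-learn usage is standard):

```python
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier

# The data scientist explores and tunes the model locally with familiar tools...
X, y = load_breast_cancer(return_X_y=True)
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05)
model.fit(X, y)
joblib.dump(model, "demo_model.joblib")

# ...and then hands the pre-trained artifact to the platform.
class PlatformClient:  # hypothetical SDK, for illustration only
    def register_model(self, name, path, framework):
        print(f"registered {name} ({framework}) from {path}")

PlatformClient().register_model(
    name="demo-scorer", path="demo_model.joblib", framework="sklearn"
)
```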

Then there is the software developer group. These users are generally familiar with building web and mobile applications, and are more concerned with whether the platform integrates with the desired data sources and whether the provided programming interfaces and built-in algorithms are sufficient to build certain applications. They want to know how to retrieve model results, whether model versioning is supported, whether there is a software design pattern to be followed, and so on.
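The developer view of the same platform looks quite different. The sketch below assumes a hypothetical REST interface (the host, route, and payload shape are illustrative only); what matters to this group is that predictions and model versions are easy to retrieve from application code:

```python
import requests

PLATFORM_URL = "https://ai-platform.example.com"  # hypothetical host

def get_score(record: dict, model: str, version: str) -> float:
    """Fetch a prediction for one record from a specific model version."""
    resp = requests.post(
        f"{PLATFORM_URL}/models/{model}/versions/{version}/predict",
        json={"instances": [record]},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["predictions"][0]

print(get_score({"tenure_months": 7, "plan": "pro"}, model="demo-scorer", version="3"))
```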

To implement an AI platform successfully for your organization, you must truly understand your users and do the heavy lifting in the right places. For example, many data scientists prefer to fine-tune every algorithm parameter manually, but if your users expect out-of-the-box regression that just works, then automated model tuning may be an essential capability of the platform. You want to help these users avoid the hassle of tuning regularization parameters so they can focus on their top priorities.
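Automated tuning of this kind does not require exotic machinery. The sketch below shows one way a platform could hide the regularization parameter behind an out-of-the-box regression call, using a plain scikit-learn grid search; it illustrates the idea rather than any particular product's implementation:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# The platform sweeps regularization strengths so the user never touches alpha.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": np.logspace(-3, 3, 13)},
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print("selected alpha:", search.best_params_["alpha"])
tuned_model = search.best_estimator_  # what the user actually gets back
```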

Are you solving for simplicity or flexibility?

You may wonder why it is so difficult to build a single AI platform that serves two or more personas well. Can’t we simply offer more functionality on the platform? The problem boils down to the tough choice between simplicity and flexibility. It is more of an art than a science to determine which parts should be abstracted away for simplicity and which parts should be made customizable for flexibility.

For some users, an ideal platform is one that abstracts away all data science details. Many software engineers happily utilize the power of Salesforce Einstein’s deep learning APIs to recognize objects in images and to classify sentiment in text without worrying about how the AI model is built, or even which algorithm is being used behind the scenes.
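For that kind of user, the interaction is just a request and a response. The snippet below is illustrative only; the endpoint and payload are hypothetical, not the actual Einstein API. The point is everything the caller does not have to know: no architecture, no training loop, no algorithm choice.

```python
import requests

resp = requests.post(
    "https://deep-learning-service.example.com/v1/sentiment",  # hypothetical endpoint
    headers={"Authorization": "Bearer <access-token>"},
    json={"document": "The onboarding flow was painless and support was great."},
    timeout=5,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"label": "positive", "probability": 0.97}
```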

For other users, an ideal platform is one that allows a maximum level of flexibility. Many software engineers like to build completely custom AI engines on Apache PredictionIO. They get their hands dirty modifying the Spark ML pipelines and enjoy the freedom to tailor-make and fine-tune every component, from data preparation and model selection to real-time serving logic, in order to create a unique AI use case.
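PredictionIO engine templates are written in Scala, but the flavor of that flexibility is easy to show with a PySpark pipeline, where every stage is an explicit, swappable choice (this is a sketch of the customization style, not a full engine):

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, HashingTF
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("custom-engine-sketch").getOrCreate()
train = spark.createDataFrame(
    [("great product", 1.0), ("terrible support", 0.0)], ["text", "label"]
)

# Each stage is a deliberate choice the engine author can replace or re-tune.
pipeline = Pipeline(stages=[
    Tokenizer(inputCol="text", outputCol="words"),       # data preparation
    HashingTF(inputCol="words", outputCol="features"),   # feature engineering
    LogisticRegression(maxIter=10, regParam=0.01),       # model selection
])
model = pipeline.fit(train)
```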

How do you balance product and engineering decisions?

As an AI platform is adopted by more and more users, many tough but interesting product and engineering decisions are revealed. What should the platform measure? Should it offer built-in metrics? How should it handle the very different requirements of AI R&D, development, and production? What is the cost-versus-scalability strategy? Should the platform be cloud agnostic? Should visualization tools be part of the platform? To answer these questions most effectively, you must focus on one complete use case for one type of user at a time, starting with the highest priorities for the business.

What is your multi-layer approach?

Sometimes, the reality is that you do need to construct AI offerings for various types of users. In that case, the separation of offerings must be explicit.

For instance, the Salesforce Einstein artificial intelligence layer has three main components. First, there are several independent services relating to machine learning development. One service executes resource-intensive jobs and is responsible for intelligently allocating and managing distributed computing units for each job. Another service schedules jobs, manages their dependencies, and monitors their status. These low-level services give data scientists and software engineers maximum flexibility to build AI solutions in whatever ways they like.
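A hypothetical sketch of those two low-level services (not Salesforce's internal API) might look like this: jobs declare the resources they need and the jobs they depend on, and the scheduler derives a run order and monitors each one.

```python
from dataclasses import dataclass, field

@dataclass
class JobSpec:
    name: str
    command: str
    cpus: int = 4
    memory_gb: int = 16
    depends_on: list = field(default_factory=list)  # names of upstream jobs

jobs = [
    JobSpec("ingest-events", "python ingest.py", cpus=2, memory_gb=8),
    JobSpec("train-model", "spark-submit train.py", cpus=32, memory_gb=128,
            depends_on=["ingest-events"]),
]

def run_order(specs):
    """Order implied by the dependency graph (no cycle detection in this sketch)."""
    done, ordered = set(), []
    while len(ordered) < len(specs):
        for s in specs:
            if s.name not in done and all(d in done for d in s.depends_on):
                ordered.append(s)
                done.add(s.name)
    return ordered

for job in run_order(jobs):
    print(f"schedule {job.name}: {job.cpus} CPUs / {job.memory_gb} GB")
```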

Second, there is an application framework that standardizes the design pattern for some types of common AI applications, specifically, in Salesforce’s case, multitenant AI applications. Users will still need to write code, but they write far less of it because much common functionality is abstracted away. In exchange for some flexibility, the platform offers resilience and scalability to the AI applications built on top of it.
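A hypothetical sketch of that trade-off: the framework owns the multitenant plumbing (data isolation, deployment, scaling), and the user subclasses it to fill in only the application-specific pieces. The class and method names here are illustrative, not the actual Einstein framework.

```python
class MultiTenantApp:  # provided by the (hypothetical) framework
    def run(self, tenant_id: str):
        data = self._load_tenant_data(tenant_id)   # per-tenant isolation handled for you
        features = self.build_features(data)
        model = self.train(features)
        self._deploy(tenant_id, model)             # resilience and scaling handled for you

    def _load_tenant_data(self, tenant_id):        # framework internals, not user code
        return [{"tenant": tenant_id, "value": 1.0}]

    def _deploy(self, tenant_id, model):
        print(f"deployed model for tenant {tenant_id}")

class ChurnApp(MultiTenantApp):                    # the part the user actually writes
    def build_features(self, data):
        return [[row["value"]] for row in data]

    def train(self, features):
        return {"weights": [0.5] * len(features[0])}  # stand-in for a real model

ChurnApp().run("acme-corp")
```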

Third, APIs and user interfaces are provided so that the platform can be useful to users who write very little code, or even no code, to build AI applications.

Conclusion

Companies that do not think their AI strategies through thoroughly often swing from one direction to another, chasing the wind. Demand for AI platforms that serve various types of development will only grow in the foreseeable future, and the right solution is the one that fits the skill set of the users and solves the highest-priority problems for the business.
