As machine learning has become more widely adopted by businesses, O’Reilly set out to survey our audience to learn more about how companies approach this work. Do companies with more experience deploying machine learning in production use methods that differ significantly from organizations that are just beginning? For companies that haven’t begun this journey, are there any best practices that might help?
How are decisions and priorities set and by whom within the organization?
What methodologies apply for developing machine learning—for example, Agile?
What metrics are used to evaluate success?
Notable findings from the survey include the following:
Job titles specific to machine learning are already widely used at organizations with extensive experience in machine learning: data scientist (81%), machine learning engineer (39%), deep learning engineer (20%).
One in two (54%) respondents who belong to companies with extensive experience in machine learning check for fairness and bias. Overall, 40% of respondents indicated their organizations check for model fairness and bias. As tutorials and training materials become available, the number of companies capable of addressing fairness and bias should increase.
One in two (53%) respondents who belong to companies with extensive experience in machine learning check for privacy (43% across respondents from all companies). The EU’s GDPR mandates “privacy-by-design” (“inclusion of data protection from the onset of the designing of systems rather than an addition”), which means more companies will add privacy to their machine learning checklist. Fortunately, new regulations coincide with the rise of tools and methods for privacy-preserving analytics and machine learning.
One in two (51%) respondents use internal data science teams to build their machine learning models, whereas use of AutoML services from cloud providers is in the low single digits. This split grows even more pronounced among sophisticated teams. Companies with less extensive experience tend to rely on external consultants.
Sophisticated teams tend to have data science leads set team priorities and determine key metrics for project success—responsibilities that would typically be performed by product managers in more traditional software engineering.