AI adoption in the enterprise 2020
O’Reilly survey results show that AI efforts are maturing from prototype to production, but a lack of company support and an AI/ML skills gap remain obstacles.
Last year, when we felt interest in artificial intelligence (AI) was approaching a fever pitch, we created a survey to ask about AI adoption. When we analyzed the results, we determined the AI space was in a state of rapid change, so we eagerly commissioned a follow-up survey to help find out where AI stands right now. The new survey, which ran for a few weeks in December 2019, generated an enthusiastic 1,388 responses. The update sheds light on what AI adoption looks like in the enterprise (hint: deployments are shifting from prototype to production), the popularity of specific techniques and tools, the challenges experienced by adopters, and so on. There’s a lot to bite into here, so let’s get started.
Key survey results:
- The majority (85%) of respondent organizations are evaluating AI or using it in production. Just 15% are not doing anything at all with AI.
- More than half of respondent organizations identify as “mature” adopters of AI technologies: that is, they’re using AI for analysis or in production.
- Supervised learning is the most popular ML technique among mature AI adopters, while deep learning is the most popular technique among organizations that are still evaluating AI.
- Though a problem, the lack of ML and AI skills isn’t the biggest impediment to AI adoption. Almost 22% of respondents identified a lack of institutional support as the most significant issue.
- Few organizations are using formal governance controls to support their AI efforts.
The takeaway: AI adoption is proceeding apace. Most companies that were evaluating or experimenting with AI are now using it in production deployments. It’s still early, but companies need to do more to put their AI efforts on solid ground. Whether it’s controlling for common risk factors—bias in model development, missing or poorly conditioned data, the tendency of models to degrade in production—or instantiating formal processes to promote data governance, adopters will have their work cut out for them as they work to establish reliable AI production lines.
Survey respondents represent 25 different industries, with “Software” (~17%) as the largest distinct vertical. The sample is far from tech-laden, however: the only other explicit technology category—“Computers, Electronics, & Hardware”—accounts for less than 7% of the sample. The “Other” category (~22%) comprises 12 separate industries.
Data scientists dominate, but executives are amply represented
One-sixth of respondents identify as data scientists, but executives—i.e., directors, vice presidents, and CxOs—account for about 26% of the sample. The survey does have a data-laden tilt, however: almost 30% of respondents identify as data scientists, data engineers, AIOps engineers, or as people who manage them. What is more, almost three-quarters of survey respondents say they work with data in their jobs. All told, more than 70% of respondents work in technology roles.
Close to 50% of respondents work in North America, most of them in the United States, which by itself is home to almost 40% of survey participants. Western Europe (~23%) was the next largest region, followed by Asia at 15%. Participants from South America, Eastern Europe, Oceania, and Africa account for roughly 15% of responses.
Analysis: The state of AI adoption today
More than half of respondent organizations are in the “mature” phase of AI adoption (using AI for analysis/production), while about one-third are still evaluating AI. This is close to a mirror image of last year’s AI survey results, when 54% of respondent organizations were evaluating AI and just 27% were in the “mature” adoption phase. This year, about 15% of respondent organizations are not doing anything with AI, down ~20% in relative terms from our 2019 survey.
The upshot is that 85% of organizations are evaluating or using AI, and (of these) most are using it in production. It seems as if the experimental AI projects of 2019 have borne fruit. But what kind?
The bulk of AI use is in research and development, cited by just under half of all respondents, followed by IT, which was cited by just over one-third. (Respondents were encouraged to make multiple selections.) Another high-use functional area is customer service, with a share of just under 30%. Two functional areas (marketing/advertising/PR and operations/facilities/fleet management) each see a usage share of about 20%. Clearly, respondent organizations see the value of AI in a raft of different functional areas, and results that are largely flat relative to last year’s show a consistency to that pattern.
Common challenges to AI adoption
The acquisition and retention of AI-specific skills remains a significant impediment to adoption in many organizations. This year, slightly more than one-sixth of respondents cited difficulty in hiring/retaining people with AI skills as a significant barrier to AI adoption in their organizations. This is down, albeit slightly, from 2019, when 18% of respondents blamed an AI skills gap for lagging adoption.
Believe it or not, a skills gap isn’t the biggest impediment to AI adoption. In 2020, as in 2019, a plurality of respondents—almost 22%—identified a lack of institutional support as the biggest problem. In both 2019 and 2020, the AI skills gap actually occupied the No. 3 slot; this year, it trailed “Difficulties in identifying appropriate business use cases,” which was cited by 20% of respondents.
A more detailed look at the bottleneck data shows executives selecting an unsupportive culture less often (15%) than the practitioners and managers (23%) who responded to the survey.
By a 2:1 margin, respondents in companies that are evaluating AI are more likely than those at mature adopters to cite an unsupportive culture as the primary barrier to AI adoption. This disparity is striking, and intriguing. Is it just the case that late adopters are ipso facto more resistant to, or less open to, AI?
By contrast, AI adopters are about one-third more likely to cite problems with missing or inconsistent data. We saw in our “State of Data Quality in 2020” survey that ML and AI projects tend to surface latent or hidden data quality issues, with the result that organizations that are using ML and AI are more likely to identify issues with the quality or completeness of their data. The logic in this case is garbage in, garbage out: data scientists and ML engineers need quality data to train their models. Companies evaluating AI, by contrast, may not yet know to what extent data quality can create AI woes.
AI/ML skill shortages: Consistent and persistent
We asked survey respondents to identify the most critical ML- and AI-specific skills gaps in their organizations. The shortage of ML modelers and data scientists topped the list, cited by close to 58% of respondents. The challenge of understanding and maintaining a set of business use cases came in at number two, cited by almost half of participants. (Survey takers could make more than one selection.) Close to 40% selected data engineering as a practice area for which skills are lacking. Finally, just under one-quarter highlighted a lack of compute infrastructure skills.
The most remarkable thing about these results is their year-over-year consistency. The same skill areas that were problematic in 2019 are again problematic in 2020—and by about the same margins. In 2019, 57% of respondents cited a lack of ML modeling and data science expertise as an impediment to ML adoption; this year, slightly more—close to 58%—did so. This is true of other in-demand skills, too. The uncomfortable truth is that the most critical skill shortages cannot easily be addressed. The data scientist, for example, is a hybrid creature: ideally, she should possess not only theoretical and technical expertise, but practical, domain-specific business expertise, too.
This last is almost always acquired in practice, with the result that the freshly minted data scientist is invariably trained on the job. This helps explain why the proportion of respondents who cited a shortage of people skilled in understanding and maintaining business use cases increased year over year, from 47% in 2019 to 49% this year. The data scientist uses her domain-specific expertise to identify appropriate business use cases for AI. The ML modeler supplements her technical competency with domain-specific business knowledge that she accrues in practice. Both types of practitioner must also develop soft skills in teamwork, listening, and, most important, empathy. This takes time and is a function of experience.
Managing AI/ML risk
We asked respondents to select all of the applicable risks they try to control for in building and deploying ML models. The results suggest that all organizations—especially those with “mature” AI practices—are alert to the risks inherent in the design and use of ML and AI technologies.
Unexpected outcomes/predictions was the single most common risk factor, cited by close to two-thirds of mature—and by about 53% of still-evaluating—AI practitioners. Among mature adopters, the need to control for the interpretability and transparency of ML models was the second most common risk factor (cited by about 55%); by contrast, a different option—fairness, bias, and ethics (~40%)—was the No. 2 risk factor among companies still evaluating AI. It ranks high (No. 3) with mature AI practitioners, too: ~48% check for fairness and bias during model building and deployment.
Mature AI practitioners are significantly more likely to implement checks for model degradation than companies that are still evaluating AI. Model degradation is the No. 4 risk factor among mature adopters (checked for by about 46%); however, it is next to last among organizations that are in the evaluation phase of AI adoption—finishing ahead of the “Other compliance” category.
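A check for model degradation can be as simple as comparing a model's recent accuracy in production with the accuracy it achieved at deployment. The sketch below is illustrative only: the function name `has_degraded`, the 0.05 tolerance, and the sample labels are assumptions, not something reported by survey respondents.

```python
def has_degraded(baseline_accuracy, y_true, y_pred, tolerance=0.05):
    """Return True if recent accuracy fell more than `tolerance` below baseline."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    # Accuracy on the most recent batch of labeled production data.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    recent_accuracy = correct / len(y_true)
    return (baseline_accuracy - recent_accuracy) > tolerance


# Hypothetical labels and predictions collected after deployment.
recent_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
recent_pred = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

# Baseline accuracy of 0.90 is an assumed value measured at deployment time.
print(has_degraded(0.90, recent_true, recent_pred))
```

In practice the metric, the window of recent data, and the tolerance would all depend on the model and the business use case.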
These risk factors are common, well understood, and don’t stand alone. With respondents able to pick “all that apply” to the question, we find that 41% of respondents list at least four issues, and 61% select at least three issues.
Supervised learning is dominant, deep learning continues to rise
Supervised learning remains the most popular ML technique among all adopters. In 2019, more than 80% of mature adopters—and two-thirds of respondent organizations that were then evaluating AI—used it. And in 2020, almost 73% of self-identified “mature” AI practices are using it. (The survey questionnaire encouraged respondents to select all applicable techniques.)
This year, however, deep learning displaced supervised learning as the most popular technique among organizations that are in the evaluation phase of AI adoption. To wit: in respondent organizations that are evaluating AI, slightly more say they’re using deep learning (~55%) than supervised learning (~54%). And close to 66% of respondents who work for “mature” AI adopters say they’re using deep learning, making it the second most popular technique in the mature cohort—behind supervised learning.
It’s true that usage of all ML or AI techniques is greater among mature adopters than among organizations still evaluating AI. That said, there are a number of striking differences between mature and less mature AI adopters. For example, about 23% of “mature” AI practices use transfer learning, nearly double the rate of usage in less mature practices (12%). Human-in-the-loop AI models are considerably more popular among mature users than among those still evaluating AI.
Selecting the right tool for the job often means using several: more than three-quarters (78%) of respondents selected at least two ML techniques, 59% selected at least three, and 39% selected at least four.
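Cumulative shares like these come from counting how many techniques each respondent selected. A minimal sketch of that tally follows; the responses are made-up examples and the function name `share_selecting_at_least` is hypothetical, since the survey's raw data and tooling are not described here.

```python
def share_selecting_at_least(responses, k):
    """Fraction of respondents who selected at least k distinct options."""
    if not responses:
        return 0.0
    qualifying = sum(1 for selected in responses if len(set(selected)) >= k)
    return qualifying / len(responses)


# Hypothetical multi-select answers, one list per respondent.
responses = [
    ["supervised learning", "deep learning"],
    ["supervised learning"],
    ["deep learning", "transfer learning", "supervised learning"],
    ["supervised learning", "deep learning", "transfer learning", "human-in-the-loop"],
]

for k in (2, 3, 4):
    print(f"at least {k}: {share_selecting_at_least(responses, k):.0%}")
```

Because every respondent who picks at least four techniques also picks at least three, the shares can only shrink as the threshold rises, which matches the pattern in the survey figures.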
The dominant tools aren’t getting any less dominant
TensorFlow remains, by far, the single most popular tool for use in AI-related work. It was cited by almost 55% of respondents in both 2019 and 2020, which gives it a creditable consistency over time.
TensorFlow’s staying power also reinforces the fact that deep learning and neural networks—with which it is strongly associated—are far from niche techniques.
The most popular tools for AI development in 2019 were once again predominant in 2020. This could be a function of what we’ll call the “Python factor,” however: four of the five most popular tools for AI-related work are either Python-based or dominated by Python tools, libraries, patterns, and projects.
Of these, TensorFlow, scikit-learn, and Keras held steady, while PyTorch grew its share to more than 36%. This tracks with usage and search activity on the O’Reilly online learning platform, where interest in PyTorch has grown quickly from a relatively small base. Our analysis of Python-related activity on O’Reilly likewise shows that Python is seeing explosive growth in ML and AI-related development.
Data governance isn’t yet a priority
Slightly more than one-fifth of respondent organizations have implemented formal data governance processes and/or tools to support and complement their AI projects. This is consistent with the results of our data quality survey.
The good news is that just over 26% of respondents say their organizations plan to instantiate formal data governance processes and/or tools by 2021; almost 35% expect this to happen in the next three years. The bad news is that AI adopters—much like organizations everywhere—seem to treat data governance as an additive rather than an essential ingredient.
Think of data governance as analogous to observability in software development: it is easier to build a capacity for observability into a system than to retrofit an existing system to make it observable. In the same way, it is easier to build a capacity for data governance into a system or service than to “add” it after the fact. Data governance is a data-specific take on observability that not only permits traceability and reproducibility, but permits transparency into what an AI asset is doing—and how it’s doing it.
A review of the survey results yields a few takeaways organizations can apply to their own AI projects.
- If you do not have plans to evaluate AI, it’s time to think about catching up. With an abundance of open source tools, libraries, tutorials, etc., not to mention an accessible lingua franca—Python—the bar for entry is actually pretty low. Most companies are experimenting with AI—why risk being left behind?
- AI projects align with dominant trends in software architecture and infrastructure and operations. AI features can be decomposed into functional primitives and instantiated as microservices—e.g., data cleansing services that profile data and generate statistics, perform deduplication and fuzzy matching, etc.—or function-as-a-service designs.
- Think broadly: AI is used everywhere, not just in R&D and IT. A large share of survey respondents use AI in customer service, marketing, operations, finance, and other domains.
- Train your organization, too—not just your models. Institutional support remains the biggest barrier to AI adoption. If you think AI can help, you should spend time explaining how, why, and what to expect.
- The risks associated with AI implementation are consistent and now better understood. The upshot is that it’s easier to explain to executives and stakeholders what to expect in implementing AI projects.
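As a rough illustration of the data cleansing primitives mentioned in the list above (profiling, deduplication, fuzzy matching), here is a small sketch using only the Python standard library. The function names, the 0.9 similarity threshold, and the sample records are assumptions chosen for the example; a real service would wrap logic like this behind an API and tune it to its own data.

```python
from difflib import SequenceMatcher


def profile(values):
    """Basic profile statistics for a column of string values."""
    non_missing = [v for v in values if v is not None and v.strip()]
    return {
        "count": len(values),
        "missing": len(values) - len(non_missing),
        "distinct": len(set(non_missing)),
    }


def normalize(value):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.lower().split())


def deduplicate(values, threshold=0.9):
    """Keep the first of any values whose normalized forms are near-identical."""
    kept = []
    for value in values:
        if value is None or not value.strip():
            continue  # skip missing entries
        candidate = normalize(value)
        is_duplicate = any(
            SequenceMatcher(None, candidate, normalize(k)).ratio() >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(value)
    return kept


# Hypothetical company names with casing, spacing, and punctuation variants.
names = ["Acme Corp", "acme  corp", "Acme Corp.", None, "Globex"]
print(profile(names))
print(deduplicate(names))
```

Each function does one job and holds no state, which is the property that makes this kind of logic easy to expose as a microservice or function-as-a-service endpoint.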
Clearly, we see AI practices maturing, even if many production use cases appear primitive. Adopters are also taking proactive steps to control for the most common risk factors. Both mature and not-so-mature adopters are experimenting with sophisticated techniques to build their AI products and services. Adopters are using a wide variety of ML and AI tools, but have coalesced around a single language: the ubiquitous, irrepressible Python. However, organizations need to address important data governance and data conditioning issues to expand and scale their AI practices.