We wanted to discover what our readers were doing with cloud, microservices, and other critical infrastructure and operations technologies. So we constructed a survey and ran it earlier this year: from January 9th through January 31st, 2020. All told, we received 1,283 responses.
A lot happened between January and the first week of March, when we got around to analyzing our survey data. It seemed clear to us that the world we’d captured in our survey was going to change (if it hadn’t already)—that some trends would accelerate, that some would decelerate, and that things would never be quite the same. It seems to us that the results of our survey offer a point-in-time snapshot of the latest trends in cloud, microservices, distributed application development, and other emergent areas. Not only do they capture where organizations are, but, more important, they illuminate how they will evolve. We will spend months or even years trying to determine the extent to which we must recalibrate our best-laid plans and assumptions. And as we do so, we will look to surveys like this one as lodestars.
Without further ado, here are the key results:
• At first glance, cloud usage seems overwhelming. More than 88% of respondents use cloud in one form or another. Most respondent organizations also expect to grow their usage over the next 12 months.
• A surprising number of respondents—about 25%—said that their companies plan to move all of their applications to a cloud context in the next year. This includes 17% of respondents from large organizations (over 10,000 employees) that have already moved 100% of their applications to the cloud.
• Public cloud dominates, but most organizations use a mix of cloud options; almost half (49%) continue to run applications in traditional, on-premises contexts.
• More than half of respondents use multiple cloud services.
• AWS is far and away the cloud leader, followed by Azure (at more than half of share) and Google Cloud. But most Azure and GCP users also use AWS; the reverse isn’t necessarily true.
• More than half of respondent organizations use microservices.
• More than one-third have adopted site reliability engineering (SRE); slightly less have developed production AI services. For this audience, SRE’s future is brighter than AI’s, however.
Software engineers represent the largest cohort, comprising almost 20% of all respondents (see Figure 1). Technical leads and architects (about 11%) are next, followed by software and systems architects (9+%). For the sample as a whole, most respondents (approximately 60%) occupy technical positions. However, a notable minority—some 15%—occupy C-level or executive roles. And about 10% work in technical management positions. That results in roughly 25% share for managers and executives. The “Other” category had a good mix of technical positions (e.g., network engineer, at >2%) and management positions (IT manager, at close to 3%; operations manager at >1%).
A significant minority of respondents (22%) have worked in their roles for more than 10 years; the largest single bloc (almost 34% of all respondents) has done so for between one and three years (Figure 2). There is atypical longevity in the survey audience: almost 55% have worked in their roles for at least four years, and a surprising number (almost 13%) for more than 12 years. It is a more experienced group than we’re used to seeing in our Radar surveys. The maturity of this audience could be a reflection of the maturity of the topic. Still, a solid third of respondents have between one and three years of experience.
Almost one-quarter (23%) of respondents work in the software industry (Figure 3). Finance and banking (>11%) is the second-largest vertical, followed closely by consulting and professional services (also >11%).
We’re used to seeing these three verticals dominate representation in our Radar surveys. At 23%, however, the software vertical is significantly overrepresented, at least relative to prior surveys. Consulting and professional services, by contrast, could be slightly underrepresented. This may be a source of bias. We imagine that companies in the software industry are more likely to be early (or mid-stage) adopters of technologies like cloud computing.
There’s a good mix between large and small firms. About half of all respondents work in organizations with fewer than 1,000 employees. More than a quarter work with very large organizations—i.e., 10,000 or more employees. And about 28% work for small outfits of between one and 100 people.
About two-thirds of respondents work in North America. The next largest region, Asia, is home to about 15% of all respondents. Europe, which in most Radar surveys constitutes the second-largest respondent bloc, was third, accounting for just 11% of all participants. This survey’s disproportionate tilt toward North America is unusual. In our AI adoption in the enterprise survey, for example, we had close to a 50/50 split between North America and the rest of the world.1 Again, this is a source of bias: companies in some European countries are much more hesitant about moving workloads to the cloud.
Almost Completely Cloud-y
Slightly more than 88% of respondent organizations use cloud computing; just 10% of respondents say they don’t use it at all. If the adoption figure seems anomalously high, it shouldn’t.
A strict definition of “cloud” must also include software-as-a-service (SaaS) and platform-as-a-service (PaaS) offerings of all kinds—including email (Google G Suite email; Microsoft Exchange Online), office productivity suites (Google Docs and Sheets; Microsoft Office 365), and similar offerings. Designing a survey inevitably entails making a spate of methodological trade-offs. We could have specified a narrow definition of cloud—inclusive of the SaaS, PaaS, and infrastructure-as-a-service (IaaS) cloud; exclusive of cloud-based email, office productivity, etc.—but the fact remains that a proportion of enterprises either outsource their email hosting to Google, Microsoft, and other providers or subscribe to cloud office productivity services that (in most cases) bundle email hosting, too. These services are also designed to function as gateway drugs to cloud services: e.g., Microsoft integrates its on- and off-premises Excel client experience with its PowerBI cloud analytics service, as well as with its ecosystem of Azure-based advanced analytics and machine learning (ML) services.
This brings up another, related issue: how much visibility do survey respondents actually have with respect to how and where cloud gets used in their organizations? It’s likely that most respondents lack complete visibility into and across their organizations; in a large enterprise, for example, few if any people have this kind of panoptic view. When we took all of these considerations into account, a more-inclusive frame for cloud adoption made the most sense to us. It encompasses private clouds, the IaaS cloud—also host to virtual private clouds (VPC)—and the PaaS and SaaS clouds. It is less concerned with formal definitions2 and captures the point-in-time totality of cloud adoption.
Among non-adopters, culture seems to be the biggest impediment to cloud adoption: just under 5% of non-adopters cited an “organizational preference to keep data on premises” (Figure 4). More than 2% cited regulatory concerns as a barrier to adoption, while a still larger proportion (close to 3%) cited risk, especially with respect to migrating on-premises workloads, services, or data to cloud. Oddly, about 3% of non-adopters cited cost as a primary reason not to move workloads to cloud; cost-efficiency is usually touted as one of cloud’s most attractive features.
Also of interest: close to 2% of non-adopters cited the prospect of vendor lock-in as a rationale for not using cloud. All told, 22% of respondents selected at least three issues; 46% chose at least two issues. Again, non-adopters comprise just 10% of the survey audience.
Cloud Usage Waxing, not Waning
Most (90%+) respondent organizations expect to increase their usage of cloud-based infrastructure. This result aligns very closely with the proportion of respondents (88%+) who have already adopted cloud. The upshot is that the overwhelming majority of adopters plan to grow, rather than reduce, their cloud usage share. Oddly, most growth seems to be happening at the extremes: almost one quarter of respondent organizations expect to move all of their applications to the cloud in the next 12 months (Figure 5). This was the second biggest cluster, overall. The largest cluster—at just under 34%—consists of respondents who expect to move one-quarter of their applications to the cloud in the next 12 months.
About 45% of respondent organizations expect to move three-quarters or more of their applications to cloud during this same period; 67% expect to shift half or more. Zoom out to 36 months, and close to 40% of all respondents expect that all of their applications will run in a cloud context, while about 63% anticipate running at least three-quarters of their applications in the cloud.3
Taken at face value, these results suggest almost irresistible momentum in favor of cloud. Keep in mind, however, that usage share is based on the applications respondents know of, and that few if any respondents have a complete view of deployments across the whole of their organizations. With this caveat in mind, the results nonetheless suggest a wider embrace of cloud infrastructure and support the idea that most organizations now equate cloud with what’s next for their infrastructure decisions.
Public Cloud Dominates, but Most Organizations Opt to Mix Things Up
Among cloud adopters, more than 21% host all of their applications in a cloud context of one kind or another. However, organizations that host one-quarter or fewer of their applications in the cloud comprise the largest single cluster, at 39% of all respondents. As might be expected, small companies and startups are likely to host substantial proportions—in some cases, all—of their applications in the cloud. There were some surprises, however. For example, about 17% of companies with 10,000 or more employees host 100% of their applications in a cloud context of some kind (Figure 6). This number balloons to about 37% of companies with between one and 100 employees. Just under 50% of companies with 10,000 or more employees host 25% or fewer of their applications in a cloud context.
Public cloud is the most popular overall deployment option, with a usage share greater than 61%. Traditional, on-premises deployment—at just under half (49%) of usage share—is second. Hybrid cloud, which combines public cloud services with on-premises private cloud infrastructure, is third, with approximately 39% usage.
The survey encouraged respondents to make multiple selections from among five cloud deployment options. Nearly one-tenth (9%) selected all five, and almost one-fifth (19%) selected four out of five. Almost two-thirds (64%) selected at least two cloud deployment options. The upshot is that—even though the public cloud is by far the most popular option—most respondent organizations employ a mix of cloud types. Interestingly, multi-cloud, or the use of multiple cloud computing and storage services in a single homogeneous network architecture, had the fewest users (24% of the respondents).
However, more than half of respondents (54%) also use multiple cloud services. The poor showing for multi-cloud might be the difference between tactical/ad hoc and strategic usage. In other words, comparatively few respondent organizations appear to be pursuing dedicated multi-cloud strategies.
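Because respondents could tick several deployment options, the per-option shares in this section add up to more than 100%, and the "selected at least N options" figures are counted per respondent. The Python sketch below shows one way such tallies might be computed. The option labels, the sample responses, and the helper names `share_selecting_at_least` and `usage_share` are all made up for illustration; they are not the survey's data or method.

```python
from collections import Counter

# Hypothetical multi-select responses: each set holds the deployment
# options one respondent ticked. Labels and data are illustrative only.
responses = [
    {"public", "on_premises"},
    {"public"},
    {"public", "hybrid", "private", "on_premises"},
    {"on_premises"},
    {"public", "hybrid", "multi_cloud", "private", "on_premises"},
]


def share_selecting_at_least(responses, k):
    """Fraction of respondents who ticked k or more options."""
    if not responses:
        return 0.0
    return sum(1 for r in responses if len(r) >= k) / len(responses)


def usage_share(responses):
    """Per-option share of respondents. Shares can sum past 100% because
    each respondent may tick several options."""
    counts = Counter(option for r in responses for option in r)
    return {option: n / len(responses) for option, n in counts.items()}


at_least_two = share_selecting_at_least(responses, 2)
shares = usage_share(responses)
```

With this toy data, three of the five respondents ticked at least two options, and "public" and "on_premises" each appear in four of the five responses.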
Amazon and AWS Ascendant
Not surprisingly, Amazon Web Services (AWS) is far out ahead of the rest of the pack: it’s used by more than two-thirds (~67%) of all respondents. However, close to half (~48%) use Microsoft Azure, and close to one-third (~32%) use Google Cloud Platform (GCP). Respondents were encouraged to select multiple cloud service providers; in fact, a slight majority of respondents (54%) use more than a single provider. Among cloud providers, Amazon, Microsoft, and Google dominate their rivals, with Alibaba Cloud, IBM Cloud, and Oracle Cloud garnering just under 12% of share. (The poor showing for Alibaba Cloud could be a function of the larger-than-normal North American bias in the audience, as could the lower representation of both IBM and Oracle.)
Among respondents who use only public cloud providers, AWS’ share was even larger: it accounted for 75% of usage, compared with 52% for Azure and 34% for GCP. In fact, AWS is clearly the backstop vendor: not only does it have the highest share among respondent organizations, but—of the 54% who use at least two cloud vendors—almost all of them (93%) list AWS as one of those vendors.
If Microsoft and Google really are coming on strong, they aren’t dislodging Amazon and AWS. If anything, organizations seem to be pursuing multi-cloud strategies—even if they aren’t explicitly “doing” multi-cloud. Among our survey respondents, multi-cloud effectively means AWS + another cloud service.
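The "AWS + another cloud service" reading rests on a conditional share: among respondents who use at least two providers, what fraction list AWS? A minimal Python sketch of that calculation follows. The sample selections and the function name `share_among_multi_provider` are hypothetical and chosen only to illustrate the arithmetic; they do not reproduce the survey's 54% or 93% figures.

```python
# Hypothetical provider selections, one set per respondent (illustrative only).
providers_by_respondent = [
    {"AWS", "Azure"},
    {"AWS"},
    {"Azure", "GCP"},
    {"AWS", "GCP"},
    {"GCP"},
    {"AWS", "Azure", "GCP"},
]


def share_among_multi_provider(selections, provider):
    """Among respondents using two or more providers, the fraction whose
    selections include `provider`. Returns 0.0 if nobody uses several."""
    multi = [s for s in selections if len(s) >= 2]
    if not multi:
        return 0.0
    return sum(1 for s in multi if provider in s) / len(multi)


# Share of all respondents who use more than one provider.
multi_share = sum(
    1 for s in providers_by_respondent if len(s) >= 2
) / len(providers_by_respondent)

# Share of those multi-provider respondents who list AWS.
aws_among_multi = share_among_multi_provider(providers_by_respondent, "AWS")
```

In this toy data, four of six respondents use more than one provider, and three of those four list AWS.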
Microservices Achieves Critical Mass, SRE Surging
More than half (52%) of respondent organizations say they use microservices concepts, tools, or methods for software development. Of these, a large minority—just over 28%—have been using microservices for more than three years. This was the second-largest cluster among users of microservices. The largest, at more than 55%, has been using microservices for between one and three years. Just 17% of users are new to microservices, with less than one year of adoption and use.
A few caveats are in order. First, our survey didn’t ask respondents if they (or their organizations) have adopted microservices architecture. There’s a world of difference between experimentation and/or ad hoc usage and adoption; we saw this with agile, and (as we note below) we’re likely seeing it with SRE, too. Just because a development team uses the tools, concepts, and methods of microservices architecture doesn’t mean it has adopted microservices architecture. It may be that microservices patterns, as distinct from conventional software development, are well suited for the particular use case, as with video encoding, which entails multiple parallel or concurrent CPU- or GPU-intensive workloads.
Second, there is some evidence that interest in microservices might be at or close to peaking. There is also evidence that decomposition—at least to the degree of granularity prescribed in microservices architecture—is proving to be more difficult than anticipated. Finally, there’s the Perrow-ian critique of microservices architecture, which argues that its complexity constitutes a kind of de facto tight coupling that makes it impossible to anticipate potential edge cases and eliminate risk.
Almost 35% of respondent organizations have implemented a Site Reliability Engineering (SRE) function. Even though SRE is less well known than microservices, DevOps, and other topics, it isn’t in any sense new. At this point, interest in SRE actually tracks closely with interest in microservices itself.
Close to half of all organizations (47%) in our survey say they expect to implement an SRE function at some point in the future. Should this pan out, SRE adoption share would be roughly comparable to that of microservices. Is there significant overlap between the two, however? In other words, if an organization adopts microservices-oriented concepts, tools, and methods, will it also tend to adopt an SRE function? Or is the growth in SRE related to other factors, such as (for example) declining interest in DevOps itself? In our analysis of user activity on the O’Reilly learning platform, we found that DevOps-related search and usage declined in both 2018 and 2019. We posited that adopters “might be having trouble scaling DevOps” because “developers tend to be less committed to DevOps’ operations component.”
We’d be remiss if we didn’t note that the strong showing for SRE is almost certainly a function of selection bias in our audience—i.e., our respondents are more likely to be using SRE than not. SRE’s performance could also be a function of the same cargo cult phenomenon we saw during the agile revolution, when familiarity with the term and uptake of select ideas or methods was conflated with adoption. As for the declining interest in DevOps we recorded in our platform survey, it’s just as possible that this decline—measured in terms of topic usage and search activity on the O’Reilly learning platform—is actually a function of something else: namely, the maturation of the DevOps topic. Clearly, the DevOps practices that took root over the last decade aren’t going anywhere. Instead, it’s likely that IT professionals are exploring and learning about DevOps-adjacent disciplines (such as SRE) that are new to them.
As we noted in our platform analysis, in this and similar cases, it’s helpful to view the problem of user interest through the lens of the so-called Overton Window, which circumscribes the human cognitive bandwidth that’s available in a certain place at a certain time. Obviously, no combination of issues or trends can exceed 100% of available bandwidth. The upshot is that declining interest in a topic doesn’t have to correlate with a decline in use (or usefulness) in practice. Or vice versa. In the case of decline, a mix of emergent trends might be crowding out a topic. In the case of ascendancy, a trend might be (ephemerally) emergent.
We didn’t attempt to define serverless precisely, but for many people in our audience, serverless means “function-as-a-service” (for example, AWS Lambda). Services like AWS S3 are very much “serverless,” but that’s not common usage. With that in mind, one-third (almost 34%, in fact) of respondent organizations say they’re using serverless computing.
This is roughly on par with the percentage that says they’re using SRE. Unlike with SRE, where almost half (47%) of respondents expect to add an SRE function at some point in the future, fewer (approximately 37%) expect to adopt serverless.4 By the same margins—i.e., 37% pro-experimentation, 63% anti—fewer respondent organizations have “experimented” with serverless computing, e.g., by evaluating vendors, scoping serverless scenarios, or testing serverless on a limited basis.
What’s interesting is that all three topics (viz., microservices, SRE, and serverless) seem to track closely with one another. Is there a meaningful correlation here, or is this consonance spurious? Clearly, microservices are not a new thing, but neither is SRE. Is it possible that the complexity of microservice architecture, serverless computing, service mesh architecture, and other next-generation patterns is contributing to (if not driving) interest in SRE? We don’t have the data to begin to answer this question, but it’s one we plan to keep an eye on.
Critical Skills for Success
Which skills are most important for migrating or implementing cloud-based infrastructure? Expertise in containers, Kubernetes, and monitoring all scored highly, but the number one skill area was cloud-based security. (The survey design encouraged respondents to select from among multiple listed skills.)
Almost two-thirds of respondents (65%) selected cloud security, with monitoring (58%) a distant number two. General cloud knowledge was third (just over 56%), followed by containers and Kubernetes (just under 56%). All told, six separate skills polled at 50% or greater; 10 listed skills polled at 45% or greater. Clearly, respondents believe that they, along with other infrastructure and ops practitioners, need to skill up, with emphasis on security. Almost half (48%) of respondents selected six or more listed skills; 85% selected at least three listed skills. And 15% selected all 10 listed skills.
We looked at the intersection of skills to see if respondents had selected specific combinations of skills more frequently than might otherwise be expected. We discovered obvious examples of correlation (i.e., a threshold at least 5% higher than expected) with containers and Kubernetes; containers and microservices; monitoring and observability; and between security and compliance. We found several examples of correlation between cloud-based security and other listed skills, which reinforces the idea that security dominates the thinking of infrastructure and ops practitioners; we found correlations involving security and monitoring; security and performance; and between security and observability.
Finally, respondents selected some skill combinations less frequently than would be expected.5 Some of these results are baffling, such as the absence of a correlation between microservices and security. Some examples of strong correlation (microservices and Kubernetes; containers and microservices) are consistent with trends we’ve described elsewhere, e.g., the Next Architecture.
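The report does not spell out how the "expected" co-selection frequency was derived. One common baseline, assumed here, is statistical independence: if two skills were chosen independently, the share of respondents picking both would equal the product of the two individual selection rates. The Python sketch below compares observed and expected rates for each pair and flags gaps of at least five percentage points in either direction. The sample responses, the reading of "5%" as percentage points, and the name `coselection_gaps` are all assumptions for illustration.

```python
from itertools import combinations

# Hypothetical responses: each set holds the skills one respondent selected.
responses = [
    {"containers", "kubernetes"},
    {"containers", "kubernetes", "security"},
    {"security"},
    {"security"},
]
skills = ["containers", "kubernetes", "security"]
THRESHOLD = 0.05  # assumed meaning of "5%": five percentage points


def coselection_gaps(responses, skills):
    """Observed minus expected co-selection rate for every skill pair.

    Expected is the product of the two marginal selection rates, i.e. the
    rate we would see if the two skills were chosen independently.
    """
    n = len(responses)
    marginal = {s: sum(1 for r in responses if s in r) / n for s in skills}
    gaps = {}
    for a, b in combinations(skills, 2):
        observed = sum(1 for r in responses if a in r and b in r) / n
        gaps[(a, b)] = observed - marginal[a] * marginal[b]
    return gaps


gaps = coselection_gaps(responses, skills)
# Pairs chosen together more often than independence would predict.
over = [pair for pair, gap in gaps.items() if gap >= THRESHOLD]
# Pairs chosen together less often than independence would predict.
under = [pair for pair, gap in gaps.items() if gap <= -THRESHOLD]
```

In this toy data, containers and Kubernetes land in `over`, while both pairings with security land in `under`. A real analysis would also want a significance test, since small samples produce large gaps by chance.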
AI in Production, Poised for Growth
Almost 36% of respondent organizations have deployed AI services. About 47% expect to deploy AI-based services at some point over the next three years; of these, the largest cohort (almost 20%) expects to do so in the next two years. Still, close to 53% do not anticipate doing anything with AI.
The discrepancy isn’t surprising. A survey on infrastructure and operations will tend to attract people who are interested in infrastructure and operations. Ditto for ML and AI. Every survey has a self-selection bias.
Still, the result seems anomalous. In the first case, it flies in the face of predominant trends. In our recent machine learning (ML) and AI adoption survey, for example, we found that most organizations—about 53%—are using AI in production today. Even granting that AI is (over-)hyped, we should expect to see a majority result for planned AI adoption, shouldn’t we? In the second case, there are very good reasons why AI should be of interest to IT professionals who work in infrastructure and operations and (more important) the companies that employ them. Take observability, for example. It’s an important concept in software architecture, especially in next-generation regimes, such as microservice architecture. Machine learning and similar advanced techniques (e.g., deep learning) will likely play an important role in observing the observable systems that we build, just as AI-directed rules and AI-driven automation will be critical for managing and securing these systems.
How, then, can an organization expect to manage the thousands or tens of thousands of services that comprise an observable system without building AI services? One explanation is that respondents simply lack visibility into this aspect of their organization’s planning. In other words, because AI-related development is owned by one or more different groups (data scientists, ML and AI engineers, DataOps practitioners), many respondents genuinely aren’t aware of what their organizations are doing. An equally likely explanation is that respondents are failing to appreciate what actually constitutes AI. As we noted in another context, “AI” used to be identified with so-called artificial general intelligence, or AGI. Increasingly, however, we’re seeing it used to describe the application of machine learning to solve problems, increase productivity, accelerate processes, and in many cases deliver wholly new products and services. Almost any consumer-facing site that makes product recommendations is using AI (although possibly in a very simple form). It’s possible that some proportion of respondents had AGI in mind. Had we asked more specific questions, we likely would have gotten different results.
The survey was conceived and conducted in the months prior to the tumult of March and April. It is a product of a pre-pandemic sensibility.
The impact of a pandemic event isn’t just disruptive, it’s transformative: it fundamentally changes the status quo; it compels the revaluation of virtually all assumptions. This invites the obvious question: Were we to conceive this survey today, what would we do differently? Obviously, we’d ask questions that take into account the realities—e.g., an unprecedented emphasis on social isolation; a new (and mostly unprecedented) acceptance of telework, geographical separation, and distance(-ing); a business climate characterized by extreme uncertainty, with most analysts forecasting severe recession, if not possible depression—that serve as backdrop to this, our moment.
The most challenging thing about what’s happening is that it’s very much happening: we’re still coming to terms with it. Changes, compromises, reconfigurations that we never thought possible could become de rigueur. Other major changes will unfold over much longer periods of time.
It’s naïve to think we could anticipate even the broad strokes of these changes, let alone the specificity of their content. And it’s unhelpful—perhaps even dangerous—to overthink things. That said, it is useful to speculate about what could change in the near term, at least now that we have precedent for the unprecedented. It’s possible that the public cloud could become an even more attractive option for companies of all sizes. It’s possible that hybrid clouds combining PaaS or IaaS services with the virtual private cloud—that is, private cloud deployments which live in the public cloud—could see increased uptake, too. It’s also possible that more organizations will pursue multi-cloud as a strategy to hedge against potential disruption.
The hope of reducing costs won’t be the only thing driving new interest in cloud. Almost all enterprises are already dealing with staffing problems on several fronts: first, illness-related staffing shortages; second, staffing shortages that stem from shelter-in-place orders at the municipal, county, or state levels; third, furlough- or layoff-related staffing shortages. In some cases, IT workers have opted to withdraw from the workforce, e.g., to safeguard the health and well-being of their families. The combination of these and other staffing-related issues could compel companies to revisit not only the necessity for on-site/on-premises work, but the responsibility for hiring, recruiting, and managing IT staff to support applications or services that could shift to a third-party provider. Is it more credible for a large cloud service like Amazon, Google, IBM, or Microsoft to argue that its employees are essential than—for example—the IT staff of a major cosmetics retailer? More to the point, are the major cloud providers more likely to keep their datacenters running in the face of quarantines than a business with a private datacenter? The answer to the latter question has to be “yes.”
These are just a few possible changes. Had we the opportunity to redo our survey, we would almost certainly ask questions that drill down deeper into these issues. Nevertheless, we believe the results we capture here have considerable merit: not as quaint relics of a prelapsarian past, but as valid indications of where we were and where we’ll be when things pick back up again. There’s no reason to assume that the underlying trends here will be annulled by the effects of COVID-19. Impacted, yes; annulled, no. The shift to cloud, uptake of microservices, increasing interest in SRE, emphasis on Kubernetes, container virtualization, and other critical skills: each of these trends has staying power, especially to the degree that they’re implicated in or correlated to one another.
1 Regional representation in Radar surveys typically tracks with usage on the O’Reilly learning platform. North American users account for about half of activity on the O’Reilly platform. That isn’t the case here.
2 For example: What criteria do we use to distinguish between private and public cloud? How do these criteria—and the distinction itself—relate to hybrid cloud? To virtual private cloud? What if an organization hosts its cloud in a colocation facility? Is it public or private? Is tenancy—e.g., a single-tenant cloud hosted in an off-site facility is a private cloud—the most important criterion? If an organization uses a combination of virtualization and automation to host some of its workloads, has it created a private cloud?
3 But which cloud? Or in which cloud context? The essential characteristics of modern software architecture—loose coupling; abstraction, isolation, and atomicity—are eliding the boundaries between what we think of as “cloud” versus “on-premises” contexts. As we wrote in our analysis of search and usage on the O’Reilly learning platform: “Specific deployment contexts will still matter, of course … but the clear boundaries that used to demarcate the public cloud from the private cloud from conventional on-premises systems will fall away. It’s all cloud-like, irrespective of context.”
4 This looks like another case in which interest in a technology—namely, serverless—also tracks with interest in other, not necessarily related technologies, in this case, microservices and SRE. Even if serverless adoption lags, interest in it seems to wax and wane with (and perhaps benefit from) interest in these other technologies.
5 These include general cloud knowledge + security; general cloud knowledge + performance; microservices + security; compliance + monitoring; and compliance + performance, all with a threshold 5% lower than expected.