How Big Data is Elevating the Role of the CIO and Transforming the IT Department
A Radical Shift in Focus and Perspective
Information technology (IT) traditionally focused on the processes of managing data, rather than on data itself. The relatively new idea that data itself has intrinsic business value is forcing corporate IT departments to rethink many long-held beliefs about data management. The notion that data has value—that it is, in fact, the new oil—is having a seismic impact on the IT function.
“We’re going to see a lot of changes in the IT industry,” says Harvey Koeppel, a veteran chief information officer (CIO) whose lengthy career includes executive posts at Citigroup and Chemical Bank, now part of JPMorgan Chase. “We’re at the beginning of an inflection point, and we’ve only begun to scratch the surface.” Until very recently, the primary role of IT was to enable business processes. From a technology perspective, that role forced IT to focus almost exclusively on the programs running underneath those business processes.
“That master/slave relationship is drawing to an end,” says Koeppel. The reason for the sea change seems relatively simple on its surface: many companies now perceive that their data has more inherent business value than all the various processes and technologies necessary for managing that data.
“Historically, the IT industry was based on a process paradigm,” says Fred Balboni, who leads the big data and analytics practice in IBM Global Business Services. “You kept your eyes on the process. You were more interested in the pipeline than in what the pipeline was carrying.”
If data truly is the new oil, then what is the proper role of IT in a global economy that is fueled by data? Is IT analogous to the oil industry’s drilling equipment, pipelines, refineries, and gas stations? Or does IT have a strategic role to play in the age of big data? And if IT does indeed have a strategic role, what is the proper role of the CIO?
“IT has focused traditionally on building reports about events that happened in the past. Big data is now shifting the focus of IT. Instead of just looking backward, IT can develop the capabilities for looking forward,” says Clifton Triplett, a West Point graduate who has held senior executive IT posts at Baker Hughes, Motorola, General Motors, Allied Signal, and Entergy Services. “Going forward, IT will be far more instrumental in predicting future opportunities and strategies based on statistical information. As a result, IT will become much more important.” Shifting from a backward-looking to a more forward-looking role will require IT to change its view of data. “IT will have to understand data in a business context,” says Triplett. “IT will have to acquire new skills for managing and understanding data. Today, the average IT person doesn’t have those skills.”
In other words, IT must learn to perceive data in a way that’s similar to the way that the business perceives data. As IT develops more expertise and a deeper understanding of big data analytic techniques, “You will begin to see a stronger integration of IT and the business,” says Triplett.
Ashish Sinha, who leads the Data Warehouse Technologies group at MasterCard, agrees that IT is poised for a significant transformation. In its earliest days, IT was known as Management Information Systems (MIS). Over time, MIS has evolved from a purely reporting function into a fully fledged corporate department that oversees virtually every business process occurring within the modern enterprise.
Despite its enhanced scope and larger budget, IT still exerts relatively little influence on the development of business strategy. Thanks to big data, however, that “influence gap” is on the verge of vanishing. “Big data technology allows companies to harness the power of predictive analytics, extract value from their data, and monetize it,” says Sinha. For the first time in its history, IT has the potential to transform itself from a cost center into a profit center. For the CIO, the upside is clear: When you lead a department that makes money for the company, you get a seat at the table when strategy is discussed. If you are the CIO, big data is your new best friend.
Getting From Here to There
Leveraging big data analytics to transform the IT department from a cost center to a profit center will require new skills and capabilities. From Sinha’s perspective, IT departments should focus on developing or acquiring:
Data cataloging and storage techniques that allow easy access to data by analysts
Big data appliances, databases, data access, and data visualization tools
Capabilities for tapping external data sources
Information security awareness and process expertise required for designing data access procedures that comply with the company’s data privacy policies
“CIOs have to realize that they are responsible for protecting and managing an extremely valuable asset that if used properly can become a huge competitive advantage and if used improperly can lead to a disaster,” says Sinha.
The good news is that many of the skills and capabilities required for managing big data initiatives can be taught to existing staff, acquired by hiring people with analytic training, or “rented” from a rapidly expanding universe of consulting firms specializing in big data. And here’s more good news: getting up to speed on big data does not necessarily require the wholesale abandonment of legacy IT infrastructure. For example, you will not have to decommission your existing data warehouse.
“Some fundamental beliefs around data warehouses will have to change,” says Sinha. “The traditional model requires data in the warehouse to be ‘clean’ and ‘structured.’ We have to get comfortable with the idea that data can—and will—be ‘messy’ and ‘unstructured,’ and that we will have to use external data sources (which have traditionally not been pulled into enterprise data warehouses) in new and innovative ways that will translate the cacophony into a symphony.”
Behind and Beyond the Application
Eben Hewitt is the Chief Technology Officer at Choice Hotels International and the author of Cassandra: The Definitive Guide (O’Reilly, 2010). He believes that big data is driving a major change in the way that business users perceive IT. “In the past, the business saw traditional IT as the infrastructure group, the database administrators, and the application developers. Most users defined IT as the applications on their desktop, which makes sense because those applications are the user interface,” says Hewitt. “But the applications are only the window dressing around the data, and people tend to focus on details like whether the little button on the application is green or blue.”
The reality is that most people don’t really care about the applications. They care about their year-over-year sales numbers, how many prospects they have in the sales pipeline, how much revenue they’re generating—information that’s directly related to their job performance and their earnings. The arrival of big data, with its implicit promise to reveal useful details about customers and their buying behavior, suddenly makes IT a lot more interesting to many more people than it ever was before.
“Big data makes IT much more visible and much more interesting to the average user,” says Hewitt. “Now the conversation isn’t about the application and whether the little button is green or blue—the conversation is about the data. Now IT and the business are speaking the same language. The users are much more comfortable talking about data than they are talking about applications, and so they are more willing to talk to IT and engage in meaningful conversations.”
The consumerization of IT—which is shorthand for saying that everyone with a mobile phone or a tablet has become a programmer to one degree or another—has created a new class of empowered and sophisticated IT users. “There is no such thing as a business user who knows nothing about technology. That person no longer exists,” says Hewitt. “As a result, the conversations between the business and IT have become much more sophisticated and more technical. The business users ask us about the scalability of our servers. They ask us about web log data. They want to know how the data is aggregated. Those are conversations that wouldn’t have happened before big data.”
Whether you think that higher levels of user sophistication are a good thing or a bad thing is largely irrelevant. Today’s users have a better grasp of technology and zero tolerance for applications that don’t work or can’t deliver meaningful results. That’s the new normal, and woe betide the IT executive who doesn’t see that the game has changed.
“You need to start preparing now or you’ll be playing catch-up,” says Jonathan Reichental, the CIO of the City of Palo Alto. “The role of IT is changing fast in many positive ways. We’re not just the guys who buy servers and put them in racks. We’re adding new value by helping the C-suite and the line of service leaders see the invisible, to find hidden patterns and to make better decisions.”
Department directors increasingly rely on IT to provide information that enables them to improve services, increase efficiency, and manage costs. “This is a whole new role for the CIO. In the past, our job was deploying systems, doing the heavy lifting. Now it’s more about making sure that people have the data they need to do their jobs better,” says Reichental.
That’s not to suggest that IT infrastructure is going away anytime soon. Even if it moves off-premises, it still exists somewhere, and that means that the CIO will be responsible for making sure it’s delivering value to the business. Storage and tools aren’t likely to pose major headaches for the CIO, but understanding how all the various parts of the emerging big data infrastructure relate to one another will be important. Below is a simplified diagram of a generic big data analytics stack. A more detailed version of this diagram, originally proposed by David Smith of Revolution Analytics, can be found in a previous O’Reilly white paper, Real-Time Big Data Analytics: Emerging Architecture. The descriptions of each layer originally appeared in that paper.
At the foundation is the data layer. At this level you have structured data in an RDBMS, NoSQL, HBase, or Impala; unstructured data in Hadoop MapReduce; streaming data from the Web, social media, sensors, and operational systems; and limited capabilities for performing descriptive analytics. Tools such as Hive, HBase, Storm, and Spark also sit at this layer.
The analytics layer sits above the data layer. The analytics layer includes a production environment for deploying real-time scoring and dynamic analytics; a development environment for building models; and a local data mart that is updated periodically from the data layer, situated near the analytics engine to improve performance.
On top of the analytics layer is the integration layer. It is the “glue” that holds the end-user applications and analytics engines together, and it usually includes a rules engine or CEP engine and an API for dynamic analytics that “brokers” communication between app developers and data scientists.
The topmost layer is the decision layer. This is where “the rubber meets the road,” and it can include end-user applications such as desktop, mobile, and interactive web apps, as well as business intelligence software. This is the layer that most people “see.” It’s the layer at which business analysts, C-suite executives, and customers interact with the real-time big data analytics system.
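The four layers described above can be sketched in a few lines of Python. This is only an illustrative toy, not the architecture from the original diagram: the Purchase record, the average-spend "model," the 50.0 threshold, and the action names are all hypothetical stand-ins chosen to show how each layer hands its output to the layer above it.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Purchase:
    """Hypothetical record type standing in for rows from an RDBMS or a stream."""
    customer_id: str
    amount: float


def data_layer() -> list[Purchase]:
    # Data layer: in a real stack this would read from a database, Hadoop, or a stream.
    return [
        Purchase("c1", 120.0),
        Purchase("c1", 80.0),
        Purchase("c2", 15.0),
    ]


def analytics_layer(purchases: list[Purchase]) -> dict[str, float]:
    # Analytics layer: a toy "model" that scores each customer by average spend.
    by_customer: dict[str, list[float]] = {}
    for purchase in purchases:
        by_customer.setdefault(purchase.customer_id, []).append(purchase.amount)
    return {cid: mean(amounts) for cid, amounts in by_customer.items()}


def integration_layer(scores: dict[str, float], threshold: float = 50.0) -> dict[str, str]:
    # Integration layer: a minimal rules engine that turns scores into actions.
    # The threshold and action names are assumptions made for this sketch.
    return {
        cid: "offer_upgrade" if score >= threshold else "no_action"
        for cid, score in scores.items()
    }


def decision_layer(actions: dict[str, str]) -> list[str]:
    # Decision layer: the lines an end-user application might display.
    return [f"{cid}: {action}" for cid, action in sorted(actions.items())]


def run_stack() -> list[str]:
    return decision_layer(integration_layer(analytics_layer(data_layer())))


if __name__ == "__main__":
    for line in run_stack():
        print(line)
```

Each function depends only on the output of the one beneath it, which mirrors the point of the layered view: the end user sees the decision layer, while the data, analytics, and integration layers can change independently underneath.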
Investing in Big Data Infrastructure
One of the major attractions of Hadoop is that it’s an open source platform that runs on relatively inexpensive commodity hardware. That being said, no one is seriously suggesting that the costs of implementing big data solutions are trivial.
While it is true that many big data implementations do not require huge capital investments on the scale required for, say, buying a new ERP system or building a global e-commerce platform, developing the capabilities required for managing big data regularly on a commercial basis isn’t cheap. Companies such as Google, Facebook, Amazon, and Yahoo! have spent hundreds of millions of dollars building big data systems.
“The investment required for handling big data can be massive,” says José Carlos Eiras, the former CIO of General Motors Europe and the author of The Practical CIO (Wiley, 2010). Eiras quickly dispels the myth that big data operations can be simply “outsourced to the cloud,” noting that in Europe and other parts of the world, the use and movement of data is strictly regulated.
The existence of strong regulatory controls, along with growing concern about the potential misuse of big data, means that IT departments will be spending more time and more resources making sure that data is stored safely and securely. Eiras believes that big data will force IT to return to its original role of “guardian of all data,” a role that it had largely abandoned after the rise of e-commerce in the late 1990s and the first decade of the 21st century. “For the past 20 years, IT has tried to make the business units and the business users responsible for data,” says Eiras. “But a change is already occurring, especially in companies where big data is a source of big money. The responsibility for managing data is shifting back to the IT function.”
That doesn’t mean that CIOs will abandon their efforts to push non-essential IT operations into the cloud. It does mean, however, that the cloud is not some kind of quasi-magical “one size fits all” low-cost panacea for resolving big data challenges. In many respects, it makes sense for IT to reclaim its birthright as the primary manager of corporate data. Another and perhaps less kind way of saying it would be that big data has become too important to leave to the amateurs.
Even if that sounds harsh, it’s worth remembering that the job of the business user is achieving a specific business result and not worrying about the underlying technologies that make it all possible. Ideally, all of the various components and systems work together seamlessly to produce a desired result and should be largely invisible to the business user.
Providing seamless integration and interoperability of multiple tools and systems is clearly the domain of IT. “There is no single architecture that fits everything together,” says David Champagne, chief architect at Revolution Analytics. “You’re going to need a collection of analytics systems and an understanding of how all the various pieces work together. It’s not a black and white situation.” Champagne envisions hybrid architectures of multiple systems running in concert to generate results for users seeking specific types of information. “Not every piece of data is interesting. A lot of it is noise. You will have to figure out what kind of data you have and the kind of analytics you need to run,” he says.
The answers to those types of critical questions will provide the outline of the technology architecture required to handle your big data. It would be unwise, says Champagne, for organizations to skip ahead without answering the critical questions first. “The myth about big data is that you have to use all of it,” he says. “If your problem doesn’t require all the data, why use it?”
Determining which elements of your data sets are genuinely necessary for answering your questions will guide you toward making the right infrastructure investments. Again, this seems to be a task that is best handled by IT. CIOs are accustomed to wading through tidal waves of conflicting requests from business units. And CIOs know better than to blindly trust vendors promising all-encompassing solutions.
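Champagne’s advice to use only the data a question requires can be illustrated with a small Python sketch. The event records, field names, and the revenue-by-region question are hypothetical examples, not drawn from any of the organizations quoted here; the point is simply that the question determines which fields need to be carried forward.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable

# Hypothetical raw events; most fields are noise for the question asked below.
RAW_EVENTS = [
    {"region": "east", "revenue": 100.0, "user_agent": "Mozilla/5.0", "session_ms": 5321},
    {"region": "west", "revenue": 70.0, "user_agent": "Mozilla/5.0", "session_ms": 812},
    {"region": "east", "revenue": 50.0, "user_agent": "curl/8.0", "session_ms": 97},
]


def project(records: Iterable[dict], needed_fields: tuple[str, ...]) -> list[dict]:
    """Keep only the fields that the question at hand actually requires."""
    return [{field: record[field] for field in needed_fields} for record in records]


def revenue_by_region(records: Iterable[dict]) -> dict[str, float]:
    """Answer one specific question: total revenue per region."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["region"]] += record["revenue"]
    return dict(totals)


if __name__ == "__main__":
    # The question needs two fields, so only two fields are stored and processed.
    slim = project(RAW_EVENTS, ("region", "revenue"))
    print(revenue_by_region(slim))
```

Deciding up front that the question needs only region and revenue means the other fields never have to be moved, stored, or secured for this analysis, which is the kind of scoping that shapes infrastructure investment.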
Does the CIO Still Matter?
Mike Flowers is a rock star in the world of big data. As chief analytics officer for the City of New York, Flowers has won praise for using big data to track down polluters, identify unsafe housing, speed up permit processing, and generally improve the quality of life in the Big Apple. He attributes part of his success to the team of energetic young data scientists he has assembled and part to his close working relationship with the city’s CIO, Rahul Merchant. “I don’t make the infrastructure decisions,” says Flowers. “I explain what I need to deliver insights for improving operations, and Rahul provides the technology. It’s really a seamless conversation between operations, analytics, and IT. We’re all working together toward a common goal of providing better service across the city.”
The key to the city’s successful use of big data is that “seamless conversation” between critical stakeholders. Take one of them out of the loop, and the model falls apart. “Rahul is very smart and he spent a lot of time in the business sector. He knows that IT can’t deliver value without input from the key players. I would say that big data is definitely a team sport and that IT is absolutely essential,” says Flowers.
From Capex to Opex
Several of the IT executives interviewed for this white paper also suggested that a series of big data initiatives (which would necessarily include new investments in people, processes, and technology) would accelerate the transformation of IT from a slow-moving culture focused on managing its capital expenses (capex) to a more nimble culture focused on managing its operating expenses (opex).
That shift from a capex mentality to an opex mentality might not seem like a big deal. But in addition to forcing a dramatic change from past practices, it would position IT for a more prominent and substantive role in the formulation and execution of corporate strategy. It would also tend to slow the growth of so-called “shadow IT,” a relatively common phenomenon in which business units within the company lose patience with the CIO and begin making deals directly with IT suppliers. Because CIOs have traditionally focused on capex, which tends to be less flexible than opex, CIOs are perceived, rightly or wrongly, as being slow to capitalize on new technologies that would provide the company’s business units with advantages over the competitors in highly time-sensitive markets.
Gregory Fell, the former CIO of Terex Corp. and the author of Decoding the IT Value Problem (Wiley, 2013), puts it succinctly: “For quite a while, IT has been called ‘the office of no.’ Smart CIOs work hard at transforming IT into ‘the office of know.’ When you’re leading ‘the office of know,’ people come to you for help, instead of going around you.” Fell suggests that CIOs focus on understanding the real business value of big data solutions. “Big data enables us to answer questions that we couldn’t answer before,” says Fell, currently the chief strategy officer at Crisply LLC, a data services company. “But first we need to know how much money it costs to answer those questions.”
In the past, there were many questions that IT simply could not answer. “In the age of big data, the challenge is doing the cost-benefit analysis,” says Fell. “In other words, how much are you willing to spend to know the answer to your question?” Fell’s point is well taken, and it points to a maturation of the CIO’s role as a sort of corporate consigliere—someone who is trusted implicitly at the highest levels of the organization.
In any event, a significant expansion in the strategic role of IT would elevate the status of the CIO and would greatly enhance the bargaining power of IT during budget negotiations. So from the perspective of IT, big data would launch a virtuous circle of self-reinforcing benefits.
A More Nimble Mindset
Some experienced IT practitioners see today’s “big data versus traditional data” narrative as a variation on the “agile” versus “waterfall” software development narratives from the previous decade. There is some validity to the comparison, and it’s helpful to look at the similarities between agile methods and big data.
Both agile and big data reflect more of a user- or customer-centric view of the world than their decidedly product- or process-centric predecessors. Agile and big data both tend to push people out of their natural comfort zones and require higher levels of tolerance for ambiguity and uncertainty. Both, one could argue, are messier and less formally structured than what came before them. “IT people who are comfortable working in an agile environment are likely to be more comfortable working with big data than people who worked with a traditional software development lifecycle (SDLC) model,” says Jim Tosone, a former director of the Healthcare Informatics Group at Pfizer Pharmaceuticals. “Agile is about simplicity, speed, and pragmatism. With Agile, you learn to work with what you’ve got.”
Tosone recommends staffing big data initiatives with people who enjoy exploring problems and who don’t feel the need for immediate closure. “A lot of IT people have a very strong desire to find a solution very quickly and then move on to the next problem,” says Tosone. “But their need for closure cuts them off from the kinds of exploration that are often necessary for achieving true understanding of problems.”
Tosone is now a management coach and uses techniques that he learned when he joined an improvisation group earlier in his career at Pfizer. He is also an accomplished classical guitarist, and he sees a distinct connection between music, improv, and big data analytics. “With big data, it’s unlikely that you’re going to know exactly what you need up front,” says Tosone. “So you have to build flexible models and flexible tools. Working with big data requires right-brain and left-brain thinking. You need your whole brain working on the problem. The ideal data scientist is part mathematician and part musician.”
While that idealized description does not match most of today’s IT workforce, it would be a good model to keep in mind when hiring the next generation of IT employees. And even if you are not a strong proponent of agile methodology, the important message is that IT organizations need to become more nimble, more flexible, and more open to change in the age of big data.
Looking to the Future
It appears certain that big data and its associated technologies are destined to become an essential part of the CIO’s portfolio. Sooner or later, all new technologies fall under the purview of IT, and it seems unlikely that big data will be the exception. In the long term, it’s a sure bet that big data will evolve into a multidisciplinary practice, spread across various functional units of the enterprise. It’s entirely plausible that big data will become a standard element of integrated customer strategy and will disappear as a separate or specialized process.
Gary Reiner, the former CIO of General Electric and an operating partner at General Atlantic, foresees a close working relationship between the CIO and a newer type of executive, the chief analytics officer (CAO). “There will be a partnership between the CIO and the CAO. Or in some instances, the CIO will also act as the CAO. It will depend on the needs of the company and on the personality of the CIO,” says Reiner. “The point is that big data is an interdisciplinary effort. I’m a huge believer in cross-functional decision making and cross-functional collaboration. When single functions take sole responsibility for important projects, they are rarely successful. Everyone has to work together.”
At the most fundamental level, big data is likely to create greater demand for IT people who are familiar with data analytics. Demand for skills related to server management and network administration is likely to decline. The need for traditional IT skills won’t vanish entirely, but those skills will recede in significance as big data moves into the foreground.
While it’s difficult to predict exactly which skills will be in demand two or three years from now, it seems reasonably certain that companies will be looking for people who understand data, understand business, and can write software code.
“If I could say what the next big skill to have is, I’d be taking a class in it right now,” says Sarah Henochowicz, manager of business intelligence at Tumblr. “In my role, knowing SQL and Python are the most valuable skills. I use R a little bit. Honestly, when it gets down to what language you’re programming in, I think that being able to understand the logic is the most important piece. Not necessarily knowing the syntax, but knowing the structure behind it. Once you know a programming language, it’s not that difficult switching to another language. A lot of programs can do the same things, and there are tons of resources online that can help you figure them out. But understanding the logic is what’s really important.”
Now Is the Time to Prepare
As new technologies emerge, the role of the CIO changes. An ongoing challenge for all CIOs is to figure out which new technologies require immediate responses and which can wait. Most CIOs are still in the process of developing strategies for cloud, mobile, and social computing. Security and regulatory compliance are major headaches. Many CIOs spend a significant portion of their time managing a wide range of IT service vendors and outsourcers. CIOs are under pressure to continue cutting costs while adding new services and retraining IT staffers to become more “customer centric.” For most CIOs, big data has not yet risen to the level of an immediate crisis.
That perception could easily change, just as perceptions about the value of e-commerce changed rapidly following the successes of companies such as Amazon and eBay. There is no reason to assume that big data won’t be the “next big thing.”
In any event, CIOs are likely to be questioned by their C-level colleagues about big data, so it would make sense to be prepared. At minimum, CIOs should read about big data, attend conferences, and listen to sales presentations from big data vendors. Smart CIOs will begin hiring deputies who understand the various ways in which big data can help the business. Ideally, CIOs will encourage the hiring of IT staffers who understand database technology, have relevant business experience, and are comfortable writing code.
It seems certain that big data will have its day. For CIOs, the big question is, “Will we be ready when that day arrives?”