From Insight-as-a-Service to insightful applications
Applications that combine machine learning, AI, and domain knowledge have strong potential for industry and investors.
We started this series with the premise that insightful applications, essentially the next generation of big data applications, are the key to effectively addressing important problems, such as autonomous driving. In the first post, we examined how data analytics has evolved over the past 25 years. In the second post, we established how the early success with big data infrastructures has given rise to applications, and we defined a taxonomy of such applications. In this third post, we discuss insightful applications in more detail and conclude by outlining the future of such applications, along with the sector’s investment potential.
A discussion of insightful applications must begin by defining what an insight is. An insight is a novel, interesting, plausible, and understandable relation, or set of associated relations, that is selected from a larger set of relations derived from a data set. The selected relation, or relations, must lead to the formation of an action plan. This action plan, when applied, results in a change that can be measured through a set of Key Performance Indicators (KPIs). The ability to measure KPIs enables such systems to learn and improve over time. Insights and their associated action plans are generated by complex systems, which were described in an earlier post.
Insightful applications combine big data management with an insight generation system, which incorporates a variety of machine intelligence techniques (e.g., machine learning, planning and reasoning, natural language processing, vision, and other AI techniques) and the domain knowledge those techniques need to operate effectively. These applications must not only be able to generate an action plan from insights, but must also be able to estimate the resources (e.g., time, money) that executing that plan will consume.
A good example of an insightful application is found in autonomous cars, such as Google’s self-driving car. In order for an autonomous car to function properly, its software must be able to deal with very large volumes of diverse data that are generated in real time from sensors like lidar; make a variety of predictions using numerous sophisticated models; select the most interesting and plausible of these predictions (e.g., within 50 yards, a bicyclist riding ahead of the car will likely enter the car’s lane); and generate an action plan based on these insights (e.g., slow the car by applying the brakes) while constantly evaluating the results of these actions (e.g., make sure the car does not hit the bicyclist). Based on the data that is collected from this incident and the evaluation of the plan that was executed, there is an opportunity for the insightful application to learn.
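The select-then-act step in the car example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real autonomous driving stack works: the `Prediction` class, the `plausibility` and `interest` scores, the 0.5 threshold, and the rule of ranking by their product are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    """One relation produced by a predictive model (hypothetical structure)."""
    description: str
    plausibility: float  # assumed score in [0, 1]: how likely the prediction is
    interest: float      # assumed score in [0, 1]: how much it matters right now
    action: str          # action plan associated with this prediction


def select_insight(predictions: List[Prediction],
                   min_plausibility: float = 0.5) -> Optional[Prediction]:
    """Pick the most interesting prediction among the plausible ones.

    Ranking by plausibility * interest is an assumption made for this
    sketch; a real system would use a far richer selection policy.
    """
    plausible = [p for p in predictions if p.plausibility >= min_plausibility]
    if not plausible:
        return None
    return max(plausible, key=lambda p: p.plausibility * p.interest)


predictions = [
    Prediction("bicyclist will enter the car's lane within 50 yards",
               plausibility=0.8, interest=0.9, action="apply brakes"),
    Prediction("parked car will pull out",
               plausibility=0.3, interest=0.9, action="apply brakes"),
    Prediction("traffic light will stay green",
               plausibility=0.95, interest=0.1, action="maintain speed"),
]

insight = select_insight(predictions)
if insight is not None:
    print(insight.action)  # prints "apply brakes"
```

The missing piece in this sketch is the feedback loop: after the action is executed, the measured outcome (for example, the minimum distance kept from the bicyclist) would serve as the KPI from which the application learns.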
Because of their complexity, insight generation systems—and therefore insightful applications—have been hard and expensive to develop, and few insightful applications have been introduced to date. At best, corporations have been utilizing what we have termed “Insight-as-a-Service.” As shown in Figure 1, insights and their associated action plans are generated periodically as part of a service offered by specialized personnel, called “connectors,” in collaboration with data scientists.
At a high level, a connector works with the business user to understand the problem to which insights and actions will be applied. In addition to defining the business problem, the connector gains an understanding of the data that is available and can be used to address the problem. The business problem definition combined with the data description is then “translated” to a data problem, which can be addressed by data scientists. Data scientists, with the help of data stewards, work on the available data to extract patterns and other relations. The connector evaluates the extracted patterns and relations to identify those that constitute insights. The connector then proceeds to associate one or more appropriate action plans with each identified insight, and presents the resulting pairs to the business user. After applying the prescribed actions, the business user communicates to the connector the results of these actions, along with the effectiveness of the actions to address the problem being solved (as determined through the use of a previously defined set of KPIs).
Because of their role in providing Insight-as-a-Service, connectors must have strong knowledge of the processes, technologies, and issues in the industries in which they work. They must be able to “translate” business problems into data problems, and effectively communicate this translation to the data scientists. Lastly, connectors must be able to fuse the relations extracted by the data scientists with their own domain knowledge to generate insights and action plans, which will be communicated to business users.
Today, Insight-as-a-Service is offered primarily by management consulting firms like IBM, PwC, McKinsey, and Accenture. Because of the skills required and the critical role they play in the insight generation process, connectors tend to be some of the most expensive resources a management consulting firm employs. There are too few such individuals in management consulting firms today. This means that too few corporations are generating insights from their data, and those that do tend to focus on areas with potential for a high return on investment (ROI), such as cybersecurity, supply chain optimization, and marketing budget optimization.
Consider the process a cable company might go through in order to optimize its marketing budget as it tries to determine what percentage of its budget to allocate toward reducing customer churn. The marketing department must create a set of actions that are within the limits of the marketing budget. Using his specialized knowledge, the connector must understand the nature of the churn problem and the data that is available to address this problem. He may determine that, as a first step, the company’s subscribers should be scored in terms of their probability of abandoning their service, and maybe even their probability of upgrading to a higher level of service. He must then present this as a scoring problem to the data scientist, along with the available data. The data scientist can then proceed to develop the appropriate scoring model(s).
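To make the scoring step concrete, here is a minimal Python sketch of what a churn score could look like. Everything specific in it is an assumption for illustration: the `Subscriber` fields, the logistic form, and especially the coefficients in `WEIGHTS`, which are invented rather than fitted. In practice the data scientist would learn such a model from the company's historical churn data.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Subscriber:
    """Hypothetical subscriber record; field names are illustrative only."""
    subscriber_id: str
    months_active: int
    support_calls_90d: int
    monthly_bill: float


# Invented coefficients for the sketch. A real model would be fitted
# to historical data rather than written by hand.
WEIGHTS = {
    "intercept": -1.0,
    "months_active": -0.05,     # longer tenure lowers churn risk
    "support_calls_90d": 0.6,   # frequent support calls raise it
    "monthly_bill": 0.01,       # higher bills raise it slightly
}


def churn_probability(s: Subscriber) -> float:
    """Logistic score in (0, 1): estimated probability of abandoning service."""
    z = (WEIGHTS["intercept"]
         + WEIGHTS["months_active"] * s.months_active
         + WEIGHTS["support_calls_90d"] * s.support_calls_90d
         + WEIGHTS["monthly_bill"] * s.monthly_bill)
    return 1.0 / (1.0 + math.exp(-z))


def rank_by_churn_risk(subscribers: List[Subscriber]) -> List[Subscriber]:
    """Return subscribers ordered from highest to lowest churn risk."""
    return sorted(subscribers, key=churn_probability, reverse=True)


subscribers = [
    Subscriber("B", months_active=60, support_calls_90d=0, monthly_bill=50.0),
    Subscriber("A", months_active=2, support_calls_90d=4, monthly_bill=120.0),
]

for s in rank_by_churn_risk(subscribers):
    print(s.subscriber_id, round(churn_probability(s), 3))
```

The ranked scores are what the data scientist hands back; deciding which subscribers to target, and with which offers, within the marketing budget remains the connector's job.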
Once the models are developed and the scores have been established, the connector uses his domain knowledge to do the following:
There is increasing interest in developing, and funding, insightful applications. We’re seeing this increased interest because we have a greater understanding of what is involved in providing Insight-as-a-Service and of the strong ROI that such solutions already provide. We’re also aware of the early successes with other types of big data applications, as described in the second post of this series, and the ever-increasing availability of big data. Finally, we acknowledge recent advances in the technologies that are used by insight generation systems and, more importantly, the strong demand for more automated, accurate, and faster decision-making.
Areas where we are seeing great opportunity for insightful applications, as well as entrants starting to offer such applications, include:
These areas share the following common characteristics, which make them particularly relevant as insightful applications:
In general, my thesis is that, for a fixed level of accuracy, insightful applications allow us to use lower-cost resources, including less computation, because more data is available.
An interesting early example of an insightful application is IBM’s Watson Oncology Assistant, which IBM is developing in collaboration with various medical centers such as the MD Anderson Cancer Center and the Memorial Sloan Kettering Cancer Center. This insightful application optimizes the decision-making process that determines which patient therapy to recommend based on a) patient genetic data that is generated from DNA sequencing, b) MRI data, c) other documents describing the patient’s health history, and d) scientific publications. For example, while DNA sequencing enables us to identify mutations linked to cancers, the data that is generated in the process is voluminous. Many physicians cannot easily make use of all available data because of its volume and variety, and often because they may not be aware of publications describing relevant research results. The Watson Oncology Assistant will be able to address all these problems and make the appropriate therapy recommendations to the attending physician.
To achieve their objectives, insightful applications combine big data management with artificial intelligence concepts and systems. In particular, insightful applications include:
The complexity and cost of developing insightful applications have been decreasing significantly, and this trend is expected to continue for the foreseeable future. This is because artificial intelligence systems have become more readily available and better understood, particularly with the release of open source packages from Google, Facebook, Microsoft, and other companies. Additionally, cheaper storage, abundant and cheap processing power and networking bandwidth, and cloud-enabled separation of storage and computing have helped drive the development of insightful applications.
Despite our improving understanding of insight generation, the technology advances we have made, and the growing number of insightful applications currently under development, we are not completely out of the woods. There are four major areas in particular where we need to make progress:
For the reasons we have previously described, while insightful applications present an exciting investment opportunity, they are complex and difficult to develop. As a result, we don’t foresee the development of such applications leading to the complete elimination of connectors and data scientists.
In the second post of this series, we noted that over the past few years, venture investors have been investing in three types of big data applications: shallow applications that use general-purpose analytic tools, applications that process big data but that do not use predictive or prescriptive analytics, and applications that use embedded predictive analytics. More recently, as they have recognized the importance and economic value of successful insightful applications, a few venture investors are starting to invest in startups that develop insightful applications. Because I have come to recognize how critical it will be for corporations to utilize big data through insightful applications in their effort to innovate, particularly using my startup-driven innovation methodology, I am focusing my new venture fund on startups that develop such applications.
We anticipate that insightful applications will be developed over a few different generations, with the ultimate goal of the application completing 70% of the process and humans (including data scientists, connectors, and business users) completing the remaining 30%. Today’s first-generation insightful applications are able to assist connectors and data scientists. Next-generation applications will be better able to understand situations automatically. IBM, for example, is in the process of equipping Watson with sophisticated natural language processing technologies that can automatically understand and encode domain knowledge from a variety of sources, such as journal articles and spoken problem descriptions. The generation after that will be able to make decisions more autonomously, matching libraries of insights and action plans to descriptions of new problems. The final generation of insightful applications will be able to discover new insights and action plans on their own, with limited guidance from expert users.
Insightful applications are the key to effectively providing big data-driven solutions to many important problems while simultaneously controlling the costs of such solutions and dealing with the shortage of the necessary specialized personnel. Because of their complexity, the development of such applications will be neither simple nor quick. Patience will be necessary, as we anticipate that the promise of insightful applications will be realized in several generations of increasingly sophisticated and increasingly automated applications. Recognizing the opportunity afforded by such applications, a few corporations and venture investors have started aggressively investing in their development. The initial results are already impressive and fill us with excitement about what will be possible in the near future.