As fast as hardware is improving, it still cannot always meet user expectations. Computer applications are growing more intelligent and easier to use, but as a result they also require more resources to run. Additionally, computer applications are being deployed to more and more people living in every part of the world. This inevitably leads to problems of scale: applications may run fine in development, but run in fits and starts when deployed on a widespread basis.
Problems of scalability are some of the most difficult that application developers face. There are three facets to scalability:
The problems computers must solve characteristically become more difficult to compute as the number of elements involved grows.
People often have unrealistically high expectations about what computer technology can actually do.
There are sometimes absolute physical constraints on performance that cannot be avoided. This applies to long-distance telecommunications as much as it does to signal propagation inside a chip or across a communications bus.
We discuss these three factors in the following sections.
There are “hard” problems for computers that have a way of getting increasingly complicated as the problem space itself grows larger. A classic example is the traveling salesman problem, where the computer calculates the most efficient route through multiple cities. This can be accomplished by brute force with a small number of cities, for example, by generating all the possible routes. As the number of cities increases, however, the number of possible routes the computer must generate and examine grows factorially, even faster than exponential growth. In the mathematics of computing, problems of this character are known as NP-complete. With many NP-complete problems, as the problem space grows more complicated, the computational effort using known methods explodes far beyond what even the fastest computers can do.
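A small Python sketch (our own illustration, not production code) makes this combinatorial explosion concrete. It counts the distinct round-trip routes for a symmetric traveling salesman problem and shows a brute-force search that examines every one of them; the function names and the toy distance matrix are ours.

```python
import itertools
import math

def route_count(n_cities: int) -> int:
    """Distinct round-trip routes from a fixed starting city,
    assuming symmetric distances: (n - 1)! / 2."""
    return math.factorial(n_cities - 1) // 2

def brute_force_tsp(dist):
    """Exhaustively try every tour that starts and ends at city 0.
    dist is a symmetric matrix of inter-city distances."""
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

# The search space explodes: 5 cities yield 12 routes,
# 15 cities roughly 43 billion.
for n in (5, 10, 15):
    print(n, route_count(n))
```

Adding a single city multiplies the number of routes to examine, which is why brute force collapses so quickly as the problem grows.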
An application such as a search engine for the Internet provides a good example of something that works fine with a certain amount of data but collapses under the impact of more and more data points. No doubt, at one point in time, search engines seemed like a wonderful way to navigate the World Wide Web. Today, searching for web pages related to “Computer Performance” on AltaVista, we found 3,000+ matches. Meanwhile, on Northern Light we found 13,000+ matches, and on Lycos we found almost 9,000 matches. Frankly, we are not sure whether it is better to find 3,000 matches or 13,000; we certainly do not have time to look through thousands of search engine entries. Clearly, search engine technology has trouble dealing with the current scale of the Internet.
Commercial data processing workloads encounter similar problems of scalability. For example, sorting, a staple of commercial workloads, scales worse than linearly: the best general-purpose sorting algorithms require on the order of n log n comparisons, so doubling the amount of data more than doubles the work. Building very large database applications that can scale to high volumes of activity and large numbers of users is also challenging, but in a different way. The performance issues that affect scalability in database and transaction processing workloads typically involve serialization and locking. When one transaction updates the database, the portions of the database affected by the update are locked. Other transactions that attempt to access the same portion of the database are delayed until the update completes. Updates can cause powerful parallel processing modes to revert to much slower serial ones.
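The sorting claim is easy to verify empirically. The sketch below, our own illustration, instruments a standard merge sort to count comparisons; doubling the input size consistently more than doubles the count, as the n log n growth predicts.

```python
import random

def merge_sort(items, counter):
    """Standard merge sort, instrumented to count comparisons
    in counter[0] (a one-element list used as a mutable cell)."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

for n in (1_000, 2_000, 4_000):
    counter = [0]
    merge_sort([random.random() for _ in range(n)], counter)
    print(n, counter[0])   # doubling n more than doubles the comparisons
```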
The implications of serialization and locking in the database world are very complicated. One update may trigger further activity when the indexes that speed access to the data themselves require updating. To ensure data integrity, certain update transactions may be forced to hold locks for long periods of time until several different portions of the database are updated. Database locking and serialization problems can typically be addressed only at the design stage. As an aside, transaction processing applications built with rapid application development (RAD) and other prototyping tools are often prone to these problems of scale. Using these development tools, the application is described in a logical fashion and then generated without regard to performance considerations. The application will demo well on a low volume of data, but user expectations are later shattered when the prototype does not scale beyond a small number of users, and the entire project then has to be rewritten by hand in C++.
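A toy sketch in Python, again our own illustration rather than any particular DBMS's mechanism, shows how an exclusive lock forces would-be parallel updates into serial execution. The lock here stands in for a row or page lock, and the sleep stands in for the index maintenance and I/O a real update performs while holding it.

```python
import threading
import time

row_lock = threading.Lock()       # stands in for a row/page lock in a DBMS
balance = {"account": 100}

def update_transaction(amount, hold_seconds):
    # While one transaction holds the lock, every other transaction
    # touching the same row must wait: parallelism degrades to
    # serial execution, but the data stays consistent.
    with row_lock:
        current = balance["account"]
        time.sleep(hold_seconds)  # simulate index maintenance, I/O
        balance["account"] = current + amount

start = time.perf_counter()
threads = [threading.Thread(target=update_transaction, args=(10, 0.1))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(balance["account"])   # 140: no lost updates...
print(round(elapsed, 1))    # ...but ~0.4 s: four 0.1 s holds ran serially
```

Removing the lock would let the four updates overlap, but interleaved read-modify-write sequences could then lose updates, which is precisely the integrity-versus-concurrency trade-off described above.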
The second aspect of the current application development environment that leads to performance problems is that expectations about what computer technology can and cannot do are often far removed from reality. It does not help that these inflated expectations are fueled by vendor claims of breakthroughs in applications such as handwriting analysis, speech recognition, and natural language processing. These are all activities that require years for humans to master, and none are easily translated into the logical, symbolic processing that computers excel at. No computer program is yet able to comprehend a simple bedtime story that we might read to our young children (let alone interpret the pictures). Even we may not completely understand some of these stories! A good tale’s purposeful ambiguity certainly cannot be reduced to simple logic that a computer can process and understand.
But because computers can perform so many remarkable things—keeping track of the family finances, playing a good hand of bridge, or simulating an alternative universe—it is difficult for many people to accept that they have limitations. Moreover, the pace of application development continues to be excruciatingly slow, frustrating the best intentions of the people who make software development tools. Consequently, the demand for new applications far exceeds the rate at which developers can supply them. In highly charged, competitive business situations where people are desperate for relief, it is not surprising that mistakes in judgment occur. Decision-makers are easily tempted to believe that there is a solution to their problems when, in fact, no technical solution actually exists.
Sometimes the constraints on what computers can do involve simple physical limitations. There are molecular limitations on how small the circuits on a silicon wafer can be fabricated or how close the magnetized bits on a rotating disk platter can be stored. An electromagnetic or optical signal carried through a wire travels at anywhere from 0.50 to 0.75 times the speed of light. Inside a computer, it takes on the order of 0.1 microseconds for the processor to signal a memory board located a mere 10 feet away. Signal propagation delay is one of the important reasons why processing speeds improve as semiconductor density increases: denser circuits mean shorter distances for signals to travel.
Physical constraints limit the speed of long-distance communications, which affects networking applications that must traverse long distances. A 3000-mile data transmission over a wire requires about 25 milliseconds. But signals attenuate over long distances, and they are seldom transmitted from point to point. Every so often you need a repeater to process the signal, amplify it, and send it on, which means additional delays. If your signal is being routed over the public Internet backbone network, it can be routed from New York to Los Angeles via Atlanta, Denver, and San Francisco, easily traversing two or three times the distance of a direct point-to-point link. If you are sending a message via satellite, you must factor in the distance the signal must travel through outer space and back. Transmission delays over long distances simply cannot be avoided because the speed of light functions as an absolute upper bound on performance.
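The arithmetic behind these delays is simple enough to work out directly. The sketch below, our own back-of-the-envelope calculation, computes the minimum one-way propagation delay for a point-to-point link; the 0.6 propagation factor is a representative value from the 0.50-0.75 range cited above.

```python
SPEED_OF_LIGHT_MPS = 299_792_458   # metres per second, in vacuum
PROPAGATION_FACTOR = 0.6           # typical for copper/fiber: 0.5-0.75 of c
METRES_PER_MILE = 1609.344

def one_way_delay_ms(miles: float, factor: float = PROPAGATION_FACTOR) -> float:
    """Minimum one-way propagation delay over a point-to-point link,
    ignoring repeaters, routing detours, and queuing delays."""
    metres = miles * METRES_PER_MILE
    return metres / (SPEED_OF_LIGHT_MPS * factor) * 1000

print(round(one_way_delay_ms(3000), 1))   # ~27 ms coast to coast

# A geosynchronous satellite orbits about 22,300 miles up, so a
# one-way hop travels that distance twice, through space at c:
print(round(one_way_delay_ms(2 * 22_300, factor=1.0)))  # ~239 ms
```

Note that these are lower bounds: repeaters, indirect routing, and queuing only add to them, which is the point of the paragraph above.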
Performance considerations are important in desktop systems, distributed and networked systems, and client/server applications today. Performance is one of the core system management disciplines, along with operations, storage management, change management, etc., and performance considerations play an important role in many system management decisions. One reason cost/performance considerations are important is that they can be quantified, unlike fuzzier concepts like ease of use. Because you can put a price tag on a system configured to achieve a particular level of performance, it is possible to do a cost/benefit analysis, which is a genuine aid to decision-making.
Consider an organization’s decision to upgrade 1000 desktop PCs running Windows 95 to either Windows ME or 2000. Why not benchmark some of the proposed configurations running the applications that are normally in use? Comparing Windows ME to Windows 2000 Professional, you find that Windows ME outperforms Windows 2000 with 32 MB and 64 MB configurations. But at 128 MB and higher configurations, Windows 2000 Professional provides better performance. These results allow you to quantify the cost of moving to either alternative. While the cost of adding memory and, perhaps, disk space to a single machine may not amount to much, multiply by a factor of 1000 and the decision to upgrade involves six or seven figures. Clearly, performance is not the only criterion factored into the final decision, and it might not even be the most important factor. For instance, if Windows 2000 is not compatible with a critical piece of software, that might be enough to sway your decision towards Windows ME. On the other hand, there is a no-nonsense, bottom-line orientation in performance analysis that appeals to many executive decision-makers.
The horror stories of failed development projects that did not meet cost and performance specifications reflect the fact that expectations about what computer technology can do far exceed the reality. Large-scale application development is very demanding. In our judgment, even as hardware performance continues to improve over the next 10-20 years, development will not get easier. In the future, the computer industry will continue to be challenged to provide cost-effective solutions to even more difficult problems. For example, we expect that multimedia applications with voice, high-resolution graphics, and animation will continue to be plagued by performance issues in the decade ahead.
To take an example, we were recently asked to help size a medical imaging application for a large, state-wide health care organization. The size of these high-resolution imaging files ranged from 100-300 MB each. Physicians wanted to be able to access image files and discuss them with other specialists located in different parts of the statewide association. For each patient’s case, the physicians might need to review multiple images before they could make a competent diagnosis. They wanted to be able to view the image files concurrently and make notes and annotate them interactively. Being very busy, they wished to be able to perform all these tasks in real time at locations convenient to them. Naturally, the budget for this new application was constrained by the realities of the health care financing system.
Of course, the physicians needed to view medical image files in a high-resolution mode to make a competent diagnosis. Immediately, we encounter capacity constraints because digital imaging and display technology today runs at a much lower resolution than hardcopy x-ray photographs, for example. Taking this application, which demonstrably runs fine on a dedicated high-powered workstation, and adding the element of long-distance file access creates distinct difficulties. The core of the problem is that very large, high-resolution image files created in one location need to be copied very rapidly and transmitted across existing telecommunication links. So, while this application seems like something that ought to be pretty simple, especially after viewing the vendor’s standalone workstation demonstration, there is more to it than meets the eye.
While techniques like caching recently used images on a local server’s disk could improve performance of the application once the image is loaded, there is no avoiding the fact that moving files that large across a wide-area network (WAN) link is slow. Either the organization must upgrade its communications links, or the users of the system must learn to accept some reasonable limitations on the type of service that this application can deliver. One such limitation might be that a consultation session requires some advance notice, say one hour, with a setup procedure to identify the images to be transferred.
Various cost/performance alternatives are customarily called service levels, where a process of negotiation between the user departments and the technical support team is used to reach a service level agreement. Such an agreement formalizes the expected relationship between system load, response time, and cost for a given application. We will have more to say on this subject later in this chapter.
Often, when the system is too slow and the productivity of the end users is negatively impacted, it is back to the drawing board to overhaul the application. Under these circumstances, it may be the performance analyst’s job to persuade the users that these are, in fact, reasonable limitations consistent with the cost to run the application and the current state of technology. When people understand that there is a price tag associated with their dream system, they tend to adopt a more sensible attitude. A few simple charts can be used to express the relationship between file transfer time and file size across various communication links. Then the costs associated with using the existing T1 network links, upgrading to DSL or DS-3, or resorting to some other more exotic and expensive alternative can be compared and contrasted. If the performance analyst does a good job of laying out these alternatives, an agreement can usually be reached amicably.
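A chart of that kind is trivial to generate. The sketch below, our own illustration using the medical imaging file sizes from the earlier example, tabulates best-case transfer times over a few nominal link speeds; real links deliver less once protocol overhead and contention are factored in.

```python
LINKS_MBPS = {            # nominal link capacities, megabits per second
    "T1":   1.544,
    "DS-3": 44.736,
    "OC-3": 155.52,
}
FILE_SIZES_MB = [100, 200, 300]   # image file sizes from the example

def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Best-case transfer time: file size in megabits divided by
    link capacity. Ignores protocol overhead and contention."""
    return size_mb * 8 / link_mbps

print(f"{'link':>6}" + "".join(f"{s:>9} MB" for s in FILE_SIZES_MB))
for name, mbps in LINKS_MBPS.items():
    row = "".join(f"{transfer_seconds(s, mbps):>11.0f}s" for s in FILE_SIZES_MB)
    print(f"{name:>6}{row}")
```

A 300 MB image over a T1 link takes roughly 26 minutes even in the best case, which puts the one-hour advance notice discussed earlier in perspective.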
If unsatisfactory performance is the result of poor design decisions, the performance analyst may be called upon to assist the developers responsible for re-engineering the application. In any event, it is during these times of perceived performance problems that the performance analyst is challenged to help the organization cope with adversity. There will be difficult decisions to be made in this hour of crisis that will benefit from concise and objective analysis.
The best argument for the continued relevance of performance analysis today is the Internet. Growth in the use of the Internet is one of those phenomena that few predicted. What started out as a network of interconnected computers tied together as a way for researchers at the Department of Defense and various universities to communicate has within a few short years become a major public access network that everyone with a computer and a modem can use. That explosive growth has led to extremely serious capacity problems heightened by the highly visible nature of this network. It is front-page news in USA Today or the Wall Street Journal when major performance problems on the Internet are exposed.
Internet traffic is processed by a set of backbone service providers who switch packets onto the public phone system. In 1996, digital traffic on the public phone system exceeded analog voice traffic for the first time. Moreover, Internet data traffic has continued to grow at an unprecedented rate, doubling approximately every two to three months, according to some experts. Obviously, network capacity planning in the Internet Age is a considerable challenge.
Designated Internet service providers (ISPs) use banks of routers to examine packets and switch them accordingly. The underlying network protocol used to support the Internet is known as TCP/IP, which is being used today in ways its original designers could not have anticipated; for example, voice and video data streaming traffic over TCP/IP was not even contemplated when the protocol was conceived. In router design, if there is an overload, IP packets are simply dropped rather than queued at the router. This technique relies on the fact that TCP will ultimately recover and retransmit the information. Meanwhile, neighboring routers are sensitive to the fact that packets dropped at one node may seek alternative routes. The net result is that an isolated, overloaded resource sometimes triggers what are characterized as “flares” or, worse, more widespread “storms” that can disrupt service across an entire region.
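The drop-rather-than-queue behavior is easy to model. The sketch below is a toy drop-tail queue of our own devising, not any vendor's router implementation; it simply illustrates that once the outbound queue fills, arriving packets are discarded and recovery is left to TCP at the endpoints.

```python
from collections import deque

class DropTailRouter:
    """Toy model of drop-tail queuing: when the outbound queue is
    full, newly arriving packets are discarded outright, and TCP
    at the endpoints must detect the loss and retransmit."""

    def __init__(self, queue_limit: int):
        self.queue = deque()
        self.queue_limit = queue_limit
        self.dropped = 0

    def receive(self, packet):
        if len(self.queue) >= self.queue_limit:
            self.dropped += 1     # overload: drop, don't wait
        else:
            self.queue.append(packet)

    def forward(self):
        """Transmit the oldest queued packet, if any."""
        return self.queue.popleft() if self.queue else None

router = DropTailRouter(queue_limit=8)
for seq in range(20):             # a burst of 20 packets hits a queue of 8
    router.receive(seq)
print(len(router.queue), router.dropped)   # 8 queued, 12 dropped
```

Every dropped packet eventually comes back as a retransmission, and retransmissions rerouted around the congested node load its neighbors, which is the mechanism behind the flares and storms described above.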
Identifying the root cause of these problems is neither easy nor straightforward. Many different vendors, including various telephone companies and ISPs, “own” portions of the Internet backbone. Even as these vendors are scrambling to add capacity to fix these problems, there is considerable jockeying among these vendors as to the root causes of the problem and how to pay for fixing them. Aspects of web server design, web browser design, and the TCP/IP protocol itself have been identified as contributing factors.
The flat-fee arrangements most users have with consumer-oriented ISPs for access privileges also contribute to the problem. People browsing the Internet are the only class of consumer on the public-access telephone network that does not pay for service based on the traffic it generates. As Internet users run more bandwidth-hogging applications like MP3 audio, telephony, and on-demand video streaming, this structural problem interferes with the ability of ISPs to make rational resource allocation decisions, undercutting the industry’s Quality of Service (QoS) initiatives.
The performance problems that the Internet faces today are classic problems of scalability:
Technology that worked extremely well in prototype environments but has difficulty adapting to larger-scale workloads
Outrageous user expectations about web browser performance, with multimedia applications attempting to squeeze large amounts of data across low-speed modem links
The simple physical constraints involved in long-distance communications
However, the most interesting aspect of the plague of Internet performance problems is their high profile and visibility. Most of the performance problems we encounter are private, confined to a single company or user population. However, the Internet is a public network with tens of millions of users, and problems on the Internet are visible to the multitudes. In spite of the evident questions and concerns, there are a large number of companies that rely on the Internet to host basic business and commercial services. Given that a company with a web site has little influence over the factors that determine performance for customers who access that web site, the prudent thing to do is to hedge your bets. Many experts recommend augmenting the public network with private lines and switching gear that can be monitored and controlled with more precision.
If there is any lesson in all this, it is that performance management and capacity planning problems are not disappearing despite the miraculous improvements in computer hardware. If anything, larger-scale application deployment, the dispersion of applications across wide-area communication networks and heterogeneous systems, and an increased rate of change are making performance problems even more difficult to diagnose and solve today. Chapter 11 and Chapter 12 provide an in-depth discussion of large-scale networking and Internet performance issues, respectively. Satisfied that our time invested in understanding computer performance issues will be worthwhile in the years ahead, we can profitably turn our attention to the practical matter of assessing and using the tools of the trade.
Understanding a simple bedtime story belongs to a class of problems that evidently cannot be solved no matter how much computing power we throw at them. Much effort has been expended over the years debating whether certain problems that humans apparently solve with relative ease are even amenable to solution using the mathematical-logical processes inside a computer. Although very little progress toward finding a computerized solution to this class of problem has been made in some 50 years of research into Artificial Intelligence, many people devoted to the field are reluctant to give up just yet.