Now that we’ve defined some terms that can be used to discuss distributed applications, we can start to look at what goes into developing these applications. In this section we’ll discuss some of the issues that you face when developing distributed systems, and what kinds of tools and capabilities you’ll need in order to address these issues. The next section will describe how Java provides these tools and capabilities.
If you think of the computer hosts and network connections available for a distributed application to use as a “virtual machine,” then one of the primary tasks you have is to engineer an optimal mapping of processes, objects, threads and agents to the various parts of this virtual machine. In some cases, a straightforward client/server partitioning based on data requirements can be used. Computational tasks can be distributed based on the data needs of the application: maximize local data needed for processing, and minimize data transfers over the network. In other, more compute-intensive applications, you can partition the system based upon the functional requirements of the system, with data mapped to the most logical compute host. This method of partitioning is especially useful when the overhead associated with data transfers is negligible compared to the computing time spent at the various hosts.
In the best of all possible worlds, you could develop modules based upon either data- or functionally driven partitioning. You could then distribute these modules as needed throughout a virtual machine composed of computers and communication links, and easily connect the modules to establish the data flow required by the application. These module interconnections should be as flexible and transparent as possible, since they may need to be adjusted at any point during development or deployment of the distributed system.
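To make the data-driven rule of thumb concrete (maximize local data, minimize transfers), here is a minimal sketch in Java. It assumes we already know how many bytes of a task’s input reside on each candidate host; the class name, method names, and host names are hypothetical and chosen only for illustration. A real partitioning decision would also weigh CPU load, link speed, and other factors that this sketch ignores.

```java
import java.util.Map;

// Hypothetical sketch of data-driven placement: run a task on the host
// that already holds most of the data the task needs.
class DataDrivenPlacementSketch {

    // Returns the host holding the most input bytes for the task,
    // or null if no hosts were supplied.
    static String chooseHost(Map<String, Long> localBytesByHost) {
        String best = null;
        long bestBytes = -1;
        for (Map.Entry<String, Long> entry : localBytesByHost.entrySet()) {
            if (entry.getValue() > bestBytes) {
                best = entry.getKey();
                bestBytes = entry.getValue();
            }
        }
        return best;
    }

    // Returns how many bytes must cross the network if the task
    // runs on chosenHost (everything not already stored there).
    static long bytesToTransfer(Map<String, Long> localBytesByHost, String chosenHost) {
        long total = 0;
        for (Map.Entry<String, Long> entry : localBytesByHost.entrySet()) {
            if (!entry.getKey().equals(chosenHost)) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

For a compute-intensive application, where transfer overhead is negligible, you would replace this rule with one based on the processing capacity of each host.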
The type and format of the information that’s sent between agents in a distributed system is subject to many varied and changing requirements. Some of them are a result of the data/function partitioning issues discussed above. The allocation of tasks and data to agents in the distributed system has a direct influence on what type of data will need to be communicated between agents, how much data will be transferred, and how complicated the communication protocol between agents needs to be. If most of our data is sitting on the host where it’s needed, then communications will be mostly short, simple messages to report status, instruct other agents to start processing, etc. If central data servers are providing lots of data to remote agents, then the communication protocol will be more complex and connections between nodes in the system will stay open longer. You need to be able to implement various styles of communication, and adapt them to evolving requirements.
The communication protocols a given agent will need to understand might also be dictated by legacy systems that need to be incorporated into the system. These legacy systems might control data or functionality that is critical to a given system but is not easily transferable to a new one. Support for these protocols should be available, or easily attainable, in your distributed application development environment. In the worst case, when support for a required protocol is unavailable due to its obscurity or the expense associated with the available support, you should have the option to develop the required protocol support yourself, and have a reasonable way of incorporating the extended communications abilities into the existing infrastructure.
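One way to leave room for home-grown protocol support is to hide each protocol behind a small interface and look implementations up by name. The sketch below illustrates that idea only; `MessageCodec`, `LegacyLineCodec`, and `CodecRegistry` are hypothetical names, and the "legacy" wire format (ASCII text terminated by a carriage return and line feed) is an assumption made up for the example, not a real protocol.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: converts messages to and from a wire format.
interface MessageCodec {
    byte[] encode(String message);
    String decode(byte[] wire);
}

// Assumed legacy format for illustration: ASCII text ending in "\r\n".
class LegacyLineCodec implements MessageCodec {
    public byte[] encode(String message) {
        return (message + "\r\n").getBytes(StandardCharsets.US_ASCII);
    }

    public String decode(byte[] wire) {
        String text = new String(wire, StandardCharsets.US_ASCII);
        // Strip the line terminator if it is present.
        return text.endsWith("\r\n") ? text.substring(0, text.length() - 2) : text;
    }
}

// Lets an agent plug in support for a new protocol without changing
// the code that sends and receives messages.
class CodecRegistry {
    private final Map<String, MessageCodec> codecs = new HashMap<>();

    void register(String protocolName, MessageCodec codec) {
        codecs.put(protocolName, codec);
    }

    MessageCodec lookup(String protocolName) {
        MessageCodec codec = codecs.get(protocolName);
        if (codec == null) {
            throw new IllegalArgumentException("no codec registered for " + protocolName);
        }
        return codec;
    }
}
```

An agent that must talk to a legacy system would register a codec for it once, then ask the registry for the codec whenever it opens a connection that uses that protocol.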
Agents often have to execute several threads of control at once, whether to service requests from multiple remote agents, to keep processing data while another thread blocks on I/O, or for any number of other reasons. Multithreading is often an effective way to optimize the use of various resources, such as CPU time, local storage devices, or network bandwidth. The ability to create and control multiple threads is especially important in developing distributed applications, since distributed agents are typically more asynchronous than agents within a single process on a single host. The environments in which agents are running can be very heterogeneous, too, and we don’t want every agent in a distributed application to be a slave to the slowest, most heavily loaded agent in the system. We don’t want our multiprocessor compute server, for example, to be sitting idle while it waits for a slow client desktop to read and render the results of an analysis. Instead, we want a single thread on the compute server to service the slow client; while the client is crawling along trying to read data and draw graphs on its display, other threads on the compute server can be doing useful work, such as analyzing the data from other clients.
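The compute-server scenario can be sketched with a thread pool from `java.util.concurrent`. This is an illustration under stated assumptions rather than a real server: the class and method names are hypothetical, and `Thread.sleep` stands in for the time spent blocked on network I/O to each client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each client is served on its own pool thread,
// so a slow client ties up only the thread assigned to it.
class ComputeServerSketch {

    // Simulates delivering results to one client; the sleep stands in
    // for blocking I/O with a client of the given speed.
    static String serveClient(String clientName, long clientDelayMillis)
            throws InterruptedException {
        Thread.sleep(clientDelayMillis);
        return "served " + clientName;
    }

    // Serves all clients concurrently and returns the results in the
    // same order as the clients array.
    static List<String> serveAll(String[] clients, long[] delaysMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(clients.length);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < clients.length; i++) {
                final String name = clients[i];
                final long delay = delaysMillis[i];
                Callable<String> task = () -> serveClient(name, delay);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get()); // waits for that client's task
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("serving clients failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the tasks run on separate threads, the fast clients finish without waiting for the slow one. With a single thread, every client queued behind the slow desktop would be delayed by it.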
When sensitive information is shared between computing agents, the transactions that carry it often need to be secure from outside observation. In situations where an outside agent not under the host’s direct control is allowed to interact with local agents, it is also wise to have reasonable security measures available to authenticate the source of the agent, and to prevent the agent from wreaking havoc once it gains access to local processing resources. At a minimum, then, a secure distributed application needs a way to authenticate the identity of agents, define resource access levels for agents, and encrypt data for transmission between agents.
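Two of these three needs, authentication and encryption, can be sketched with the standard `java.security` and `javax.crypto` packages. The example below is a minimal illustration, not a complete security design: the class and method names are hypothetical, and the choice of RSA signatures and AES in GCM mode is our assumption. It also assumes the agents have already exchanged keys through some trusted channel, and it does not address resource access levels.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: sign messages to authenticate the sending agent,
// and encrypt them so outside observers cannot read them.
class SecureMessageSketch {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    static SecretKey newSharedKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static KeyPair newAgentIdentity() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs a message with the sending agent's private key.
    static byte[] sign(PrivateKey key, byte[] message) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(message);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Checks that the message was signed by the holder of the matching private key.
    static boolean verify(PublicKey key, byte[] message, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(message);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts with a fresh random IV and prepends the IV to the ciphertext.
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] packet = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, packet, 0, iv.length);
            System.arraycopy(ciphertext, 0, packet, iv.length, ciphertext.length);
            return packet;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the packet and decrypts the rest.
    static byte[] decrypt(SecretKey key, byte[] packet) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, packet, 0, IV_BYTES));
            return cipher.doFinal(packet, IV_BYTES, packet.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A sending agent would sign its message with its private key and encrypt the result with the shared key. The receiver decrypts the packet and then verifies the signature against the sender’s public key before acting on the message.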