Anatomy of a Distributed Application

A distributed application is built upon several layers. At the lowest level, a network connects a group of host computers together so that they can talk to each other. Network protocols like TCP/IP let the computers send data to each other over the network by providing the ability to package and address data for delivery to another machine. Higher-level services can be defined on top of the network protocol, such as directory services and security protocols. Finally, the distributed application itself runs on top of these layers, using the mid-level services and network protocols as well as the computer operating systems to perform coordinated tasks across the network.

At the application level, a distributed application can be broken down into the following parts:

Processes

A typical operating system on a host computer can run several processes at once. A process is created by describing a sequence of steps in a programming language, compiling the program into an executable form, and running the executable in the operating system. While it’s running, a process has access to the resources of the computer (such as CPU time and I/O devices) through the operating system. A process can be completely devoted to a particular application, or several applications can use a single process to perform tasks.

Threads

Every process has at least one thread of control. Some operating systems support the creation of multiple threads of control within a single process. Each thread in a process can run independently of the others, although there is usually some synchronization between them. One thread might monitor input from a socket connection, for example, while another listens for user events (keystrokes, mouse movements, etc.) and provides feedback to the user through output devices (monitor, speakers, etc.). When data arriving on the socket requires a response from the user, the two threads need to coordinate to bring that data to the user’s attention.
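
The kind of coordination described above can be sketched in Java with a shared queue. This is only an illustration under stated assumptions: the class name ThreadHandoff, the END_OF_INPUT marker, and the sample messages are hypothetical, and a list of strings stands in for data that would really arrive over a socket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ThreadHandoff {
    // Hypothetical marker telling the consumer that no more input will arrive.
    private static final String END_OF_INPUT = "<end>";

    // Starts an input thread and a user-facing thread, then returns
    // the lines the user-facing thread would have displayed.
    public static List<String> run(List<String> incoming) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> shown = new ArrayList<>();

        // Stands in for the thread that monitors a socket connection.
        Thread inputThread = new Thread(() -> {
            try {
                for (String message : incoming) {
                    queue.put(message);
                }
                queue.put(END_OF_INPUT);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stands in for the thread that gives feedback to the user.
        Thread userThread = new Thread(() -> {
            try {
                String message;
                // take() blocks until the input thread hands something over.
                while (!(message = queue.take()).equals(END_OF_INPUT)) {
                    shown.add("Received: " + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        inputThread.start();
        userThread.start();
        inputThread.join();
        userThread.join(); // after join, reading "shown" from this thread is safe
        return shown;
    }

    public static void main(String[] args) throws InterruptedException {
        for (String line : run(List.of("balance=100", "balance=75"))) {
            System.out.println(line);
        }
    }
}
```

The queue is the only point of contact between the two threads, so neither needs to know how the other does its work; the blocking take() call supplies the synchronization.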

Objects

Programs written in object-oriented languages are made up of cooperating objects. One simple definition of an object is a group of related data, with methods available for querying or altering the data (getName(), setName()), or for taking some action based on the data (sendName(OutputStream o)). A process can be made up of one or more objects, and these objects can be accessed by one or more threads within the process. With the introduction of distributed object technologies like RMI and CORBA, an object can also be logically spread across multiple processes, on multiple computers.
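
As a minimal sketch of this definition, the class below groups one piece of data with the three kinds of methods named above. The class name NamedThing, the sample values, and the choice of UTF-8 encoding are assumptions made for illustration; the methods are synchronized only because the text notes that several threads may access the same object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class NamedThing {
    // The data this object groups together.
    private String name;

    public NamedThing(String name) {
        this.name = name;
    }

    // Query the data.
    public synchronized String getName() {
        return name;
    }

    // Alter the data.
    public synchronized void setName(String name) {
        this.name = name;
    }

    // Take an action based on the data: write the name to a stream.
    // UTF-8 is an assumption; the text does not specify an encoding.
    public synchronized void sendName(OutputStream o) throws IOException {
        o.write(name.getBytes(StandardCharsets.UTF_8));
        o.flush();
    }

    public static void main(String[] args) throws IOException {
        NamedThing thing = new NamedThing("checking");
        thing.setName("savings");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        thing.sendName(buffer);
        System.out.println(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because sendName() accepts any OutputStream, the same call could write to a file, a socket, or an in-memory buffer like the one used in main().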

Agents

For the sake of this book, we will use the term “agent” as a general way to refer to significant functional elements of a distributed application.[1] While a process, a thread, and an object are pretty well-defined entities, an agent (at least as we define it here) is a higher-level system component, defined around a particular function, utility, or role in the overall system. A remote banking application, for example, might be broken down into a customer agent, a transaction agent, and an information brokerage agent. Agents can be distributed across multiple processes, and can be made up of multiple objects and threads in these processes. Our customer agent might be made up of an object in a process running on a client desktop that’s listening for data and updating the local display, along with an object in a process running on the bank server, issuing queries and sending the data back to the client. These are two objects running in distinct processes on separate machines, but together we can consider them to make up one customer agent, with client-side elements and server-side elements.
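
The customer agent described above can be sketched as one logical component with two cooperating parts. This is a simplification under stated assumptions: both parts run in a single process here, with a direct method call standing in for the network connection that would link them in a real deployment, and all class names, method names, and account data are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class CustomerAgentSketch {
    // Server-side element: answers queries (hypothetical in-memory data
    // stands in for the bank's account database).
    static class ServerSide {
        private final Map<String, Integer> balances = new HashMap<>();

        ServerSide() {
            balances.put("alice", 100);
        }

        int queryBalance(String customer) {
            return balances.getOrDefault(customer, 0);
        }
    }

    // Client-side element: holds what the local display would show.
    static class ClientSide {
        private String display = "";

        void update(String customer, int balance) {
            display = customer + ": " + balance;
        }

        String display() {
            return display;
        }
    }

    private final ServerSide server = new ServerSide();
    private final ClientSide client = new ClientSide();

    // One agent-level operation that spans both elements. In a distributed
    // deployment this call would cross a process and machine boundary.
    public String refresh(String customer) {
        client.update(customer, server.queryBalance(customer));
        return client.display();
    }

    public static void main(String[] args) {
        System.out.println(new CustomerAgentSketch().refresh("alice"));
    }
}
```

The point of the sketch is the grouping: callers deal with a single customer agent, while the work is split between a client-side object and a server-side object.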

So a distributed application can be thought of as a coordinated group of agents working to accomplish some goal. Each of these agents can be distributed across multiple processes on remote hosts, and can consist of multiple objects or threads of control. Agents can also belong to more than one application at once. You may be developing an automated teller machine application, for example, which consists of an account database server, with customer request agents distributed across the network submitting requests. The account server agent and the customer request agents are agents within the ATM application, but they might also serve agents residing at the financial institution’s headquarters, as part of an administrative application.



[1] The term “agent” is overused in the technology community. In the more formal sense of the word, an agent is a computing entity that is a bit more intelligent and autonomous than an object. An agent is supposed to be capable of having goals that it needs to accomplish, such as retrieving information of a certain type from a large database or remote data sources. Some agents can monitor their progress towards achieving their goals at a higher level than an object can, since an object only knows whether its methods executed successfully. The definition of agent that we’re using here is a lot less formal than this, and a bit more general.
