Load Balancing

Load balancing is a technique for improving performance when many activities are processed concurrently. These activities could be in separate processes on different machines, in separate processes on the same machine, or in separate threads within the same process. The architecture makes no difference to the basic guidelines.

To support load balancing, a standard design is to have:

  • One point of entry for all requests (the request queue)

  • One or more request-processor objects behind the queue

  • A mechanism for the queue to decide which request processor to hand a particular request to

You also need communication lines between the queue and the processors and a way to identify requests internally, but these are an obvious part of the infrastructure. The decision mechanism is typically a simple load-balancing system that distributes requests to whichever processors are available. The request processors report when they are available or busy. When the queue has a request to process, it chooses the first available request processor. Some applications need more complex decision-making and use a mechanism that allocates requests according to their type.
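The first-available policy described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class names RequestQueue and RequestProcessor, their methods, and the use of plain strings as requests are all hypothetical, and the sketch is single-threaded, so a real queue shared between threads would need synchronization or a concurrent collection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical request processor that tracks whether it is available or busy. */
class RequestProcessor {
    private final String name;
    private boolean busy = false;
    private final List<String> handled = new ArrayList<>();

    RequestProcessor(String name) { this.name = name; }

    boolean isAvailable() { return !busy; }

    /** Takes a request and marks this processor busy until finish() is called. */
    void accept(String request) {
        busy = true;
        handled.add(request);
    }

    void finish() { busy = false; }

    String name() { return name; }

    List<String> handled() { return handled; }
}

/** Hypothetical single point of entry: holds requests until a processor is free. */
class RequestQueue {
    private final List<RequestProcessor> processors;
    private final Queue<String> pending = new ArrayDeque<>();

    RequestQueue(List<RequestProcessor> processors) {
        this.processors = processors;
    }

    /** All requests enter here; the queue tries to hand them on immediately. */
    void submit(String request) {
        pending.add(request);
        dispatch();
    }

    /** Called when a processor has finished its current request. */
    void processorFinished(RequestProcessor processor) {
        processor.finish();
        dispatch();
    }

    int pendingCount() { return pending.size(); }

    /** Decision mechanism: give each waiting request to the first available processor. */
    private void dispatch() {
        while (!pending.isEmpty()) {
            RequestProcessor free = firstAvailable();
            if (free == null) {
                return; // every processor is busy; requests stay queued
            }
            free.accept(pending.remove());
        }
    }

    private RequestProcessor firstAvailable() {
        for (RequestProcessor p : processors) {
            if (p.isAvailable()) {
                return p;
            }
        }
        return null;
    }
}

class LoadBalancerDemo {
    public static void main(String[] args) {
        RequestProcessor a = new RequestProcessor("A");
        RequestProcessor b = new RequestProcessor("B");
        RequestQueue queue = new RequestQueue(List.of(a, b));

        queue.submit("r1");
        queue.submit("r2");
        queue.submit("r3"); // both processors are busy, so this one waits
        queue.processorFinished(b); // B becomes available and receives the waiting request

        System.out.println(a.name() + " handled " + a.handled());
        System.out.println(b.name() + " handled " + b.handled());
    }
}
```

A type-based decision mechanism would replace firstAvailable() with a lookup that also considers what kind of request is waiting; the rest of the structure could stay the same.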

Our main concern with this architecture is that the queue is a potential bottleneck, so it must pass on requests quickly and be ready to accept new ones almost continuously.[71] The pool of request processors behind the queue can be running in one or more threads or processes, usually one request processor per thread. The pool of threaded request ...
