Chapter 1. What Load Balancing Is and
Why It’s Important

Load balancers have evolved considerably since they were introduced in the 1990s as hardware-based servers or appliances. Cloud load balancing, also referred to as Load Balancing as a Service (LBaaS), is an updated alternative to hardware load balancers. Regardless of the implementation of a load balancer, scalability is still the primary goal of load balancing, even though modern load balancers can do so much more.

Optimal load distribution reduces site inaccessibility caused by the failure of a single server while assuring consistent performance for all users. Different routing techniques and algorithms ensure optimal performance in varying load-balancing scenarios.

Modern websites must support concurrent connections from clients requesting text, images, video, or application data, all in a fast and reliable manner, while scaling from hundreds of users to millions of users during peak times. Load balancers are a critical part of this scalability.

Problems Load Balancers Solve

In cloud computing, load balancers address problems in three categories:

  • Cloud bursting

  • Local load balancing

  • Global load balancing

Cloud bursting is a configuration between a private cloud (that is, an on-premises compute environment) and a public cloud. A load balancer redirects overflow traffic from a private cloud that has reached full resource capacity to a public cloud, avoiding degraded performance or an interruption of service.

The critical advantage of cloud bursting is economic: companies do not need to provision or license excess capacity to meet short-lived peak loads or unexpected fluctuations in demand. This flexibility, combined with the automated self-service model of the cloud, means that companies pay only for the resources consumed during a specific period, until those resources are released again.

Organizations can use local load balancing in both private and public clouds; it is a fundamental infrastructure requirement for any web application that needs high availability and the ability to distribute traffic across several servers.

Global load balancing is much more complex and can involve several layers of load balancers that manage traffic across multiple private clouds, public clouds, and public cloud regions. The greatest challenge is not the distribution of the traffic but the synchronization of the backend processes and data, so that users get consistent and correct data regardless of where the responding server is located. Although state synchronization challenges are not unique to global load balancing, the widely distributed nature of a global-scale solution introduces latency and regional resiliency concerns that require various complex solutions to meet service-level agreements (SLAs).

The Solutions Load Balancers Provide

The choice of load-balancing method depends on your application's needs in serving clients. Different load-balancing algorithms provide different solutions based on application and client needs:

Round robin

Requests are queued and distributed across the group of servers sequentially.

Weighted round robin

Like round robin, but some servers are apportioned a larger share of the overall traffic based on computing capacity or other criteria.

Weighted least connections

The load balancer monitors the number of open connections for each server and sends each new request to the least busy server. The relative computing capacity of each server is factored in when determining which one has the fewest connections.

Hashing

A set of header fields and other information is used to determine which server receives the request.
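The first three algorithms above can be sketched in a few lines of Python. This is a minimal illustration, not production load-balancer code; the server names, weights, and connection counts are hypothetical.

```python
import itertools

# Hypothetical backend pool; names and weights are illustrative only.
servers = ["app1", "app2", "app3"]
weights = {"app1": 3, "app2": 1, "app3": 1}  # app1 has 3x the capacity

# Round robin: distribute requests sequentially across the pool.
rr = itertools.cycle(servers)
rr_order = [next(rr) for _ in range(6)]
# -> ['app1', 'app2', 'app3', 'app1', 'app2', 'app3']

# Weighted round robin: a server appears in the rotation in
# proportion to its weight, so app1 receives 3 of every 5 requests.
wrr_pool = [s for s in servers for _ in range(weights[s])]
wrr = itertools.cycle(wrr_pool)
wrr_order = [next(wrr) for _ in range(5)]
# -> ['app1', 'app1', 'app1', 'app2', 'app3']

# Weighted least connections: pick the server with the lowest ratio
# of open connections to capacity (weight).
open_conns = {"app1": 12, "app2": 2, "app3": 5}
least = min(servers, key=lambda s: open_conns[s] / weights[s])
# app1: 12/3 = 4.0, app2: 2/1 = 2.0, app3: 5/1 = 5.0 -> 'app2'
```

Note that under weighted least connections, `app2` wins even though `app1` has three times the capacity, because `app1` is already carrying a proportionally heavier load.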

Session persistence, also referred to as a sticky session, directs incoming requests from a given client to the same backend server for the duration of a session, until the transaction being performed is complete.
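One simple way to achieve stickiness is to hash a stable client identifier, so the same client deterministically maps to the same backend while the pool is unchanged. The sketch below assumes the source IP is that identifier; real load balancers often use cookies or consistent hashing instead, and the server names here are hypothetical.

```python
import hashlib

servers = ["app1", "app2", "app3"]  # illustrative backend pool

def pick_server(client_id: str) -> str:
    """Map a stable client identifier (e.g., source IP or a session
    cookie value) to a backend server deterministically."""
    digest = hashlib.sha256(client_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# The same client lands on the same server on every request.
assert pick_server("203.0.113.7") == pick_server("203.0.113.7")
```

A limitation of this naive modulo approach is that adding or removing a server remaps most clients; consistent hashing reduces that disruption.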

The OSI Model and Load Balancing

The Open System Interconnection (OSI) model defines a networking framework to implement protocols in seven layers:

  • Layer 7: Application layer

  • Layer 6: Presentation layer

  • Layer 5: Session layer

  • Layer 4: Transport layer

  • Layer 3: Network layer

  • Layer 2: Data-link layer

  • Layer 1: Physical layer

The OSI model doesn’t perform any functions in the networking process. It is a conceptual framework for understanding the complex interactions taking place.

Network firewalls are security devices that operate from Layer 1 to Layer 3, whereas load balancing happens from Layer 4 to Layer 7. Load balancers have different capabilities, including the following:

Layer 4 (L4)

Directs traffic based on data from network and transport layer protocols, such as IP address and TCP port.

Layer 7 (L7)

Adds content switching to load balancing. This allows routing decisions based on attributes like HTTP header, URL, Secure Sockets Layer (SSL) session ID, and HTML form data.

Global Server Load Balancing (GSLB)

GSLB extends L4 and L7 capabilities to servers in different geographic locations. The Domain Name System (DNS) is also used in certain solutions, a topic addressed later when Azure Traffic Manager is presented as an example of such an implementation.
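The difference between L4 and L7 decision making can be sketched as two routing functions: one that sees only network/transport data, and one that can inspect HTTP attributes. This is an illustrative simplification; the pool names, ports, and hostnames are hypothetical, not tied to any particular product.

```python
def l4_route(dst_ip: str, dst_port: int) -> str:
    # Layer 4: the decision can use only network/transport data,
    # such as destination IP address and TCP port.
    if dst_port == 443:
        return "tls-pool"
    return "default-pool"

def l7_route(method: str, path: str, headers: dict) -> str:
    # Layer 7 (content switching): the decision can inspect
    # application-level attributes such as the URL path and headers.
    if path.startswith("/api/"):
        return "api-pool"
    if headers.get("Host") == "static.example.com":
        return "static-pool"
    return "web-pool"

# An L4 balancer sees only the connection; an L7 balancer can send
# API calls and static-asset requests to different backend pools.
```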

As more enterprises deploy cloud-native applications in public clouds, the capabilities of load balancers are changing significantly.
