Load balancers have evolved considerably since they were introduced in the 1990s as hardware-based servers or appliances. Cloud load balancing, also referred to as Load Balancing as a Service (LBaaS), is an updated alternative to hardware load balancers. Regardless of the implementation of a load balancer, scalability is still the primary goal of load balancing, even though modern load balancers can do so much more.

Optimal load distribution reduces site inaccessibility caused by the failure of a single server while assuring consistent performance for all users. Different routing techniques and algorithms ensure optimal performance in varying load-balancing scenarios.

Modern websites must support concurrent connections from clients requesting text, images, video, or application data, all in a fast and reliable manner, while scaling from hundreds of users to millions of users during peak times. Load balancers are a critical part of this scalability.

  • Problems Load Balancers Solve
  • Solutions Load Balancers Provide
  • The OSI Model and Load Balancing

Problems Load Balancers Solve

In cloud computing, load balancers address issues in three areas:

  • Cloud Bursting
  • Local Load Balancing
  • Global Load Balancing

Cloud bursting is a configuration between a private cloud (that is, an on-premises compute environment) and a public cloud. When the private cloud reaches 100% of its resource capacity, a load balancer redirects the overflow traffic to the public cloud to avoid degraded performance or an interruption of service.
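The overflow decision can be illustrated with a minimal Python sketch. It assumes capacity is measured as a count of in-flight requests; the `Pool` class, the `route` function, and the capacity numbers are hypothetical names and values chosen for illustration, not part of any particular load balancer.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    capacity: int    # assumed metric: max concurrent requests the pool can serve
    active: int = 0  # requests currently in flight

    def has_room(self) -> bool:
        return self.active < self.capacity


def route(private: Pool, public: Pool) -> str:
    """Use the private cloud until it is full, then burst to the public cloud."""
    target = private if private.has_room() else public
    target.active += 1
    return target.name


private = Pool("private", capacity=2)
public = Pool("public", capacity=1000)

decisions = [route(private, public) for _ in range(4)]
print(decisions)  # ['private', 'private', 'public', 'public']
```

A real deployment would also decrement `active` as requests complete and would likely start bursting slightly before the private cloud is fully saturated; both are omitted here to keep the sketch short.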

The critical advantage of cloud bursting is economic: companies do not have to provision or license excess capacity to meet short-lived peak loads or unexpected fluctuations in demand. Because of the flexible, automated self-service model of the cloud, companies pay only for the resources they consume, and only until those resources are released.

Local load balancing can be used within a private cloud or a public cloud and is a fundamental infrastructure requirement for any web application that needs high availability and the ability to distribute traffic across several servers.

Global load balancing is much more complex and may involve several layers of load balancers that manage traffic across multiple private clouds, public clouds, and public cloud regions. While state synchronization challenges are not unique to global load balancing, the widely distributed nature of a global-scale solution introduces latency and regional resiliency concerns that require more complex solutions to meet service level agreements (SLAs).
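One simple policy a global layer might apply is to send each client to the healthy region with the lowest measured latency, and to fail over when a region becomes unavailable. The sketch below assumes latency measurements and health status are already available; the `pick_region` function, the region names, and the latency figures are all hypothetical.

```python
def pick_region(latency_ms: dict[str, float], healthy: set[str]) -> str:
    """Return the healthy region with the lowest measured latency."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)


# Hypothetical latencies as seen from one client location.
latency_ms = {"us-east": 20.0, "eu-west": 95.0, "ap-south": 180.0}

choice = pick_region(latency_ms, healthy={"us-east", "eu-west"})
failover = pick_region(latency_ms, healthy={"eu-west", "ap-south"})
print(choice, failover)  # us-east eu-west
```

This covers only the routing decision. The state synchronization and SLA concerns described above need additional mechanisms that this sketch does not attempt to model.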

Solutions Load Balancers Provide

The choice of a load balancing method depends on the needs of your application to serve clients. Different load balancing algorithms provide different solutions based on application and client needs:

  • Round Robin: Requests are queued and distributed across the group of servers sequentially.
  • Weighted Round Robin: Like Round Robin, but some servers receive a larger share of the overall traffic based on computing capacity or other criteria.
  • Weighted Least Connections: The load balancer monitors the number of open connections for each server and sends each new request to the least busy server. The relative computing capacity of each server is factored into determining which one is least busy.
  • Hashing: A set of header fields and other information is used to determine which server receives the request.
  • Session Persistence: Also referred to as a “sticky session”, this directs incoming requests from a client to the same backend server for the duration of that client's session, until the transaction being performed is completed.
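The selection logic behind these methods can be compared in a short Python sketch. It is a simplified illustration under stated assumptions, not how any specific product implements them: the `Server` class, its `weight` and `connections` fields, and the function names are hypothetical, and the weighted round robin shown is the simplest non-interleaved variant.

```python
import hashlib
import itertools
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1       # relative computing capacity (hypothetical units)
    connections: int = 0  # currently open connections


def round_robin(servers):
    """Cycle through the servers in a fixed, repeating order."""
    return itertools.cycle(servers)


def weighted_round_robin(servers):
    """Repeat each server `weight` times per cycle (simple, non-interleaved)."""
    expanded = [s for s in servers for _ in range(s.weight)]
    return itertools.cycle(expanded)


def weighted_least_connections(servers):
    """Pick the server with the lowest connections-to-weight ratio."""
    return min(servers, key=lambda s: s.connections / s.weight)


def hash_pick(servers, key: str):
    """Map a request key (for example, a client IP) to a server deterministically."""
    digest = hashlib.sha256(key.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = [Server("a", weight=3), Server("b", weight=1)]

rr = round_robin(servers)
rr_order = [next(rr).name for _ in range(4)]    # ['a', 'b', 'a', 'b']

wrr = weighted_round_robin(servers)
wrr_order = [next(wrr).name for _ in range(4)]  # ['a', 'a', 'a', 'b']

servers[0].connections, servers[1].connections = 4, 2
wlc = weighted_least_connections(servers).name  # 'a', since 4/3 < 2/1

# The same key always maps to the same server while the pool is unchanged.
first = hash_pick(servers, "203.0.113.7").name
second = hash_pick(servers, "203.0.113.7").name
```

Hashing on a stable client identifier is one way to approximate session persistence; setting a cookie that names the backend is another common approach. With the plain modulo scheme used here, adding or removing a server remaps many keys, so a pool that changes often would need a scheme designed to limit that remapping.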