System Design Concepts (Part 2: Load Balancer)

Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool. Load balancing improves responsiveness and increases the availability of applications.

A load balancer is a device that distributes network or application traffic across a cluster of servers. 


What exactly does a load balancer do?


A load balancer sits between the client and the server farm, accepting incoming network and application traffic and distributing it across multiple backend servers using various methods. By balancing application requests across multiple servers, a load balancer reduces individual server load and prevents any one application server from becoming a single point of failure, thus improving overall application availability and responsiveness.
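
As a rough illustration, here is a minimal sketch of a load balancer acting as an HTTP reverse proxy; the backend addresses, listen port, and simple rotation are assumptions for the example, and a production balancer would add health checks, timeouts, and connection pooling.

```python
# A minimal reverse-proxy sketch: accept a request, pick a backend,
# forward the request, and relay the response.
import http.client
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKENDS = ["127.0.0.1:9001", "127.0.0.1:9002"]  # the server pool (assumed)
pool = itertools.cycle(BACKENDS)                 # simple rotation

class BalancerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(pool)                     # choose a backend server
        conn = http.client.HTTPConnection(backend)
        conn.request("GET", self.path, headers=dict(self.headers))
        resp = conn.getresponse()
        body = resp.read()
        self.send_response(resp.status)
        for name, value in resp.getheaders():
            # we re-frame the body ourselves, so drop framing headers
            if name.lower() not in ("transfer-encoding", "connection",
                                    "content-length"):
                self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), BalancerHandler).serve_forever()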


Why do we need a Load Balancer?


Traffic volumes are increasing and applications are becoming more complex. Load balancers provide the bedrock for building flexible networks that meet evolving demands by improving performance and security for many types of traffic and services, including applications.

Load balancing is the most straightforward method of scaling out an application server infrastructure. As application demand increases, new servers can be easily added to the resource pool, and the load balancer will immediately begin sending traffic to the new server. Core load balancing capabilities include:

Layer 4 (L4) load balancing - the ability to direct traffic based on data from network and transport layer protocols, such as IP address and TCP port

Layer 7 (L7) load balancing and content switching - the ability to make routing decisions based on application-layer data and attributes, such as HTTP headers, uniform resource identifiers, SSL session IDs and HTML form data (see the sketch after this list)

Global server load balancing (GSLB) - extends the core L4 and L7 capabilities so that they are applicable across geographically distributed server farms
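
To make L7 content switching concrete, here is a small sketch of a routing decision made from the URL path and a request header; the pool names, path prefix, and header rule are hypothetical.

```python
# A sketch of an L7 routing decision: pick a server pool from the URL
# path and a request header. Pool names and rules are made up.
def pick_pool(path: str, headers: dict) -> str:
    if path.startswith("/api/"):                 # route by URI
        return "api-pool"
    if "image" in headers.get("Accept", ""):     # route by header
        return "static-pool"
    return "web-pool"

print(pick_pool("/api/users", {}))                      # api-pool
print(pick_pool("/logo.png", {"Accept": "image/png"}))  # static-pool
```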


How does a Load Balancer work?


When one application server becomes unavailable, the load balancer directs all new application requests to other available servers in the pool.
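
One common way to detect an unavailable server is a periodic health check; a rough sketch follows, where the /healthz endpoint and the one-second timeout are assumed conventions, not details from the original text.

```python
# A failover sketch: new requests only go to servers that pass a
# health check.
import http.client

def is_healthy(backend: str) -> bool:
    try:
        conn = http.client.HTTPConnection(backend, timeout=1)
        conn.request("GET", "/healthz")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False

def available(pool: list[str]) -> list[str]:
    # the balancer picks only from this filtered list
    return [b for b in pool if is_healthy(b)]
```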

To handle more advanced application delivery requirements, an application delivery controller (ADC) is used to improve the performance, security and resiliency of applications delivered over the web. An ADC is not only a load balancer, but a platform for delivering networks, applications and mobile services in the fastest, safest and most consistent manner, regardless of where, when and how they are accessed.



The basic application delivery transaction is as follows:

  1. The client attempts to connect with the service.
  2. The ADC accepts the connection, and after deciding which host should receive the connection, changes the destination IP (and possibly port) to match the service of the selected host (note that the source IP of the client is not touched).
  3. The host accepts the connection and responds back to the original source, the client, via its default route, the ADC.
  4. The ADC intercepts the return packet from the host and now changes the source IP (and possibly port) to match the virtual server IP and port, and forwards the packet back to the client.
  5. The client receives the return packet, believing that it came from the virtual server, and continues the process.
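
The address rewriting in steps 2 and 4 can be sketched in a few lines; the Packet record, virtual server IP, and backend address below are simplified assumptions, since real ADCs rewrite headers in the network stack or in dedicated hardware.

```python
# A sketch of the destination/source rewriting described above.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

VIP = ("203.0.113.10", 80)   # the virtual server address (assumed)
HOST = ("10.0.0.5", 8080)    # the selected backend host (assumed)

def to_backend(pkt: Packet) -> Packet:
    # step 2: rewrite the destination; the client's source is untouched
    return replace(pkt, dst_ip=HOST[0], dst_port=HOST[1])

def to_client(pkt: Packet) -> Packet:
    # step 4: rewrite the source back to the virtual server's address
    return replace(pkt, src_ip=VIP[0], src_port=VIP[1])

req = Packet("198.51.100.7", 51234, *VIP)  # client -> virtual server
print(to_backend(req))                     # destination now the host
```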

Load balancing algorithms and methods


Load balancing uses various algorithms, called load balancing methods, to define the criteria that the ADC appliance uses to select the service to which to redirect each client request. Different load balancing algorithms use different criteria.

The Least Connection Method 
This is the default method. When a virtual server is configured to use the least connection method, it selects the service with the fewest active connections.
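
As a sketch, least-connection selection is just a minimum over the balancer's own connection counts; the service names and counts below are made up for illustration.

```python
# Least connection: pick the service with the fewest active connections.
def least_connection(connections: dict[str, int]) -> str:
    # connections maps service name -> active connection count
    return min(connections, key=connections.get)

print(least_connection({"s1": 12, "s2": 4, "s3": 9}))  # s2
```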

The Round Robin Method
This method continuously rotates a list of services that are attached to it. When the virtual server receives a request, it assigns the connection to the first service in the list, and then moves that service to the bottom of the list.
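
A small sketch of that rotation, using a deque to move the chosen service to the bottom of the list; the service names are placeholders.

```python
# Round robin as described: take the first service, then rotate it
# to the bottom of the list.
from collections import deque

services = deque(["s1", "s2", "s3"])

def next_service() -> str:
    chosen = services[0]
    services.rotate(-1)  # move the head to the bottom
    return chosen

print([next_service() for _ in range(4)])  # ['s1', 's2', 's3', 's1']
```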

The Least Response Time Method
This method selects the service with the fewest active connections and the lowest average response time.

The Least Bandwidth Method 
This method selects the service that is currently serving the least amount of traffic, measured in megabits per second (Mbps).

The Least Packets Method 
This method selects the service that has received the fewest packets over a specified period of time.

The Custom Load Method
When using this method, the load balancing appliance chooses a service that is not handling any active transactions. If all of the services in the load balancing setup are handling active transactions, the appliance selects the service with the smallest load.
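
The response-time, bandwidth, packet and custom-load criteria all reduce to picking a minimum over per-service statistics. The sketch below consolidates them; the Stats fields, and the use of active transactions as the stand-in "load" metric, are assumptions for illustration.

```python
# Metric-based selection criteria over per-service statistics.
from dataclasses import dataclass

@dataclass
class Stats:
    name: str
    active_connections: int
    avg_response_ms: float
    bandwidth_mbps: float
    packets: int
    active_transactions: int

def least_response_time(pool: list[Stats]) -> Stats:
    # fewest active connections, then lowest average response time
    return min(pool, key=lambda s: (s.active_connections, s.avg_response_ms))

def least_bandwidth(pool: list[Stats]) -> Stats:
    return min(pool, key=lambda s: s.bandwidth_mbps)

def least_packets(pool: list[Stats]) -> Stats:
    return min(pool, key=lambda s: s.packets)

def custom_load(pool: list[Stats]) -> Stats:
    # prefer a service with no active transactions, else the least loaded
    idle = [s for s in pool if s.active_transactions == 0]
    return min(idle or pool, key=lambda s: s.active_transactions)
```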

Hardware vs. Software Load Balancing


Load balancers typically come in two flavors: hardware-based and software-based. Vendors of hardware-based solutions load proprietary software onto the machine they provide, which often uses specialized processors. To cope with increasing traffic at your website, you have to buy more or bigger machines from the vendor. Software solutions generally run on commodity hardware, making them less expensive and more flexible. You can install the software on the hardware of your choice or in cloud environments like AWS EC2.


This covers the concept of the Load Balancer. If you want to add anything or find anything incorrect, please comment below.
