Load Balancing Pattern – Understanding the mechanism of Load Balancing in Cloud Computing

As information technology has grown, cloud computing has emerged as a leading option for developers building high-performance applications. The growing use of cloud infrastructure has in turn led to the adoption of load balancing mechanisms to optimize resource usage and improve performance for users.

In this article, we’re going to look at load balancing as a key part of a microservice architecture, and at why implementing a load balancer matters so much, particularly in large microservice applications.

Load balancing algorithms fall into two categories: static and dynamic. We’re going to cover two of the best-known algorithms used in cloud computing, Round Robin and Weighted Round Robin. Then we’re going to identify the parameters we can measure to gauge the performance of a load balancing algorithm, that is, to check whether a given method is actually good at balancing the load. Finally, we’re going to look at an example using the Ribbon load balancer from the Spring Cloud ecosystem in a microservice architecture.

A cloud-based microservice architecture consists of many components that communicate with each other. Before reading this article on load balancing, make sure you understand other components such as Service Discovery and the Zuul proxy. By the end of the article, you’ll understand the load balancing mechanism and be ready to set up Ribbon load balancing in a cloud-based microservices application.

What is Load Balancing?

Load balancing is the efficient distribution of incoming requests across a group of servers, called backend servers. Its main purpose is to prevent any single server from becoming overloaded and possibly crashing.

What is the need for load balancing?

Load balancing is a vital component of a microservices architecture. An application built on microservices is typically scaled out, which means each service runs as several instances. Load balancing is what distributes incoming requests among those backend instances.

Implementing this pattern in a cloud-based microservice architecture keeps backend servers available for incoming requests: if one backend server goes down, the load balancer automatically redirects traffic to the servers that remain operational. The technique also aims to reduce response time and increase throughput.
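
The failover behaviour can be sketched in a few lines. This is only an illustration under simple assumptions: the `Backend` class, its `healthy` flag and the `pick_backend` helper are hypothetical names introduced here, and a real load balancer would detect failures through health checks rather than a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A backend server as seen by the load balancer (hypothetical model)."""
    name: str
    healthy: bool = True


def pick_backend(backends, request_number):
    """Return a healthy backend for the given request, skipping failed ones."""
    alive = [b for b in backends if b.healthy]
    if not alive:
        raise RuntimeError("no healthy backend available")
    # Simple rotation over the servers that are still operational.
    return alive[request_number % len(alive)]


servers = [Backend("app-1"), Backend("app-2"), Backend("app-3")]
servers[1].healthy = False  # simulate app-2 going down

routed = [pick_backend(servers, n).name for n in range(4)]
print(routed)  # app-2 never appears in the list
```

Traffic keeps flowing to `app-1` and `app-3` while `app-2` is down, which is the availability property the pattern is meant to provide.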

Load Balancing Algorithms

Round Robin Algorithm

The Round Robin algorithm comes from time-sharing systems. It uses a small unit of time called a time quantum or time slice. The ready queue holds a set of processes; the CPU scheduler picks a process from the ready queue and sets a timer to interrupt it after one time quantum (TQ).

If the burst time (BT) of the process is longer than the time quantum and the process is still running when the timer fires, a context switch is performed and the process is put at the tail of the ready queue. The CPU scheduler then picks the next process in the ready queue.
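
To make the scheduling behaviour concrete, here is a small simulation. It is a sketch, not an operating-system scheduler: the function name `round_robin_schedule`, the process names and the burst times are made up for illustration, and the cost of a context switch is ignored.

```python
from collections import deque


def round_robin_schedule(burst_times, quantum):
    """Simulate round-robin CPU scheduling and return the execution order.

    burst_times maps a process name to its burst time.
    Each entry in the result is (process, time_run_in_this_slice).
    """
    ready_queue = deque(burst_times.items())
    timeline = []
    while ready_queue:
        name, remaining = ready_queue.popleft()
        run = min(quantum, remaining)
        timeline.append((name, run))
        remaining -= run
        if remaining > 0:
            # Quantum expired before the process finished: context switch,
            # and the process goes to the tail of the ready queue.
            ready_queue.append((name, remaining))
    return timeline


timeline = round_robin_schedule({"P1": 5, "P2": 3, "P3": 1}, quantum=2)
print(timeline)
# [('P1', 2), ('P2', 2), ('P3', 1), ('P1', 2), ('P2', 1), ('P1', 1)]
```

`P3` finishes within its first slice, while `P1` and `P2` are preempted and re-queued until their burst time is used up.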

The steps below describe how the Round Robin algorithm is applied to load balancing, and the figure illustrates the basic flow.

Round Robin Load Balancing Algorithm
  1. The first incoming request reaches the load balancer.
  2. The load balancer gets all available services from service discovery and then chooses the node to which it will send the incoming request.
  3. The first user request is assigned to a random VM.
  4. Once the first request is assigned, the virtual machines are ordered in a cyclic manner.
  5. The virtual machine that received the request is moved to the back of the order.
  6. The next user request is assigned to the next VM in cyclic order.
  7. Repeat steps 5 and 6 for each user request until the load balancer has processed all requests.
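
The steps above can be sketched as a small class. This is a minimal illustration, not how any particular load balancer implements it: `RoundRobinBalancer` and the VM names are hypothetical, the list of servers is passed in directly instead of coming from service discovery, and the first request goes to the first VM in the list rather than a random one.

```python
from collections import deque


class RoundRobinBalancer:
    """Hands out servers in cyclic order (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._queue = deque(servers)

    def next_server(self):
        server = self._queue[0]
        # Move the server that just received a request to the back.
        self._queue.rotate(-1)
        return server


balancer = RoundRobinBalancer(["vm-a", "vm-b", "vm-c"])
assignments = [balancer.next_server() for _ in range(5)]
print(assignments)  # ['vm-a', 'vm-b', 'vm-c', 'vm-a', 'vm-b']
```

Every VM receives the same number of requests over a full cycle, regardless of how heavy each request is.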

Weighted Round Robin Algorithm

The main drawback of the RR load balancing algorithm is that it takes no account of the size of a client request when assigning it to a server. It simply picks the next server in the rotation.

Weighted RR uses the current load of each server, its processing capacity and the length of the task to decide which server should process the task. From this information, it calculates the waiting time of each virtual machine and gives each one a weight: the shorter the waiting time, the higher the weight.

Enhanced Round Robin Load Balancing Algorithm

The figure above shows the steps of the Weighted Round Robin load balancing algorithm, which can be summarized as follows:

Step 1: Determine the length of the incoming request.
Step 2: Based on the waiting time of each virtual machine, calculate the load of each virtual machine.
Step 3: Depending on the waiting time calculated in Step 2, give a weight to each virtual machine. A virtual machine with a shorter waiting time gets a higher weight.
Step 4: Arrange the virtual machines in descending order of weight.
Step 5: Assign the incoming task (cloudlet) to the virtual machine with the highest weight, that is, the shortest waiting time. More tasks are assigned to a virtual machine with a higher weight than to the others.
Step 6: Check the status of the virtual machines to find any that are overloaded.
Step 7: Select the virtual machine with the shortest waiting time and assign the task to it.
Step 8: Continue this process as long as overloaded virtual machines remain.
Step 9: Go back to Step 2.
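
A simplified sketch of the weighting idea follows. The article does not give a formula for waiting time, so this example assumes it is the work already queued on a VM divided by that VM's capacity; the `VirtualMachine` class, the `assign` function and the numbers are all hypothetical, and the overload check from Steps 6 to 8 is left out.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    name: str
    capacity: float           # work units processed per second (assumed)
    queued_work: float = 0.0  # work units already assigned

    @property
    def waiting_time(self):
        # Assumed estimate: time needed to drain the work already queued.
        return self.queued_work / self.capacity


def assign(task_length, vms):
    """Assign a task to the VM with the least waiting time (highest weight)."""
    # Sorting by ascending waiting time is the same as descending weight.
    ranked = sorted(vms, key=lambda vm: vm.waiting_time)
    chosen = ranked[0]
    chosen.queued_work += task_length
    return chosen.name


vms = [
    VirtualMachine("vm-fast", capacity=4.0),
    VirtualMachine("vm-slow", capacity=1.0),
]
placements = [assign(length, vms) for length in [4, 4, 4, 4, 4]]
print(placements)
# ['vm-fast', 'vm-slow', 'vm-fast', 'vm-fast', 'vm-fast']
```

Unlike plain Round Robin, the faster VM ends up with four of the five tasks, because its waiting time stays lower than that of the slower VM.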

Load Balancing Measurement Parameters

This section identifies some of the parameters used to evaluate the performance of a load balancing approach:

  • Throughput – this parameter is central to the performance of a backend server: it measures how much load a server can handle. It is calculated as the number of requests processed in a given period of time.
  • Fault tolerance – the ability of the load balancing algorithm to keep distributing load evenly across the remaining nodes after a node goes down.
  • Scalability – a cloud-based microservice application usually runs on a cluster of backend servers. When demand requires it, developers scale the system by adding more machines to the pool of resources. The load balancing algorithm should then adapt to the new infrastructure by sharing the load evenly across all nodes, including the newly added ones.
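
As a rough illustration of how two of these parameters could be computed from a request log, consider the sketch below. The helper names `throughput` and `load_share`, the node names and the sample data are invented for the example; they are not taken from any monitoring tool.

```python
from collections import Counter


def throughput(completed_requests, elapsed_seconds):
    """Requests completed per second over the measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_requests / elapsed_seconds


def load_share(assignments):
    """Fraction of requests each node received."""
    counts = Counter(assignments)
    total = len(assignments)
    return {node: counts[node] / total for node in counts}


# Hypothetical log: which node served each of 8 requests in a 2-second window.
served_by = ["n1", "n2", "n3", "n4", "n1", "n2", "n3", "n4"]
print(throughput(len(served_by), 2.0))  # 4.0 requests per second
print(load_share(served_by))            # each node handled 0.25 of the load
```

An even share across nodes, before and after adding or losing a node, is one simple signal that the algorithm scales and tolerates faults as described above.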

Load Balancing Categories

Server-Side Load Balancing

Most digital companies adopt a microservices approach, and as client needs grow, they tend to enlarge their infrastructure, adding more applications (nodes) and more instances to provide availability and improve performance. This raises the question: which type of load balancing should be implemented?

Server-Side Load Balancing Architecture

In most cases, companies implement the server-side approach, where all incoming requests go through a single load balancer. Now imagine that this load balancer, the single entry point to the backend servers, crashes. No incoming request can reach its destination.

The solution to that issue is to adopt the client-side approach.

Before moving on to that solution, here are the steps of the server-side approach, as shown in the figure above:

1- The client sends a request to the load balancer.
2- The load balancer asks Service Discovery for the updated list of all available applications or nodes.
3- Service Discovery sends the list of all available servers or applications back to the load balancer.
4- The load balancer can now forward the client request to the appropriate server to be served.
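
These four steps can be sketched with two small in-memory classes. This is an assumption-laden illustration: `ServiceDiscovery` and `ServerSideLoadBalancer` are hypothetical stand-ins, not the API of any real registry or load balancer, and the addresses are made up. Round robin is used for the final choice simply because it was covered earlier.

```python
class ServiceDiscovery:
    """Hypothetical in-memory registry of available instances per service."""

    def __init__(self):
        self._registry = {}

    def register(self, service, address):
        self._registry.setdefault(service, []).append(address)

    def deregister(self, service, address):
        self._registry[service].remove(address)

    def instances(self, service):
        return list(self._registry.get(service, []))


class ServerSideLoadBalancer:
    """Single entry point that every client request passes through."""

    def __init__(self, discovery):
        self._discovery = discovery
        self._counter = 0

    def route(self, service):
        # Steps 2 and 3: fetch the current list of available instances.
        available = self._discovery.instances(service)
        if not available:
            raise LookupError(f"no instance available for {service}")
        # Step 4: pick one instance (round robin here) to serve the request.
        target = available[self._counter % len(available)]
        self._counter += 1
        return target


discovery = ServiceDiscovery()
discovery.register("orders", "10.0.0.1:8080")
discovery.register("orders", "10.0.0.2:8080")

lb = ServerSideLoadBalancer(discovery)
targets = [lb.route("orders") for _ in range(3)]  # step 1: client requests
print(targets)  # ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.1:8080']
```

Note that every request depends on the one `lb` object: if it is unavailable, no client can reach any instance.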

Client-Side Load Balancing

The client-side load balancing approach is designed to ensure that each request reaches its destination. How does that work?

Fig 1.4 Client-Side Server Load Balancing Architecture

The steps of the client-side approach shown in the figure above can be summarized as follows:

1- The client sends a request to Service Discovery to keep its cached routing data up to date.
2- Service Discovery sends back the updated list of all available applications or nodes.
3- The client can now send its request directly to the appropriate application to be served.
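
A minimal sketch of this flow is shown below, under stated assumptions: `RegistryClient` is a hypothetical class, `fetch_instances` stands in for a call to the service discovery server, and a random choice is used to pick the instance (a rotation would work just as well).

```python
import random


class RegistryClient:
    """Client that caches the instance list and balances requests itself.

    `fetch_instances` stands in for a call to the service discovery server;
    it is a hypothetical callable supplied by the caller.
    """

    def __init__(self, fetch_instances, rng=None):
        self._fetch_instances = fetch_instances
        self._rng = rng or random.Random()
        self._cache = []

    def refresh(self):
        # Ask service discovery and cache a copy of the updated list.
        self._cache = list(self._fetch_instances())

    def choose(self):
        # The client picks the target instance on its own.
        if not self._cache:
            self.refresh()
        if not self._cache:
            raise LookupError("no instance available")
        return self._rng.choice(self._cache)


registry = ["10.0.0.1:8080", "10.0.0.2:8080"]
client = RegistryClient(lambda: registry, rng=random.Random(0))

client.refresh()
registry.remove("10.0.0.1:8080")   # an instance disappears
stale_possible = client.choose()   # may still be the removed instance
client.refresh()
fresh = client.choose()            # only live instances remain in the cache
print(stale_possible, fresh)
```

The trade-off of this design is visible in the example: there is no central balancer to fail, but the client has to refresh its cache regularly or it may pick an instance that is no longer there.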

What is Ribbon?

Ribbon is an open-source component from Netflix. It uses client-side load balancing to distribute traffic coming from users across multiple backend servers, as shown in the architecture below.

Ribbon Load Balancer in a microservices architecture

1- The Ribbon load balancer receives a request from the client.
2- As shown above, the Ribbon load balancer then asks service discovery for all available services.
3- Service discovery sends the list of all available services back to the Ribbon load balancer.


Load balancing is a very important pattern in the cloud-based microservice approach. To be a suitable solution for improving performance and guaranteeing availability, a load balancer should be evaluated against characteristics such as throughput, scalability, waiting time and response time.

In the next article, we’re going to go through all the steps of implementing the Ribbon load balancer in a cloud-based microservices architecture.
