Photo credit: Christopher Gower
What is meant by scalability?
Introduction to Load Balancing
Imagine you have N servers. For each new incoming web request, how do we know which server to send it to? That is the job of a Load Balancer. Load Balancers route requests efficiently and distribute the load across multiple servers.
How does it work?
It starts with assigning each incoming request a uniformly random request ID from 0 to M-1, where M is the size of the ID space. For a request ID r1, we apply a hash function h to get a hashed number m1: h(r1) => m1. The number m1 is then mapped to a server by computing m1 % N, where N is the total number of servers, and the request is sent to the server whose index matches the result. Note that this scheme is plain modulo hashing, not Consistent Hashing. Consistent Hashing is a different technique, designed to limit how many requests get remapped when N changes.
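The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names h and pick_server are hypothetical, and SHA-256 is an arbitrary choice of hash, since the article does not name a specific one.

```python
import hashlib
from collections import Counter


def h(request_id: int) -> int:
    """Hash a request ID to a large integer (SHA-256 is an assumed, illustrative choice)."""
    digest = hashlib.sha256(str(request_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big")


def pick_server(request_id: int, n_servers: int) -> int:
    """Return the index of the server that should handle this request: h(r) % N."""
    return h(request_id) % n_servers


if __name__ == "__main__":
    # Route 10,000 request IDs across 4 servers and count the load per server.
    load = Counter(pick_server(r, 4) for r in range(10_000))
    for server_index in sorted(load):
        print(f"server {server_index}: {load[server_index]} requests")
```

Running it should show each of the four servers receiving a similar share of the requests, which is the even spread a well-mixing hash is expected to give.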
Because the hash function is uniformly random, we can expect every server to receive roughly the same load on average. So, what happens when we need to add more servers? How does this affect our hash function and Load Balancer?
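Recall that h(r1) => m1, and server index = m1 % N, where N is the total number of servers. Now that N has changed, the server index changes for most requests. For example, a request with m1 = 10 goes to server 10 % 4 = 2 when there are four servers, but to server 10 % 5 = 0 once a fifth server is added.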
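To get a feel for how disruptive a change in N is, here is a small self-contained sketch that measures the fraction of requests whose server index changes when the server count goes from 4 to 5. The names h, pick_server and moved_fraction are hypothetical, and SHA-256 is again an assumed stand-in for the hash function.

```python
import hashlib


def h(request_id: int) -> int:
    """Hash a request ID to a large integer (SHA-256 is an assumed, illustrative choice)."""
    digest = hashlib.sha256(str(request_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(request_id: int, n_servers: int) -> int:
    """Server index under modulo hashing: h(r) % N."""
    return h(request_id) % n_servers


def moved_fraction(old_n: int, new_n: int, n_requests: int = 10_000) -> float:
    """Fraction of request IDs that map to a different server index after N changes."""
    moved = sum(
        1
        for r in range(n_requests)
        if pick_server(r, old_n) != pick_server(r, new_n)
    )
    return moved / n_requests


if __name__ == "__main__":
    fraction = moved_fraction(4, 5)
    print(f"Requests remapped when going from 4 to 5 servers: {fraction:.1%}")
```

For uniformly distributed hash values, only about 1 in (N+1) requests keeps its server index when going from N to N+1 servers, so with N = 4 roughly 80% of requests should be remapped.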
Things to avoid
1) Huge changes in the server index that each request maps to