The History of the Cloud, Part I – Server Load Balancing
Isn’t it nice to know that when you are traveling, you have access to digital maps and precise location technologies like GPS? It is even nicer that with a mobile connection, your map can update your route in real time based on traffic and road conditions. If there is an accident or road work, the navigation system can automatically calculate a new, optimized route to your destination.
Imagine if we could apply these concepts to navigating to the various pieces of content on the Internet (and intranet). This would be a great tool to ensure our ability to get to the sites we are looking for.
The ‘cloud’ is designed to provide a similar capability for the Internet. When an application or data is ‘in the cloud’ we assume that it is available anytime, from anywhere. Before we started utilizing cloud services in earnest, we were using proto-cloud technologies that are now a core part of most, if not all, cloud architectures.
Before the cloud there was…
You may be surprised that one of the key technologies required to provide this capability has been around, in one form or another, since 1996. Just because a technology is mature does not mean that it is well known or well understood. In this case, I am talking about server load balancing (SLB). As networks grew in the 1990s, the demand for applications surpassed the capabilities of individual servers. We needed a way to take multiple requests for a similar resource and distribute them effectively across multiple servers to share the load.
A server load balancer, otherwise known as a load balancer, is a traffic manager, a traffic inspector, and a traffic manipulator. An SLB acts on behalf of the applications it is configured for. It accepts the connections for an application by hosting an IP address that is advertised to the network community. Once the load balancer receives a request to connect to the application, it decides which actual server to send the request to from a pool of available servers and re-addresses the original request to that actual server. The server load balancer will now act as a middleman, monitoring and proxying the data for the entire connection as long as it is active.
In technical terms, the SLB is a reverse-proxy hosting a virtual IP address (VIP) and doing destination network address translation (NAT) from the VIP to the real server’s IP address. For the life of the connection, the SLB will maintain a session table entry and continue to map that specific connection to the real server to maintain that session persistence.
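The VIP-to-real-server mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the addresses, and the round-robin selection are hypothetical, and a real load balancer rewrites packets rather than returning strings.

```python
from itertools import cycle
from typing import Dict, List, Tuple

# A client connection is identified here by (client_ip, client_port).
ConnKey = Tuple[str, int]


class VirtualServer:
    """Hypothetical model of one VIP in front of a pool of real servers."""

    def __init__(self, vip: str, real_servers: List[str]) -> None:
        self.vip = vip
        # Round-robin selection is an assumption; the article names no algorithm.
        self._next_server = cycle(real_servers)
        # Session table: one entry per active client connection.
        self.sessions: Dict[ConnKey, str] = {}

    def translate(self, client_ip: str, client_port: int, dst_ip: str) -> str:
        """Return the real server address that replaces the VIP as destination."""
        if dst_ip != self.vip:
            raise ValueError(f"{dst_ip} is not hosted by this load balancer")
        key = (client_ip, client_port)
        # Reuse an existing entry so the whole connection stays on one server.
        if key not in self.sessions:
            self.sessions[key] = next(self._next_server)
        return self.sessions[key]

    def close(self, client_ip: str, client_port: int) -> None:
        """Drop the session table entry when the connection ends."""
        self.sessions.pop((client_ip, client_port), None)


slb = VirtualServer(vip="203.0.113.10", real_servers=["10.0.0.11", "10.0.0.12"])
first = slb.translate("198.51.100.7", 40001, "203.0.113.10")
again = slb.translate("198.51.100.7", 40001, "203.0.113.10")  # same connection
other = slb.translate("198.51.100.8", 40002, "203.0.113.10")  # new connection
```

Here the first client is mapped to one real server for both of its lookups, while the second client lands on the next server in the pool.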
With SLB technologies, it is now possible to scale a single application beyond the performance capabilities of a single server easily and efficiently. An SLB instance can host a single VIP that is responsible for dozens of real servers.
As SLB technologies evolved, three core sets of functionality were enhanced. First, the load balancer needed a way to monitor the status of the real servers it was sending connections to. Initially, there were simple health checks such as an ICMP ping to see if the server was responding on the network, or a simple HTTP GET / request to ensure that the application was responding. Today, health checks can monitor server resources through advanced scripts and queries to the server and application. Other metrics can also be used, such as the response time of the application or the capacity of the server relative to the other servers in the pool.
If a server fails the periodic health checks, the load balancer can stop sending connections to that server. When the server becomes available, the health checks will succeed again, and the load balancer will start sending clients to the newly available server. This means that individual server failures will not disrupt application availability. It also means that if more servers are needed to scale the application, they can easily be added without any noticeable impact on the clients.
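A rough sketch of that health-check loop follows. The class, the failure threshold, and the stubbed probe are assumptions made for illustration; a real probe would send an ICMP ping or an HTTP GET / request instead of consulting a set.

```python
from typing import Callable, Dict, List

# A probe takes a server address and reports whether the check passed.
Probe = Callable[[str], bool]


class HealthCheckedPool:
    """Hypothetical pool that hides servers failing consecutive health checks."""

    def __init__(self, servers: List[str], probe: Probe, fail_threshold: int = 3) -> None:
        self.probe = probe
        self.fail_threshold = fail_threshold
        # Consecutive failed checks per server.
        self.failures: Dict[str, int] = {server: 0 for server in servers}

    def run_checks(self) -> None:
        """Run one round of periodic checks against every server."""
        for server in self.failures:
            if self.probe(server):
                self.failures[server] = 0  # a success restores the server
            else:
                self.failures[server] += 1

    def available(self) -> List[str]:
        """Servers that may currently receive new connections."""
        return [s for s, n in self.failures.items() if n < self.fail_threshold]

    def add_server(self, server: str) -> None:
        """Scale out; simplification: the new server is trusted until it fails."""
        self.failures.setdefault(server, 0)


down = {"10.0.0.12"}  # stand-in for a server that stopped responding
pool = HealthCheckedPool(
    ["10.0.0.11", "10.0.0.12"], probe=lambda s: s not in down, fail_threshold=2
)
pool.run_checks()
pool.run_checks()
after_failure = pool.available()

down.clear()  # the server comes back
pool.run_checks()
after_recovery = pool.available()

pool.add_server("10.0.0.13")
after_scale_out = pool.available()
```

Requiring several consecutive failures before removing a server is a design choice assumed here to avoid reacting to a single lost probe; the article does not specify one.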
Second, the load balancer started becoming more intelligent in the way it recognized connection requests and distributed those connections to the real servers. Since the load balancer is proxying the connection, it has the opportunity to inspect the request and determine whether any special handling is necessary. The load balancer can have different sub-pools of servers for an application based on the type of content. Static content requests can go to a different sub-pool than dynamic, scripted content. Different HTTP user-agents can be directed towards different servers. The servers can become more specialized, and it is possible to fine-tune each server's design based on its function.
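As a minimal sketch of this kind of request inspection, the function below picks a sub-pool from the request path and the User-Agent header. The pool names, addresses, file suffixes, and the "Mobile" substring test are all hypothetical choices, not rules from any particular product.

```python
from typing import Dict, List

# Hypothetical sub-pools for a single application.
POOLS: Dict[str, List[str]] = {
    "static": ["10.0.1.11", "10.0.1.12"],
    "dynamic": ["10.0.2.11", "10.0.2.12"],
    "mobile": ["10.0.3.11"],
}

# File suffixes treated as static content in this sketch.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def choose_pool(path: str, user_agent: str) -> str:
    """Return the name of the sub-pool that should handle the request."""
    # User-agent rule first: send mobile browsers to their own servers.
    if "Mobile" in user_agent:
        return "mobile"
    # Ignore the query string when looking at the file suffix.
    resource = path.split("?", 1)[0].lower()
    if resource.endswith(STATIC_SUFFIXES):
        return "static"
    return "dynamic"


static_choice = choose_pool("/img/logo.png", "Mozilla/5.0 (X11; Linux x86_64)")
dynamic_choice = choose_pool("/cart?id=3", "Mozilla/5.0 (X11; Linux x86_64)")
mobile_choice = choose_pool("/img/logo.png", "Mozilla/5.0 (iPhone) Mobile Safari")
```

The order of the rules matters: in this sketch a mobile client is sent to the mobile sub-pool even when it asks for a static file.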
Last, since the load balancer is acting as a proxy and can inspect and manipulate the content, it can transparently enhance the delivery of the content. Technologies such as HTTP multiplexing to reuse TCP connections, TCP optimization, inline compression, and SSL termination offloading are all functions that the load balancer has incorporated to improve the delivery of the application content to the consumer.
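Inline compression is the easiest of these functions to illustrate. The sketch below assumes a hypothetical helper that sits between the real server's response and the client; the function name and the size threshold are invented for the example, and it deliberately ignores details such as `q=0` weights in the Accept-Encoding header.

```python
import gzip
from typing import Dict, Tuple


def maybe_compress(
    body: bytes, accept_encoding: str, min_size: int = 256
) -> Tuple[bytes, Dict[str, str]]:
    """Compress a response body on the way to the client when it helps."""
    headers: Dict[str, str] = {}
    # Simplification: strip parameters such as ";q=0.8" and ignore their value.
    offered = [e.split(";")[0].strip().lower() for e in accept_encoding.split(",")]
    # Very small bodies are left alone; compressing them rarely pays off.
    if "gzip" in offered and len(body) >= min_size:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers


page = b"<p>hello</p>" * 100
compressed, headers = maybe_compress(page, "gzip, deflate")
plain, plain_headers = maybe_compress(page, "identity")
```

The real server sends the same uncompressed page in both cases; only the client that advertises gzip support receives the smaller body.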
Today, load balancers have evolved into application delivery controllers (ADCs) that do much more than standard load balancing, leveraging the fact that they are proxying the application connection. Application-centric functions like web application firewall (WAF) and single sign-on (SSO) have become part of the ADC portfolio.
The van der Waals force of the IT cloud
Ultimately, all of these technologies and features are designed to ensure that within a datacenter, an application is available and scalable. For high availability scenarios, pairs of load balancers can be configured in an active-standby or active-active model to ensure that no single point of failure disrupts access to the application. The capabilities of the load balancer provide the core functions that modern clouds need to work properly. To achieve the agility and elasticity that we expect in today’s cloud architectures, we need to remember that the SLB technologies we were using before the evolution of today’s cloud are still a key part of the standard cloud architectures.