Create and Use a Load Balancer with Kubernetes

For all but the simplest Kubernetes deployments, efficiently distributing client requests across multiple pods should be a priority. A load balancer routes requests to healthy pods in order to optimize performance and ensure the reliability of your application. With a load balancer, the demands on your application are shared across the pods serving it, so that all available resources are utilized and no single pod is overburdened.

What is Load Balancing on Kubernetes?

To understand load balancing on Kubernetes, you must first understand some Kubernetes basics: a “pod” in Kubernetes is a group of one or more containers that are deployed and scheduled together, and a “service” is an abstraction that exposes a set of related pods as a single network endpoint. This level of abstraction insulates the client from the containers themselves. Pods can be created and destroyed by Kubernetes automatically, and they are not expected to be persistent. Since every new pod is assigned a new IP address, pod IP addresses are not stable; therefore, communicating with a pod directly is generally unreliable. A service, however, has its own relatively stable IP address; thus, a request from an external client is made to the service rather than to a pod, and the service then dispatches the request to an available pod.
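As a sketch of this relationship, a minimal service manifest might look like the following; the names and labels (my-app-service, app: my-app) and the ports are illustrative assumptions, not values from this article:

```yaml
# Hypothetical Service that fronts every pod labeled "app: my-app".
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app        # requests are dispatched to any pod carrying this label
  ports:
    - port: 80         # port the service exposes
      targetPort: 8080 # port the pods' containers listen on
```

Because the service matches pods by label rather than by IP address, pods can come and go without clients ever needing to know.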

An external load balancer applies logic that ensures the optimal distribution of these requests. To create one, your cluster must be hosted by a cloud provider, or in an environment that supports external load balancers, with the appropriate cloud provider integration configured. This includes major cloud providers such as AWS, Azure, and GCP. You will also need to install the kubectl command-line tool and configure it to communicate with your cluster.

How to Use an External Load Balancer in Kubernetes

To take advantage of the load balancer available in your host environment, simply edit your service configuration file to set the “type” field to “LoadBalancer.” You will also need to specify a value for the “port” field. This provisions an externally accessible IP address that sends traffic to the correct port on your cluster nodes via the external load balancer provided by your cloud provider. Alternatively, you can create the service with the kubectl expose command and its --type=LoadBalancer flag, as described in the Kubernetes documentation. Once you apply the changes to your config file, use kubectl get services to see the status of the load balancers you’ve provisioned.

kubectl --kubeconfig=[full path to cluster config file] get services
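For reference, a service definition with the “type” and “port” fields set as described above might look like this sketch; the service name and selector are hypothetical:

```yaml
# Hypothetical Service of type LoadBalancer.
apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
spec:
  type: LoadBalancer   # asks the cloud provider to provision an external load balancer
  selector:
    app: my-app        # hypothetical pod label
  ports:
    - port: 80         # externally exposed port
      targetPort: 8080 # container port behind the service
```

After you apply a manifest like this, the EXTERNAL-IP column in the kubectl get services output may show &lt;pending&gt; for a short time while the cloud provider finishes provisioning the balancer.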

If you have provisioned multiple load balancers but want to get the status of a specific one, you can use the kubectl describe services command instead. Both of these commands expose the name, cluster IP address, external IP address, port, and age of your load balancer(s). In the output of kubectl describe services, the external IP address appears next to “LoadBalancer Ingress”.
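Following the pattern of the get services command above, a describe invocation for a single service might look like this (the service name my-app-lb is illustrative):

```shell
kubectl --kubeconfig=[full path to cluster config file] describe services my-app-lb
```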

External Load Balancer Alternatives

Sometimes, a load balancer is provisioned for a cluster not to efficiently manage the traffic flowing into it, but simply to expose it to the internet. If this is your use case, you may consider setting the “type” field to “NodePort” instead. Kubernetes will then choose a port from a configured range (30000–32767 by default) and open it on every node, so that any traffic sent to that port on any node passes through to your application. While this doesn’t provide any load balancing logic of its own, it is a cheap way to route external traffic directly to your service if that’s all you require.
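A NodePort variant of the earlier sketch might look like the following; as before, the name, selector, and port numbers are illustrative, and nodePort may be omitted to let Kubernetes pick a port for you:

```yaml
# Hypothetical Service of type NodePort.
apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80         # port inside the cluster
      targetPort: 8080 # container port
      nodePort: 30080  # port opened on every node (must fall in the NodePort range)
```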

On the other hand, if you are trying to optimize traffic to multiple services, you may consider a more robust method than the LoadBalancer type suggested above. You will be charged by your cloud provider for each service that requires an external load balancer, as well as for each IP address provisioned for your balancers. Another strategy is to use Ingress, which allows you to expose multiple services under the same IP address. An Ingress is a set of rules governing how external traffic is routed to your services, and those rules are executed by an Ingress controller running in a pod in your cluster. With Ingress, you only pay for one load balancer. There are many types of Ingress controllers, and your implementation will depend on your environment, but it is safe to say that deploying Ingress requires a more complicated configuration than the process given above. Thus, you must weigh the potential cost savings against the increased complexity.
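To give a sense of what such routing rules look like, here is a sketch of an Ingress that exposes two services under one address; the hostname, paths, and service names (api-service, web-service) are hypothetical, and a working setup additionally requires an Ingress controller to be installed in the cluster:

```yaml
# Hypothetical Ingress routing two paths to two backend services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: example.com            # illustrative hostname
      http:
        paths:
          - path: /api             # requests to example.com/api ...
            pathType: Prefix
            backend:
              service:
                name: api-service  # ... go to this hypothetical service
                port:
                  number: 80
          - path: /web             # requests to example.com/web ...
            pathType: Prefix
            backend:
              service:
                name: web-service  # ... go to this one
                port:
                  number: 80
```

Both services here share a single external IP address, which is the source of the cost savings discussed above.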

Monitoring your Load Balancer

With your load balancer configured, you can trust that requests to your services will be dispatched efficiently, ensuring smoother performance and enabling you to handle greater loads. For many of us, however, trust must be earned, and “seeing is believing,” as the old adage goes. If you want to see your load balancer in action so that you know it’s operating as it should, you need a way to visualize your Kubernetes services and monitor their performance. The Sumo Logic Kubernetes App provides visibility into all your nodes, allowing you to monitor and troubleshoot load balancing along with myriad other metrics that track the health of your clusters. You can track the load placed on each service to verify that resources are being shared evenly, and you can also monitor the load balancing service itself from a series of easy-to-understand dashboards.

If your application must handle an unpredictable number of requests, a load balancer is essential for ensuring reliable performance without the cost of over-provisioning. Depending on the number of services you need to route traffic to, as well as the level of complexity you are willing to accept, you might choose to use Ingress or the external load balancing service offered by your cloud provider. Regardless of how you implement your load balancing, monitoring its performance through the Sumo Logic Kubernetes App will allow you to measure its benefits and quickly react when it is not operating as it should.