Now that we understand what machine data is available to us, how do we get to this data? The good news is that Kubernetes makes most of this data readily available; you just need the right tool to gather and view it. The solution we discuss here relies heavily on open source tools for collection and data enrichment because of their deep integrations and overwhelming community support.
Open-source data collection
Perusing the CNCF website, we quickly discover a wealth of tools built around Kubernetes to enable not only monitoring but networking, storage, and security. There is even a Kubernetes-specific package manager, Helm, to make the deployment and management of these resources easy and consistent.
There are a couple of key benefits to taking advantage of open-source collectors.
- They stay up to date. Each of these tools benefits from deep community support. As new versions of Kubernetes are released, the extensive use of each of these tools ensures they are quickly updated in turn.
- They integrate with everything. Regardless of your unique stack, it is likely that there is support for what you might want to export data from. The importance of these integrations cannot be overstated, as they enable the flexibility needed to grow and evolve a Kubernetes deployment over time.
Prometheus, a CNCF project, is the de facto tool of choice for metrics monitoring in Kubernetes. It has a huge following and extensive support for nearly anything you might want to collect metrics from.
Prometheus works by pulling data from all of the components and jobs running in Kubernetes. Every Kubernetes component exposes its metrics in the Prometheus exposition format, served by the component's running process on an HTTP URL. For example, the Kubernetes API Server serves its metrics on https://$API_HOST:443/metrics. Prometheus is particularly good at auto-discovering the jobs and services currently running in a cluster: as pods are added, removed, or restarted, the Kubernetes Service construct keeps track of which pods exist for a given service. This auto-discovery capability is one of the primary reasons for Prometheus' popularity, ensuring that all new and existing components are monitored.
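To make the pull model concrete, here is a minimal sketch of parsing the exposition format a /metrics endpoint returns. The sample payload and its label values are illustrative, and the parser is deliberately naive (it does not handle commas inside quoted label values); real endpoints expose hundreds of series.

```python
# Sample of the text-based Prometheus exposition format, as served on /metrics.
SAMPLE = """\
# HELP apiserver_request_total Counter of apiserver requests.
# TYPE apiserver_request_total counter
apiserver_request_total{verb="GET",code="200"} 12450
apiserver_request_total{verb="POST",code="201"} 837
"""

def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping metadata lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comment lines carry metadata, not samples
        metric, value = line.rsplit(" ", 1)
        if "{" in metric:
            name, raw = metric.split("{", 1)
            labels = dict(pair.split("=", 1) for pair in raw.rstrip("}").split(","))
            labels = {k: v.strip('"') for k, v in labels.items()}
        else:
            name, labels = metric, {}
        samples.append((name, labels, float(value)))
    return samples

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Prometheus performs this scrape on an interval for every discovered target, which is why a component only needs to serve current values; history accumulates on the Prometheus side.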
Kubernetes does not define a single standard approach to log collection, but the most common method is called cluster-level logging. Cluster-level logging deploys a node-level logging agent to each node, which funnels data to a separate backend for storage and analysis. The primary benefit of this approach is that if a pod dies, the logs detailing what happened are retained; node-level logging alone, without a logging backend, loses log data when pods are evicted or die. Cluster-level logging ensures that data is captured and retained. A common tool for implementing it is Fluentd — or Fluent Bit, a lightweight version of Fluentd — which acts as the node-level logging agent funneling data to a logging backend, like Sumo Logic.
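The agent's core job can be sketched in a few lines: read new lines from a node-local log stream and hand them off to a backend that outlives the pod. The "backend" here is just an in-memory list standing in for a remote store; a real agent such as Fluentd or Fluent Bit batches, tags, retries, and ships over the network.

```python
import io

def forward_logs(stream, backend):
    """Read lines from a container log stream and hand them to the backend."""
    for line in stream:
        line = line.rstrip("\n")
        if line:
            backend.append(line)  # a real agent would batch, tag, and retry here

# Stand-ins: the "backend" would be a remote service, the stream a log file on the node.
backend = []
log_file = io.StringIO("pod started\nreadiness probe passed\n")
forward_logs(log_file, backend)
print(backend)
```

Because the backend lives outside the node, the forwarded lines survive even if the pod (or the node's local log files) disappears, which is the whole point of the cluster-level pattern.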
Events provide insight into decisions made by the cluster and into unexpected occurrences in Kubernetes. Events are stored in the API server on the master node and collected using the same method as log collection — via a node-level logging agent like Fluentd.
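Once collected, events are structured records rather than free-form log lines, which makes them easy to filter. The sample below mimics three fields of the Kubernetes Event object (type, reason, message) with illustrative values; real events carry many more fields, such as the involved object and timestamps.

```python
# Illustrative event records, shaped like a few fields of the v1 Event object.
EVENTS = [
    {"type": "Normal", "reason": "Scheduled",
     "message": "Successfully assigned default/web to node-1"},
    {"type": "Warning", "reason": "BackOff",
     "message": "Back-off restarting failed container"},
    {"type": "Normal", "reason": "Pulled",
     "message": "Container image already present on machine"},
]

def warnings(events):
    """Return only Warning-type events, the usual starting point for triage."""
    return [e for e in events if e["type"] == "Warning"]

for event in warnings(EVENTS):
    print(event["reason"], "-", event["message"])
```

Filtering on the Warning type is a common first cut when troubleshooting, since Normal events mostly record routine scheduling and image pulls.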
Setup using Helm
Finally, collectors for logs, metrics, events, and security data can be easily deployed using Helm. Helm can significantly simplify the setup process, reducing hundreds of lines of configuration to one. These collection plugins can be used on any Kubernetes cluster, whether one from a managed service like Amazon Elastic Kubernetes Service (EKS) or a cluster you run entirely on your own.