Kubernetes is great but complex!

Whether the goal is to enable hybrid and multi-cloud deployments, promote deeper specialization among development teams, enhance reliability, or simply stay ahead of the curve, organizations are reaping the varied benefits of this technology investment, but it comes at a cost. Every optimization has tradeoffs. Each layer of abstraction reduces visibility, which adds complexity when something goes wrong. As organizations race to adopt Kubernetes, unique challenges emerge that stretch the limits of existing monitoring solutions.

There are many more things to monitor

Instead of a static set of physical or virtual machines, administrators now monitor containers, which are orders of magnitude more numerous and have much shorter lifespans. Thousands of containers now live for mere minutes while serving millions of users across hundreds of services. In addition to the containers themselves, administrators must also monitor the Kubernetes system and its many components to ensure they are all operating as expected. Most tools come up short when trying to display the sheer volume of information pouring out of a containerized environment.
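To get a feel for the scale, here is a rough back-of-the-envelope sketch. The figures (1,000 concurrent containers, a 10-minute lifespan) are assumptions chosen for illustration, not measurements from any real cluster, and the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def distinct_containers_per_day(concurrent: int, lifespan_minutes: int) -> int:
    """Estimate how many distinct containers exist over one day at steady state."""
    # Each concurrently running "slot" is refilled once per container lifespan.
    turnovers_per_day = MINUTES_PER_DAY // lifespan_minutes
    return concurrent * turnovers_per_day


# Assumed figures, for illustration only.
static_vms = 1_000
containers = distinct_containers_per_day(concurrent=1_000, lifespan_minutes=10)

print(containers)                # distinct containers to track in a day
print(containers // static_vms)  # multiple of an equally sized static fleet
```

Under these assumptions, a monitoring tool has to track far more distinct entities per day than it would for a static fleet of the same size, even though the number running at any moment is identical.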

Everything is ephemeral

Everything in Kubernetes is, by design, ephemeral. Kubernetes achieves its elastic ability to scale and contract by taking control over how pods, and the containers within those pods, are deployed. When a job needs to be done, Kubernetes schedules a pod. When the job is complete, the pod is destroyed just as freely. But zoom out and we notice that Kubernetes has made the nodes replaceable as well. When a server dies, its pods are rescheduled to available nodes. Zoom out yet again to the clusters, and these too are just as easily replaced.

You have to zoom all the way out to the services to find a component with any staying power inside Kubernetes. Services and deployments represent the core application. They still change, but much less often than their underlying components. Most tools weren't designed to look at an environment from the perspective of these logical abstractions, yet these abstractions are how Kubernetes organizes itself. Kubernetes can be viewed through several hierarchies: service-centric, namespace-centric, deployment-centric, or node-centric. Tools should have the flexibility to view Kubernetes through these various lenses.

Aggregate pod data to align with various Kubernetes hierarchies.
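As a minimal sketch of what such aggregation could look like, the snippet below rolls up pod-level CPU samples along different hierarchy dimensions. The sample records, field names, and the `aggregate` helper are hypothetical and exist only for illustration. A real setup would read this metadata from the cluster rather than from a hard-coded list.

```python
from collections import defaultdict

# Hypothetical pod-level samples; names and values are illustrative only.
POD_SAMPLES = [
    {"pod": "checkout-1", "namespace": "shop", "deployment": "checkout", "node": "node-a", "cpu_millicores": 120},
    {"pod": "checkout-2", "namespace": "shop", "deployment": "checkout", "node": "node-b", "cpu_millicores": 80},
    {"pod": "cart-1", "namespace": "shop", "deployment": "cart", "node": "node-a", "cpu_millicores": 50},
    {"pod": "ingest-1", "namespace": "data", "deployment": "ingest", "node": "node-b", "cpu_millicores": 200},
]


def aggregate(samples, dimension, metric="cpu_millicores"):
    """Sum a pod-level metric along one hierarchy dimension."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[dimension]] += sample[metric]
    return dict(totals)


# The same pod data, viewed through three different lenses.
for dimension in ("namespace", "deployment", "node"):
    print(dimension, aggregate(POD_SAMPLES, dimension))
```

Because the rollup is keyed on the stable abstraction (namespace, deployment, node) rather than on the pod name, the totals remain meaningful even as individual pods come and go.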

Tools are distributed

Between logging tools, metrics tools, GitHub, and even SSH, engineers constantly switch between a variety of tools to gain a complete picture of their system, which is what observability requires. Walking through a typical alert investigation gives a quick sense of this. An alert comes in and we immediately check the logs to find out more about the specific problem. Running through a mental checklist of potential problems, we log into GitHub to see if any new code has been pushed. Did Kubernetes make any scheduling decisions? What are the upstream and downstream dependencies of the error we are seeing? And so on. Rarely are the answers to the puzzle nicely connected and in one place. But the more they are, the quicker we can resolve the issue.

Slack/PagerDuty - Get an application alert
Logging backend - Check application logs
GitHub - Check the Kubernetes configuration (limits or requests settings)
Metrics monitoring backend - Check for events that happened in Kubernetes
Metrics monitoring backend/kubectl - Check whether Kubernetes made any scheduling decisions
GitHub - Check whether new code was pushed
Mental model - Think through the application's upstream and downstream dependencies
Metrics monitoring backend - Compare metrics to see whether the problem occurs in both production and dev environments
Cloud provider - Check with the cloud provider to see if limits are being hit
Metrics monitoring backend - Check metrics at the node/server/VM level
kubectl - Check kernel metrics
SSH - Check pod and node networking
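
The checklist above amounts to manually correlating events from separate tools around the time of an alert. A minimal sketch of that idea follows, assuming each tool's events have already been exported into a common shape. The `EVENTS` records, their timestamps, and the `build_timeline` helper are all hypothetical and do not represent any particular product's API.

```python
from datetime import datetime, timedelta

# Hypothetical events exported from separate tools; all values are illustrative.
EVENTS = [
    {"source": "logs", "time": datetime(2024, 1, 1, 9, 58), "message": "checkout returned HTTP 500"},
    {"source": "github", "time": datetime(2024, 1, 1, 8, 0), "message": "docs update merged"},
    {"source": "kubernetes", "time": datetime(2024, 1, 1, 9, 55), "message": "pod rescheduled to another node"},
    {"source": "github", "time": datetime(2024, 1, 1, 9, 40), "message": "new code pushed to checkout"},
]


def build_timeline(alert_time, events, lookback=timedelta(minutes=30)):
    """Return events from the lookback window before the alert, oldest first."""
    start = alert_time - lookback
    relevant = [e for e in events if start <= e["time"] <= alert_time]
    return sorted(relevant, key=lambda e: e["time"])


alert_time = datetime(2024, 1, 1, 10, 0)
for event in build_timeline(alert_time, EVENTS):
    print(event["time"].strftime("%H:%M"), event["source"], event["message"])
```

Merging the sources into one ordered timeline puts the code push, the scheduling decision, and the error side by side, which is the connection an engineer otherwise has to make by switching between tools.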


Challenges of Monitoring and Troubleshooting in Kubernetes Environments


Navigate Kubernetes with Sumo Logic

Katie Lane
