Highlights of State of Kubernetes Cost Optimization Report - Part 1

We take a look at Google's K8s report based on real-world large-scale anonymized data.

Elizabeth Flowers
July 04, 2023

Rising container costs and the need to reduce waste are especially challenging for developers working with Kubernetes clusters. Kubernetes is a complex, distributed container management system, and many of its features can lead to over-provisioning when they are not used correctly. Google’s State of Kubernetes Cost Optimization report analyzes real-world, large-scale anonymized data using four Golden Signals to show the best practices for managing Kubernetes clusters efficiently without sacrificing workload reliability and performance.

The report uses these Kubernetes cost-optimization “Golden Signals” to segment clusters into levels of cost optimization performance: Elite, High, Medium, Low, and At Risk. The signals fall into two groups: the Resources group focuses on getting the maximum performance out of the CPU and memory you are paying for, and the Cloud Discounts group focuses on taking advantage of cloud provider discounts.

Golden Signals Resources Group

Workload Rightsizing: Requesting the same amount of resources you actually use.

Elite performers use their clearer understanding of their applications’ appetite for resources in testing, staging, and production environments to avoid over- or under-provisioning their clusters.

Demand-Based Downscaling: Using fewer resources during low-demand periods.

Scaling down a cluster requires cooperation between developers and platform teams. If developers don’t scale down their workloads, platform teams’ options are limited. If, however, developers scale down their workloads during off-peak hours, platform admins can employ Cluster Autoscaler to meet business needs efficiently.

Cluster Bin Packing: Filling up nodes efficiently by fully allocating the CPU and memory of each node through Pod placement.

Developers and Platform teams have a shared responsibility to collaborate on finding the appropriate machine shape for various workloads.
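As a rough illustration of how bin packing can be quantified, the sketch below compares the sum of Pod requests on a node with what the node can allocate. The function name `bin_packing_ratio`, the `Node` type, and all the numbers are hypothetical, and the report may compute this signal differently.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Capacity the node can hand out to Pods (illustrative units).
    allocatable_cpu_millicores: int
    allocatable_memory_mib: int


def bin_packing_ratio(node, pod_requests):
    """Return (cpu_ratio, memory_ratio): requested / allocatable for one node.

    pod_requests is a list of (cpu_millicores, memory_mib) tuples,
    one per Pod scheduled on the node.
    """
    cpu = sum(cpu for cpu, _ in pod_requests)
    mem = sum(mem for _, mem in pod_requests)
    return (cpu / node.allocatable_cpu_millicores,
            mem / node.allocatable_memory_mib)


# Hypothetical node and Pod requests.
node = Node(allocatable_cpu_millicores=4000, allocatable_memory_mib=16384)
requests = [(500, 1024), (1000, 4096), (500, 3072)]

cpu_ratio, mem_ratio = bin_packing_ratio(node, requests)
print(f"CPU packed: {cpu_ratio:.0%}, memory packed: {mem_ratio:.0%}")
```

A node that stays half full on both dimensions suggests that a smaller machine shape, or more Pods per node, may be worth discussing between developers and platform teams.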

Golden Signals Cloud Discounts Group

Discount Coverage: Using discounted cloud capacity wisely.

Discount coverage measures the capacity of platform admins to leverage machines that offer significant discounts, such as Spot VMs, as well as the capacity of budget owners to take advantage of long-term continuous use discounts offered by cloud providers.

BestEffort, Burstable, and Guaranteed Pods

BestEffort Pods don’t have any requests or limits set for their containers. These pods are killed first when a node runs out of memory. BestEffort pods are meant exclusively for running workloads where it is not a problem if the workload doesn't run right away, if it takes longer to finish, or if it is inadvertently restarted or killed.

Burstable Pods have containers with resource requests and either higher limits or no limits at all. When these Pods use more memory than they have requested, they are also among the first on the chopping block when the node is under resource pressure. Burstable Pods should not constantly run over their requested resources; bursting is meant to provide occasional boosts of additional resources.

Guaranteed Pods have CPU and memory requests equal to their limits in every container. They are used for workloads with strict resource needs. They cannot burst or run over their requests, so they are the most reliable and the last to be killed when a node is running low on resources.
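The three classes can be summarized as a small decision function. This is a simplified sketch of the classification rules, not the kubelet's actual code: the function name `qos_class` and the dict layout are assumptions made for illustration, and quantities are compared as plain strings.

```python
def qos_class(containers):
    """Classify a Pod as BestEffort, Burstable, or Guaranteed.

    Each container is a dict such as:
        {"requests": {"cpu": "250m", "memory": "256Mi"},
         "limits":   {"cpu": "500m", "memory": "256Mi"}}
    Simplification: quantities are compared as strings, so "1" and
    "1000m" would not be treated as equal here.
    """
    resources = ("cpu", "memory")
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        # Only CPU and memory count toward the QoS class.
        if any(r in requests or r in limits for r in resources):
            any_set = True
        for r in resources:
            limit = limits.get(r)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(r, limit)
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


sized = {"cpu": "250m", "memory": "256Mi"}
print(qos_class([{}]))
print(qos_class([{"requests": sized}]))
print(qos_class([{"requests": sized, "limits": sized}]))
```

The three calls cover a Pod with nothing set, a Pod with requests only, and a Pod whose requests match its limits.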

Key Findings

Kubernetes uses CPU and memory requests for bin packing, scheduling, cluster autoscaling, and horizontal workload autoscaling. Kubernetes also uses resource requests to classify a Pod's Quality of Service (QoS) class to make decisions about which Pods should be immediately killed when a given node's utilization approaches its capacity.

Not setting requests assigns the BestEffort class to Pods by default.

Whenever Kubernetes needs to reclaim resources on a node under pressure, BestEffort Pods are the first to be killed. This occurs without any warning or graceful termination, potentially causing disruption to the overall application. Burstable workloads that constantly use more memory than they have requested cause a similar problem.

Indiscriminately deploying these kinds of workloads impacts the behavior of Kubernetes bin packing, scheduling, and autoscaling algorithms. It can also detract from end user experience, which can affect a business as a whole.

Workload Rightsizing Comes First! (or you’ll have to redo work on cluster bin packing and discount coverage)

Most workload resource requests are substantially over-provisioned. Even Elite performers that efficiently set memory requests have room for improvement when it comes to setting CPU requests. The research also found that workload rightsizing is the biggest opportunity to reduce resource waste.

Some clusters struggle to balance reliability and cost-efficiency.

Some performers with fairly large clusters are trying to optimize costs but unintentionally run too many BestEffort Pods and memory-underprovisioned Burstable Pods, so their nodes are often overloaded. Most performers didn’t know they were running so many BestEffort and Burstable Pods, or didn’t know the consequences of doing so.

Cost optimization without considering reliability impacts user experience.

Clusters with lots of BestEffort and Burstable Pods can show low bin packing, because these Pods request little or nothing and the nodes look emptier than they really are. Platform admins might see the low bin packing as an opportunity to save costs by scaling down the cluster or nodes, but that can cause disruption through sudden termination of Pods when the nodes get overloaded.

Demand-based downscaling needs proper workload autoscaling.

Cluster Autoscaler (CA) alone is not enough to make a cluster scale down during off-peak hours. To autoscale a cluster, it is necessary to properly configure workload autoscaling with Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA). The more workloads admins manage to scale down during off-peak hours, the more efficiently CA can remove nodes.
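To see why workload autoscaling frees up nodes, it helps to look at the replica calculation that the Kubernetes documentation gives for HPA: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that formula with a tolerance band and replica bounds. The function name, the default bounds, and the sample numbers are illustrative assumptions, and the real controller also accounts for missing metrics, Pod readiness, and stabilization windows, which are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Apply the documented HPA scaling formula to a single metric."""
    ratio = current_metric / target_metric
    # Skip scaling when usage is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Keep the result within the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Off-peak: 8 replicas at 20% CPU utilization against a 60% target.
print(desired_replicas(8, 20, 60))
# Peak: 3 replicas at 90% CPU utilization against a 60% target.
print(desired_replicas(3, 90, 60))
```

In the off-peak case the workload shrinks from 8 replicas to 3, and it is those freed requests that give CA room to remove nodes.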

Take advantage of cloud discounts.

Elite and High performers run the largest clusters, so they tend to have better in-house Kubernetes expertise and dedicated teams focusing on cost optimization activities. This allows them to incorporate significantly discounted Spot VMs and to forecast long-term commitments.

Pods that don’t set requests properly compromise cost computation.

Most Kubernetes cost allocation tooling leverages resource requests to compute costs. When Pods do not set requests properly, such as when using BestEffort Pods and under-provisioned Burstable Pods, it can limit showback and chargeback accuracy. For example, even when a given BestEffort Pod consumes a large amount of CPU and memory from a cluster, no cost is attributed to the Pod because it requests no resources. These chargeback or showback solutions are more likely to attribute inaccurate costs to teams or divisions.
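A small worked example makes the BestEffort blind spot concrete. The sketch below splits a cluster bill in proportion to CPU requests, which is one simple way request-based allocation can work; the function name, the Pod names, and all figures are hypothetical, and real tools use more elaborate models.

```python
def allocate_cost_by_requests(cluster_cost, pods):
    """Split cluster_cost across Pods in proportion to their CPU requests."""
    total = sum(p["cpu_request"] for p in pods)
    return {
        p["name"]: (cluster_cost * p["cpu_request"] / total if total else 0.0)
        for p in pods
    }


# Hypothetical Pods: requests and actual usage, both in CPU cores.
pods = [
    {"name": "api", "cpu_request": 2.0, "cpu_used": 1.0},
    {"name": "worker", "cpu_request": 2.0, "cpu_used": 1.5},
    {"name": "batch-besteffort", "cpu_request": 0.0, "cpu_used": 3.0},
]

costs = allocate_cost_by_requests(100.0, pods)
for pod in pods:
    print(f'{pod["name"]}: used {pod["cpu_used"]} cores, charged {costs[pod["name"]]:.2f}')
```

Here the BestEffort Pod uses more CPU than the other two combined yet is charged nothing, while the teams that set requests split the whole bill.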

Best Practices for Balancing Reliability and Cost-Efficiency:

Use BestEffort Pods only for workloads that don’t need to be very reliable. Don’t just allow the BestEffort classification to be assigned by default.
Set memory requests equal to memory limits for all containers in Burstable Pods. CPU limits can be set higher than CPU requests because Kubernetes can throttle CPU whenever needed, although throttling can slow application performance. Setting higher limits for CPU allows applications to use idle CPU from nodes without worrying about abrupt workload termination.

Example configuration
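A minimal sketch of a Burstable Pod that follows the second practice, with the memory request equal to the memory limit and a CPU limit above the CPU request. It is written as a Python dictionary and printed as JSON, which kubectl accepts as well as YAML. The Pod name, container name, image, and resource values are all hypothetical placeholders.

```python
import json

# Hypothetical Pod manifest; names, image, and sizes are placeholders.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "registry.example.com/example-app:1.0",
                "resources": {
                    # Memory request equals memory limit, so the container
                    # never relies on memory it did not request.
                    # CPU limit is higher than the request, so the container
                    # can use idle CPU and is throttled rather than killed.
                    "requests": {"cpu": "250m", "memory": "512Mi"},
                    "limits": {"cpu": "1", "memory": "512Mi"},
                },
            }
        ]
    },
}

print(json.dumps(pod, indent=2))
```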

Conclusion

In order to increase resilience and reinvest resource savings into the consumer experience, organizations running clusters in the public cloud must optimize cloud costs. Developers in organizations ramping up on Kubernetes must train on best practices, so that they can learn while they are migrating to the cloud.

In Part Two, we’ll highlight the Elite performers who are setting the best practices, the At-Risk performers who are doing (almost) everything right but still struggle, and what to learn from each.