
Highlights of State of Kubernetes Cost Optimization Report - Part 2

We wrap up our review of Google's K8s cost optimization report.

Highlights Part Two

In Part One of the highlights from the Google Kubernetes Cost Optimization Report (read here), we learned how the report used four “Golden Signals” to segment clusters into different levels of cost optimization performance: Elite, High, Medium, Low, and At Risk. The At Risk segment has the highest probability across all segments of negatively impacting end user experience. High and Elite performers, on the other hand, have adopted a continuous practice of measuring and improving the Kubernetes cost optimization golden signals, which lets them run lower priced workloads in the public cloud. To meet the demand for cost-efficient Kubernetes clusters without compromising reliability and performance, developers must know what to do and what to avoid.

What are At Risk Performers doing wrong?

At Risk segments have clusters where the sum of actual resource utilization is generally higher than the sum of their workloads' requested resources. This is caused by Pods that use more resources than they have requested, such as BestEffort Pods and memory under-provisioned Burstable Pods. The result is an increased risk of intermittent, hard-to-debug reliability issues caused by the way Kubernetes reclaims resources under node pressure.
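
To make the distinction concrete, here is a minimal sketch of the two Pod shapes in question (the names, images, and values are illustrative, not taken from the report). The first Pod sets no requests or limits at all, so Kubernetes assigns it the BestEffort QoS class; the second requests far less memory than the application actually uses at peak, making it a memory under-provisioned Burstable Pod:

    apiVersion: v1
    kind: Pod
    metadata:
      name: besteffort-example            # hypothetical name
    spec:
      containers:
        - name: app
          image: nginx:1.25
          # no resources block at all -> QoS class: BestEffort
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: burstable-underprovisioned    # hypothetical name
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: 250m
              memory: 64Mi                # far below actual peak usage -> under-provisioned
            limits:
              memory: 512Mi               # requests set, but not equal to limits -> QoS class: Burstable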

The default Kubernetes behavior prefers to schedule incoming Pods on nodes with low bin packing, but the bin packing algorithm doesn't take actual resource utilization into account; it only considers requested resources. The Kubernetes scheduler therefore keeps placing incoming BestEffort and under-provisioned Burstable Pods on a few low bin packed nodes, causing those nodes' actual utilization to climb well above what was requested. This can then trigger the kubelet's "self-defense mode", which terminates Pods immediately to reclaim the starved resources, without any warning or graceful termination.

To avoid time spent debugging intermittent and hard-to-predict application errors that impair end user experience, use both BestEffort Pods and memory under-provisioned Burstable Pods with caution. Because they can be killed by the kubelet whenever a node is under pressure, application developers and operators must fully understand the consequences of running them.

When to use BestEffort and memory under-provisioned Burstable Pods?

Both BestEffort Pods and memory under-provisioned Burstable Pods remain useful for utilizing a cluster's temporary idle capacity, but developers should adopt these kinds of Pods only for best effort workloads that can be killed abruptly at any moment, without any notice.

What are Elite performers doing right?

Because Elite performers surpass the other segments on both the golden signals and the adoption of cost optimization features (e.g. Cluster Autoscaler, Horizontal Pod Autoscaler, cost allocation tools), their teams presumably have a better understanding of the Kubernetes platform and are better prepared to forecast long-term commitments. These capabilities make it possible for them to run considerably lower priced workloads compared to other segments.

High and Elite performers run the largest clusters across all segments. Larger clusters allow for better utilization of resources. For example, while a few workloads are idle, other workloads can be bursting. Large clusters also tend to be multi-tenant clusters. The higher the number of tenants, the more operational challenges need to be managed. Additionally, the community has seen a rapid increase in deployments of databases and other stateful workloads despite the challenges of building and operating data-centric applications on Kubernetes. High and Elite performers, who run 1.8x and 2x more StatefulSets than Low performers, are the main drivers of this increase.

The Data on Kubernetes 2022 research shows that the benefits of running data-centric applications on Kubernetes include:

  • ease of scalability

  • ability to standardize management of workloads

  • consistent environments from development to production

  • ease of maintenance

  • improved security

  • improved use of resources

  • co-location of latency sensitive workloads

  • ease of deployment

Realizing these benefits requires specialized platform teams to put in place company-wide cost optimization policies, best practices, visibility, and guardrails.

Cost Optimization Policies

High and Elite performers enable more cluster-level cost optimization features than Low performers, such as Cluster Autoscaling and cost allocation tooling. While Cluster Autoscaler allows clusters to automatically add and remove nodes as needed, cost allocation tools enable companies to implement showback or chargeback processes to allocate, or even bill, the costs associated with each department's or division's usage of multi-tenant Kubernetes clusters.
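
As a sketch of how a showback or chargeback process might start (the label keys and values below are illustrative, not from the report), tenant namespaces can be labeled with the owning team and cost center so that a cost allocation tool can group and report spend along those dimensions:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: payments               # hypothetical tenant namespace
      labels:
        team: payments             # illustrative label keys that a cost
        cost-center: "fin-4321"    # allocation tool can group spend by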

High and Elite performers use two main approaches to enable clusters to scale down workloads according to demand during off-peak hours. First, the High and Elite performers adopted workload autoscaling APIs (e.g. HPA and VPA in Auto/Init mode) more than Low performers. Second, High and Elite clusters run more jobs to completion than Low performers.

The adoption of workload autoscaling and jobs to completion is much more important for downscaling a cluster than optimizing Cluster Autoscaler for a faster and more aggressive scale down. In other words, if developers don't properly configure their workloads to scale according to demand, platform teams are limited in their ability to scale down their cluster.

All segments, including Elite performers, could improve demand-based autoscaling efficiency by increasing the adoption of HPA and VPA.
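
As a minimal sketch of demand-based workload autoscaling (the Deployment name and targets are illustrative), a HorizontalPodAutoscaler can scale replicas on CPU utilization so that a workload shrinks during off-peak hours and Cluster Autoscaler can then remove the freed-up nodes:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web-hpa                    # hypothetical name
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web                      # hypothetical Deployment to scale
      minReplicas: 2
      maxReplicas: 20
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70   # add replicas above ~70% of requested CPU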

Best Practices

High and Elite segments adhere to workload best practices. For example, developers from High and Elite performers rightsize their workloads better than Low performers and deploy fewer BestEffort Pods than clusters in the At Risk segment. These results reflect developers' better knowledge of their applications and demonstrate the value of specialized platform teams that can build tools and custom solutions. As a result, the High and Elite segments can continuously enforce policies, provide golden paths, optimize rightsizing (e.g. VPA in recommendation mode is deployed 12x and 16.1x more by these segments, respectively, than by Low performers), and provide best practices recommendations to developer teams.
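
For reference, VPA in recommendation mode might look like the sketch below (the target name is illustrative, and it assumes the Vertical Pod Autoscaler components are installed in the cluster). With updateMode set to "Off", VPA only publishes rightsizing recommendations and never evicts Pods, which is what lets platform teams feed sizing guidance back to developer teams:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: web-vpa              # hypothetical name
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web                # hypothetical Deployment to observe
      updatePolicy:
        updateMode: "Off"        # recommendation mode: report, don't evict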

(See Part 1 to read more about Best Practices for Balancing Reliability and Cost-Efficiency)

Visibility

Observability is critical to avoid being billed for resources you don't need; you can't manage what you can't see.

DevOps and platform admins should provide application owners with rightsizing recommendations while highlighting and tracking workloads that put reliability at risk, such as BestEffort Pods and memory under-provisioned Burstable Pods. It is important to adopt dashboards, warnings, alerts, and actionable strategies such as automatic issue ticketing or automatic pull requests that carry concrete recommendations. These strategies are even more effective when integrated into development workstreams, such as developer IDEs, developer portals, and CI/CD pipelines. They are useful not only for improving the reliability of the overall applications, but also for building a more cost-conscious culture.
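
One way to build that kind of visibility, sketched below under the assumption that kube-state-metrics and the Prometheus Operator are running in the cluster (names, labels, and thresholds are illustrative), is an alert that fires whenever BestEffort Pods appear in a namespace:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: qos-visibility                 # hypothetical name
    spec:
      groups:
        - name: qos
          rules:
            - alert: BestEffortPodsRunning
              # kube-state-metrics exposes a per-Pod QoS class series
              expr: sum(kube_pod_status_qos_class{qos_class="BestEffort"}) by (namespace) > 0
              for: 15m
              labels:
                severity: warning
              annotations:
                summary: "Namespace {{ $labels.namespace }} is running BestEffort Pods"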

Guardrails

DevOps and platform teams can build solutions to enforce the best practices discussed in this section using standard Kubernetes constructs, such as validation and mutation webhooks, or by using policy controller frameworks, such as Open Policy Agent (OPA) Gatekeeper. To allow teams that understand Kubernetes QoS and its consequences to take advantage of idle cluster resources, it is important to have a break-glass process in place.
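
As a sketch of such a guardrail (written here for illustration rather than taken from the report or the Gatekeeper policy library), a Gatekeeper ConstraintTemplate and Constraint can reject Pods whose containers do not set a memory request:

    apiVersion: templates.gatekeeper.sh/v1
    kind: ConstraintTemplate
    metadata:
      name: k8srequirememoryrequests
    spec:
      crd:
        spec:
          names:
            kind: K8sRequireMemoryRequests
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package k8srequirememoryrequests

            # flag any container that has no memory request
            violation[{"msg": msg}] {
              container := input.review.object.spec.containers[_]
              not container.resources.requests.memory
              msg := sprintf("container <%v> has no memory request", [container.name])
            }
    ---
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sRequireMemoryRequests
    metadata:
      name: pods-must-request-memory
    spec:
      match:
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]

A break-glass path can then be as simple as listing approved namespaces under the Constraint's match.excludedNamespaces, so teams that understand the trade-offs can still run BestEffort workloads.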

When a team has prepared workloads for abrupt termination and fully understands the consequences of running BestEffort Pods, there is an opportunity to utilize idle cluster resources and lower costs by running BestEffort workloads on Spot VMs. Spot VMs are offered at a discounted price in exchange for allowing the cloud provider to terminate and reclaim the underlying resources on short notice. Platform teams can use a mutation webhook to append a node affinity preference for Spot VMs to all Pods that do not set resource requests. This leaves room on standard nodes for workloads that require greater reliability. Once an organization's policy or automation pipeline has integrated this strategy, standard VMs are used whenever no Spot VMs are provisioned.
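
The result of such a mutation might look like the Pod spec fragment below (a sketch that assumes GKE-style Spot VM node labels; other providers and distributions use different label keys). Because the affinity is only a preference, the Pod still schedules onto standard nodes when no Spot capacity is available:

    apiVersion: v1
    kind: Pod
    metadata:
      name: besteffort-batch                 # hypothetical Pod with no resource requests
    spec:
      affinity:
        nodeAffinity:
          # prefer, but don't require, Spot nodes
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: cloud.google.com/gke-spot   # GKE Spot VM node label
                    operator: In
                    values: ["true"]
      containers:
        - name: batch
          image: busybox:1.36
          command: ["sh", "-c", "echo working; sleep 3600"]   # placeholder workload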

If the platform team doesn't want to allow best practices to be bypassed, the recommendation is to validate and reject non-compliant workloads. For workloads that are tolerant to graceful restarts, an alternative is to either recommend or enforce the adoption of Vertical Pod Autoscaler using the Auto mode.

The Kubernetes LimitRange API can also set defaults for container resources. Because defaults can leave a workload either under- or over-provisioned, they should not replace the recommendation to adopt VPA for rightsizing Pods. The benefit of defaults is that they are applied as soon as the workload is first deployed, while VPA is still in LowConfidence mode. In this mode, VPA does not update Pod resources because it does not yet have enough data to make a confident decision.

Configuring LimitRange in K8s
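
A minimal sketch of such a LimitRange is shown below (the default values are illustrative, not recommendations from the report). In the namespace where it is created, it applies per-container default requests and limits to any container that does not declare its own:

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-container-resources   # hypothetical name
    spec:
      limits:
        - type: Container
          defaultRequest:      # applied when a container sets no requests
            cpu: 100m
            memory: 128Mi
          default:             # applied when a container sets no limits
            cpu: 500m
            memory: 256Mi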

Conclusion

By following the path laid out by the Golden Signals, organizations can improve cost-efficiency without sacrificing reliability. The 2023 Google State of Kubernetes Cost Optimization Report shows what separates the highest performers from those still struggling to turn the complexity of Kubernetes to their advantage.