KubeCon is the annual Kubernetes conference where industry practitioners and academic researchers share state-of-the-art techniques for cloud-native software development, operations, and deployment.
Accurately estimating CPU & memory requirements for workloads is hard! As a result, users commonly over-provision pods, which leads to under-utilized clusters and the need to scale up cluster size to accommodate workloads. The recently added in-place pod resize feature makes it possible to right-size over-provisioned pods without restarting them. This talk will illustrate how the cluster autoscaler currently handles pods that are pending due to insufficient resources, then introduce a change to the autoscaling workflow that right-sizes over-provisioned pods, and show how this change can help schedule pending pods more quickly while lowering costs & carbon footprint. Haoran will talk about the latest research that leverages machine learning and reinforcement learning techniques to achieve multi-dimensional autoscaling, and discuss how this work can help proactively scale workloads to achieve optimal cluster utilization while meeting application SLOs by provisioning pods more precisely.
Haoran’s line of work on machine learning for resource management has been published at multiple conferences: FIRM (OSDI 2020), SIMPPO (SoCC 2022), and AWARE (ATC 2023).
Link to the talk: https://sched.co/1R2nS