Cluster Autoscaler (CA)
Introduction
The Cluster Autoscaler (CA) automatically adjusts the number of nodes in a cluster to meet workload demand.
When creating a Pod, you can specify the amount of CPU, memory, GPU, and other resources that each container requests (its Request). The Kubernetes scheduler uses these Requests to decide which node the Pod is placed on. If no node in the cluster has enough free capacity, the Pod cannot be scheduled and stays in the Pending state until a new node joins the cluster or existing Pods are deleted and free up capacity.
The CA component looks for Pods that cannot be scheduled and traverses the scaling groups to check whether a node created from a scaling group's template would satisfy them. If adding such a node would allow the Pod to be scheduled, CA scales up the cluster.
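The scale-up check described above can be sketched as follows. This is a minimal illustration, not CA's actual implementation: the `ScalingGroup` class, the `cpu_m` and `memory_mi` keys, and the first-match selection are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroup:
    """Hypothetical view of a scaling group: its node template and size limits."""
    name: str
    template: dict  # resources one new node would offer, e.g. {"cpu_m": 4000}
    current: int    # nodes currently in the group
    max_size: int   # upper bound of the scaling range


def fits(requests: dict, capacity: dict) -> bool:
    """True if every requested resource is covered by the given capacity."""
    return all(capacity.get(res, 0) >= amount for res, amount in requests.items())


def pick_group_for(pod_requests: dict, groups: list) -> Optional[str]:
    """Traverse the groups and return the first one whose template fits the Pod."""
    for group in groups:
        if group.current < group.max_size and fits(pod_requests, group.template):
            return group.name
    return None  # no group can help, so the Pod stays Pending


groups = [
    ScalingGroup("cpu-group", {"cpu_m": 4000, "memory_mi": 8192}, current=2, max_size=5),
    ScalingGroup("gpu-group", {"cpu_m": 8000, "memory_mi": 32768, "nvidia.com/gpu": 1},
                 current=0, max_size=3),
]
gpu_pod = {"cpu_m": 500, "nvidia.com/gpu": 1}
print(pick_group_for(gpu_pod, groups))
```

A Pod that requests `nvidia.com/gpu` is rejected by the CPU-only template because that template offers zero GPUs, so only the GPU group qualifies.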
CA also scales the cluster down. Scale-down is triggered when the ratio of requested resources (Request) to capacity on a node falls below the scale-down threshold. The node is not removed immediately; CA first waits for a period of time (10 minutes by default), which can be changed with the --scale-down-unneeded-time parameter.
Unlike HPA, CA is not built into Kubernetes; it runs as a Deployment in the cluster. UK8S already supports CA, and you can configure it in the UK8S management interface.
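The waiting period can be pictured as a per-node timer. The sketch below is an illustration under assumptions: the `UnneededTracker` class is hypothetical, and it assumes the timer restarts whenever a node rises back above the threshold.

```python
from datetime import datetime, timedelta

# Default waiting period from the text (--scale-down-unneeded-time).
SCALE_DOWN_UNNEEDED_TIME = timedelta(minutes=10)


class UnneededTracker:
    """Hypothetical helper: remembers since when each node has been unneeded."""

    def __init__(self, unneeded_time: timedelta = SCALE_DOWN_UNNEEDED_TIME):
        self.unneeded_time = unneeded_time
        self._since = {}  # node name -> time it was first seen below the threshold

    def observe(self, node: str, below_threshold: bool, now: datetime) -> bool:
        """Record one scan result; return True once the node may be removed."""
        if not below_threshold:
            self._since.pop(node, None)  # assumption: the timer starts over
            return False
        first_seen = self._since.setdefault(node, now)
        return now - first_seen >= self.unneeded_time


tracker = UnneededTracker()
start = datetime(2024, 1, 1, 12, 0, 0)
for minutes in (0, 5, 10):
    ready = tracker.observe("node-1", True, start + timedelta(minutes=minutes))
    print(minutes, ready)
```

In this sketch a node only becomes removable after it has stayed below the threshold for the whole period without interruption.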
Working Principle
CA scales up when there are Pods that cannot be scheduled because the cluster lacks resources such as CPU, memory, or GPU. Taking GPU as an example: when a Pod requests the GPU resource nvidia.com/gpu (refer to the GPU node usage document) but stays in the Pending state because the cluster has no GPU nodes, CA automatically adds nodes from a scaling group whose template is configured with a GPU model.
CA scales down a node when its resource request ratio (based on Request) stays below the scale-down threshold (for example, 50%) for a certain period of time (10 minutes by default) and all Pods on the node can be scheduled onto other nodes.
The second condition, that all Pods on the node can be scheduled onto other nodes, deserves attention. Many users who configure CA ask why scale-down is not triggered even though a node's requested resources are below the threshold. The reason is simple: if a standalone Pod (one not managed by any controller) is running on the node, it cannot be rescheduled elsewhere, so to keep the workload running CA does not scale down that node.
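To make the two scale-down conditions concrete, here is a minimal sketch that lists the reasons a node cannot be removed. The data shapes (`owner_kind`, `cpu_m`, `memory_mi`), the first-fit placement, and taking the higher of the CPU and memory ratios are assumptions for illustration; the real component performs a full scheduling simulation.

```python
def node_request_ratio(pods: list, allocatable: dict) -> float:
    """Requested / allocatable, taking the higher of CPU and memory (assumption)."""
    cpu = sum(p["cpu_m"] for p in pods) / allocatable["cpu_m"]
    mem = sum(p["memory_mi"] for p in pods) / allocatable["memory_mi"]
    return max(cpu, mem)


def scale_down_blockers(pods, allocatable, other_free, threshold=0.5):
    """Return the reasons the node cannot be removed; an empty list means it can."""
    reasons = []
    ratio = node_request_ratio(pods, allocatable)
    if ratio >= threshold:
        reasons.append(f"request ratio {ratio:.2f} is not below {threshold}")
    free = [dict(f) for f in other_free]  # copy so the caller's data is untouched
    for pod in pods:
        if pod["owner_kind"] is None:
            # A standalone Pod would not be recreated by any controller.
            reasons.append(f"pod {pod['name']} is not managed by a controller")
            continue
        target = next((f for f in free
                       if f["cpu_m"] >= pod["cpu_m"]
                       and f["memory_mi"] >= pod["memory_mi"]), None)
        if target is None:
            reasons.append(f"pod {pod['name']} does not fit on any other node")
        else:
            target["cpu_m"] -= pod["cpu_m"]  # reserve the space for this Pod
            target["memory_mi"] -= pod["memory_mi"]
    return reasons


allocatable = {"cpu_m": 4000, "memory_mi": 8192}
pods = [
    {"name": "web-1", "owner_kind": "ReplicaSet", "cpu_m": 500, "memory_mi": 1024},
    {"name": "debug", "owner_kind": None, "cpu_m": 100, "memory_mi": 128},
]
other_free = [{"cpu_m": 2000, "memory_mi": 4096}]
print(scale_down_blockers(pods, allocatable, other_free))
```

Here the node is far below the 50% threshold, yet the standalone `debug` Pod alone is enough to block its removal.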
Using Cluster Scaling in UK8S
1. Create Scaling Configuration
2. Fill in Configuration Parameters
Usually, the default values are sufficient.
3. Create Scaling Group
Important: the scaling group defines the configuration of the nodes that are added when a scale-up is triggered. The scaling range mainly serves to prevent unbounded expansion caused by, for example, a DDoS attack.
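The effect of the scaling range can be shown with a small sketch. The function name and parameters are hypothetical; it only illustrates that whatever size the workload would call for, the group size stays within the configured minimum and maximum.

```python
def clamp_group_size(current: int, wanted_delta: int, min_size: int, max_size: int) -> int:
    """Apply a requested change in node count, bounded by the scaling range."""
    return max(min_size, min(max_size, current + wanted_delta))


# A traffic spike asks for 50 more nodes, but the range caps the group at 10.
print(clamp_group_size(current=3, wanted_delta=50, min_size=1, max_size=10))
```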
4. Turn on Cluster Scaling
After the scaling group is created, you need to enable it. Once it is enabled, your UK8S cluster will contain a Cluster-Autoscaler Deployment. If you delete this Deployment manually, cluster scaling will stop working. To recover, first disable cluster scaling on the cluster scaling page, then enable it again so that the Deployment is recreated.
CA Parameter Description
CA has many command-line parameters that adjust its scaling behavior. You can change them by editing the args of the CA Deployment.
Below are some CA parameters and descriptions:
Parameter | Type | Default Value | Explanation |
---|---|---|---|
scale-down-delay-after-add | Duration | 10min | How long after a scale-up before scale-down evaluation resumes. |
scale-down-delay-after-delete | Duration | Same as scan-interval | How long after a node deletion before scale-down evaluation resumes. |
scale-down-unneeded-time | Duration | 10min | How long a node must be marked as unneeded before it is scaled down. |
node-deletion-delay-timeout | Duration | 2min | How long CA waits for a node deletion to complete before timing out. |
scan-interval | Duration | 10s | Interval between scans that evaluate the cluster for scaling. |
max-nodes-total | int | 0 | Maximum total number of nodes in the cluster (0 means no limit). |
cores-total | String | 0:320000 | Minimum and maximum total CPU cores in the cluster, in the form min:max. |
memory-total | String | 0:6400000 | Minimum and maximum total memory in the cluster, in gigabytes, in the form min:max. |
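As an illustration of changing these parameters, the sketch below rewrites an args list such as the one found in the CA Deployment's container spec. The helper `set_ca_args` is hypothetical, and it assumes the flags are written in the `--name=value` form; how the list is read from and written back to the Deployment is left out.

```python
def set_ca_args(container_args: list, overrides: dict) -> list:
    """Return a new args list with the given flags set, keeping other entries in order."""
    result = []
    seen = set()
    for arg in container_args:
        name = arg.lstrip("-").split("=", 1)[0]
        if name in overrides:
            result.append(f"--{name}={overrides[name]}")  # replace the existing value
            seen.add(name)
        else:
            result.append(arg)
    for name, value in overrides.items():
        if name not in seen:
            result.append(f"--{name}={value}")  # flag was absent, so append it
    return result


current_args = ["--v=4", "--scale-down-unneeded-time=10m"]
new_args = set_ca_args(current_args, {
    "scale-down-unneeded-time": "5m",  # Go-style duration value
    "max-nodes-total": "20",
})
print(new_args)
```

Returning a new list instead of editing in place keeps the original args available for comparison before the change is applied.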