Achieving Second-level Elastic Scaling with tke-autoscaling-placeholder

Operation Scenario

If a TKE cluster has node pools configured with elastic scaling enabled, node expansion is triggered automatically when node resources are insufficient (machines are purchased automatically and joined to the cluster). This expansion takes time to complete, so in scenarios with sudden traffic spikes it may be too slow and affect normal business operations. tke-autoscaling-placeholder can be used on TKE to achieve second-level scaling for such scenarios. This article describes how to use tke-autoscaling-placeholder to achieve second-level elastic scaling.

Implementation Principle

tke-autoscaling-placeholder uses low-priority Pods (pause containers with resource requests but very low actual consumption) to pre-allocate resources, reserving them as a buffer for high-priority businesses that may experience traffic spikes. When Pods need to scale out, the high-priority Pods quickly preempt the resources of the low-priority Pods and get scheduled, which pushes the low-priority tke-autoscaling-placeholder Pods into the Pending state. If node pools are configured with elastic scaling enabled, this in turn triggers node expansion. Thanks to this buffer, some Pods can be scheduled quickly even while node expansion is still in progress, achieving second-level scaling. You can adjust the requests or replica count of tke-autoscaling-placeholder to tune the amount of reserved buffer resources.
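
Preemption here relies on standard Kubernetes PriorityClass semantics. As a minimal sketch, the low-priority class referenced by the placeholder Pods looks roughly like the following; the name low-priority matches the chart's default parameter, while the value shown is an assumption for illustration (it only needs to be lower than that of your business Pods):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority # matches the chart's lowPriorityClass.name default
value: -1000000 # assumed value; it must simply be lower than business Pods' priority
globalDefault: false
description: "low priority class for placeholder pods"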

Usage Limitations

The tke-autoscaling-placeholder application requires a cluster version of 1.18 or above.

Operation Steps

Installing tke-autoscaling-placeholder

  1. Log in to the container service console and select Application Market in the left navigation bar.
  2. On the Application Market page, enter the keyword tke-autoscaling-placeholder to search for and locate the application.
  3. Click the application, and in the basic information module of the application details page, click Create Application.
  4. On the Create Application page, configure and create the application as needed. Configuration description:
  • Name: Enter the application name. It may contain up to 63 characters, can only include lowercase letters, numbers, and the separator "-", must start with a lowercase letter, and must end with a number or lowercase letter.
  • Region: Select the region to deploy to.
  • Cluster Type: Select Standard Cluster.
  • Cluster: Select the ID of the cluster to deploy to.
  • Namespace: Select the namespace to deploy to.
  • Chart Version: Select the chart version to deploy.
  • Parameters: The most important parameters are replicaCount and resources.requests, which represent the replica count of tke-autoscaling-placeholder and the resources occupied by each replica. Together they determine the total buffer size and can be estimated from the extra resources needed during traffic spikes (see the sizing sketch after these steps). Complete parameter description for tke-autoscaling-placeholder:
Parameter Name            | Description                                                                       | Default Value
------------------------- | --------------------------------------------------------------------------------- | ----------------------------------------------
replicaCount              | placeholder replica count                                                         | 10
image                     | placeholder image address                                                         | ccr.ccs.tencentyun.com/tke-market/pause:latest
resources.requests.cpu    | CPU resources occupied by a single placeholder replica                            | 300m
resources.requests.memory | memory occupied by a single placeholder replica                                   | 600Mi
lowPriorityClass.create   | whether to create the low-priority PriorityClass (referenced by the placeholder)  | true
lowPriorityClass.name     | name of the low-priority PriorityClass                                            | low-priority
nodeSelector              | schedule the placeholder onto nodes with specific labels                          | {}
tolerations               | taints the placeholder should tolerate                                            | []
affinity                  | affinity configuration for the placeholder                                        | {}
  5. Click Create to deploy the tke-autoscaling-placeholder application.

  6. Execute the following command to check whether the resource-occupying Pods started successfully.

    kubectl get pod -n default

    Example:

    $ kubectl get pod -n default
    NAME READY STATUS RESTARTS AGE
    tke-autoscaling-placeholder-b58fd9d5d-2p6ww 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-55jw7 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-6rq9r 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-7c95t 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-bfg8r 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-cfqt6 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-gmfmr 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-grwlh 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-ph7vl 1/1 Running 0 8s
    tke-autoscaling-placeholder-b58fd9d5d-xmrmv 1/1 Running 0 8s
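
Sizing the buffer is simple arithmetic: reserved resources = replicaCount × per-replica requests. With the defaults above, that is 10 × 300m = 3 CPU cores and 10 × 600Mi ≈ 5.9 GiB of memory. If you deploy the chart with Helm directly rather than through the console, the same parameters go into a values file; the following is a sketch assuming the default values listed in the parameter table:

# values.yaml (sketch): reserves 10 x (300m CPU + 600Mi memory) = 3 CPU / ~5.9GiB buffer
replicaCount: 10
resources:
  requests:
    cpu: 300m
    memory: 600Mi
lowPriorityClass:
  create: true # create the low-priority PriorityClass referenced by the placeholder
  name: low-priority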

Deploying High-Priority Pods

tke-autoscaling-placeholder Pods run at low priority by default. Business Pods should specify a high-priority PriorityClass so they can preempt resources and scale quickly. If you haven't created one yet, you can refer to the following example:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "high priority class"
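
Save the manifest to a file (priorityclass.yaml below is a hypothetical name) and create it:

kubectl apply -f priorityclass.yaml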

In your business Pods, set priorityClassName to the high-priority PriorityClass. Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      priorityClassName: high-priority # Specify the high-priority PriorityClass here
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: 400m
            memory: 800Mi
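
Apply the Deployment and watch scheduling in real time (nginx-deployment.yaml is a hypothetical file name):

kubectl apply -f nginx-deployment.yaml
kubectl get pod -n default -w # business Pods start while placeholder Pods go Pending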

When cluster node resources are insufficient, the scaled-out high-priority business Pods preempt resources from the low-priority tke-autoscaling-placeholder Pods and get scheduled; the preempted tke-autoscaling-placeholder Pods then show a Pending status. Example:

$ kubectl get pod -n default
NAME READY STATUS RESTARTS AGE
nginx-bf79bbc8b-5kxcw 1/1 Running 0 23s
nginx-bf79bbc8b-5xhbx 1/1 Running 0 23s
nginx-bf79bbc8b-bmzff 1/1 Running 0 23s
nginx-bf79bbc8b-l2vht 1/1 Running 0 23s
nginx-bf79bbc8b-q84jq 1/1 Running 0 23s
nginx-bf79bbc8b-tq2sx 1/1 Running 0 23s
nginx-bf79bbc8b-tqgxg 1/1 Running 0 23s
nginx-bf79bbc8b-wz5w5 1/1 Running 0 23s
tke-autoscaling-placeholder-b58fd9d5d-255r8 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-4vt8r 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-55jw7 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-7c95t 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-ph7vl 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-qjrsx 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-t5qdm 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-tgvmw 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-xmrmv 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-zxtwp 0/1 Pending 0 23s

If elastic scaling is configured for the node pool, node expansion is triggered at this point. Although node expansion is slow, the buffer resources already handed over to the business Pods allow the business to scale out quickly, so normal business operations are not affected.
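
To verify that preemption and node expansion behaved as expected, you can inspect recent events in the namespace; this is a generic Kubernetes check, not specific to this chart:

kubectl get events -n default --sort-by=.lastTimestamp | tail -n 20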

Summary

This article introduced tke-autoscaling-placeholder, a tool for achieving second-level scaling. It cleverly uses Kubernetes Pod priority and preemption: low-priority "empty Pods" are deployed in advance to hold resources as a buffer. When traffic spikes and cluster resources are insufficient, business Pods preempt the resources of these "empty Pods" and are scheduled immediately, while node expansion is triggered in the background, achieving second-level scaling without affecting normal business operations even when resources are tight.