Achieving Second-level Elastic Scaling with tke-autoscaling-placeholder
Operation Scenario
If a TKE cluster is configured with node pools and elastic scaling is enabled, node expansion is triggered automatically when node resources are insufficient (machines are purchased and joined to the cluster automatically). This expansion process takes some time to complete, and in scenarios with sudden traffic spikes it may be too slow, affecting normal business operations. tke-autoscaling-placeholder can be used on TKE to achieve second-level scaling for such traffic-spike scenarios. This article describes how to use tke-autoscaling-placeholder to achieve second-level elastic scaling.
Implementation Principle
tke-autoscaling-placeholder uses low-priority Pods (pause containers that declare resource requests but consume almost nothing) to pre-allocate resources, reserving part of the cluster's capacity as a buffer for high-priority businesses that may experience traffic spikes. When Pods need to scale out, the high-priority Pods quickly preempt the resources held by the low-priority Pods and get scheduled, which causes the low-priority tke-autoscaling-placeholder Pods to enter the Pending state. If node pools are configured and elastic scaling is enabled, node expansion is then triggered. Thanks to this resource buffering mechanism, even though node expansion is slow, some Pods can still be scheduled quickly, achieving second-level scaling. You can adjust the requests or replica count of tke-autoscaling-placeholder to tune the amount of reserved buffer resources to actual needs.
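For intuition, the chart essentially deploys something like the following. This is a simplified sketch assembled from the default parameters listed later in this article, not the chart's actual rendered manifest. With the defaults (10 replicas × 300m CPU / 600Mi memory), the reserved buffer is roughly 3 CPU cores and 6 GiB of memory:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tke-autoscaling-placeholder
spec:
  replicas: 10                           # buffer size = replicas x per-replica requests
  selector:
    matchLabels:
      app: tke-autoscaling-placeholder
  template:
    metadata:
      labels:
        app: tke-autoscaling-placeholder
    spec:
      priorityClassName: low-priority    # low priority, so business Pods can preempt it
      containers:
      - name: placeholder
        image: ccr.ccs.tencentyun.com/tke-market/pause:latest
        resources:
          requests:
            cpu: 300m                    # 10 replicas x 300m = ~3 cores reserved
            memory: 600Mi                # 10 replicas x 600Mi = ~6 GiB reserved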
Usage Limitations
Using the tke-autoscaling-placeholder application requires cluster version 1.18 or above.
Operation Steps
Installing tke-autoscaling-placeholder
- Log in to the container service console and select Application Market in the left navigation bar.
- On the Application Market page, enter the keyword tke-autoscaling-placeholder to search for and find the application.
- Click the application, and in the application details, click Create Application in the basic information module.
- On the Create Application page, configure and create the application as needed.
Configuration description:
- Name: Enter the application name. It can contain up to 63 characters, may only include lowercase letters, digits, and the hyphen ("-"), and must start with a lowercase letter and end with a digit or lowercase letter.
- Region: Select the region to deploy to.
- Cluster Type: Select Standard Cluster.
- Cluster: Select the ID of the target cluster.
- Namespace: Select the namespace to deploy to.
- Chart Version: Select the chart version to deploy.
- Parameters: The most important parameters are replicaCount and resources.requests, which set the replica count of tke-autoscaling-placeholder and the resources occupied by each replica. Together they determine the total buffer size, which can be estimated from the additional resources needed during traffic spikes. The complete parameter list for tke-autoscaling-placeholder is as follows (a sample values file follows the table):
| Parameter Name | Description | Default Value |
| --- | --- | --- |
| replicaCount | Placeholder replica count | 10 |
| image | Placeholder image address | ccr.ccs.tencentyun.com/tke-market/pause:latest |
| resources.requests.cpu | CPU resources occupied by a single placeholder replica | 300m |
| resources.requests.memory | Memory occupied by a single placeholder replica | 600Mi |
| lowPriorityClass.create | Whether to create the low-priority PriorityClass (referenced by the placeholder) | true |
| lowPriorityClass.name | Name of the low-priority PriorityClass | low-priority |
| nodeSelector | Schedule the placeholder only to nodes with the specified labels | |
| tolerations | Taints the placeholder should tolerate | [] |
| affinity | Affinity configuration for the placeholder | |
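For example, to reserve a larger buffer you could override the defaults with a values file like the following when creating the application. This is an illustrative sketch using only the parameter names from the table above; size the numbers to your own spike headroom:
replicaCount: 20
image: ccr.ccs.tencentyun.com/tke-market/pause:latest
resources:
  requests:
    cpu: 500m    # 20 replicas x 500m = 10 cores of buffer
    memory: 1Gi  # 20 replicas x 1Gi = 20 GiB of buffer
lowPriorityClass:
  create: true
  name: low-priority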
- Click Create to deploy the tke-autoscaling-placeholder application.
- Run the following command to check whether the resource-occupying Pods started successfully:
kubectl get pod -n default
Example:
$ kubectl get pod -n default
NAME READY STATUS RESTARTS AGE
tke-autoscaling-placeholder-b58fd9d5d-2p6ww 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-55jw7 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-6rq9r 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-7c95t 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-bfg8r 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-cfqt6 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-gmfmr 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-grwlh 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-ph7vl 1/1 Running 0 8s
tke-autoscaling-placeholder-b58fd9d5d-xmrmv 1/1 Running 0 8s
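You can also confirm that the low-priority PriorityClass exists and that the placeholder replicas hold the expected requests (output varies by cluster):
$ kubectl get priorityclass low-priority
$ kubectl describe deployment tke-autoscaling-placeholder -n default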
Deploying High-Priority Pods
tke-autoscaling-placeholder has low priority by default. Business Pods can specify a high-priority PriorityClass to enable resource preemption and rapid scaling. If you haven't created such a PriorityClass yet, you can refer to the following example:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "high priority class"
Set priorityClassName to the high-priority PriorityClass in the business Pod spec. Example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      priorityClassName: high-priority # Specify the high-priority PriorityClass here
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: 400m
            memory: 800Mi
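Assuming the Deployment is saved as nginx.yaml (again a hypothetical filename), apply it and watch the preemption happen:
$ kubectl apply -f nginx.yaml
$ kubectl get pod -n default -w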
When cluster node resources are insufficient, the newly scaled high-priority business Pods preempt the resources of the low-priority tke-autoscaling-placeholder Pods and are scheduled; the tke-autoscaling-placeholder Pods then enter the Pending state. Example:
$ kubectl get pod -n default
NAME READY STATUS RESTARTS AGE
nginx-bf79bbc8b-5kxcw 1/1 Running 0 23s
nginx-bf79bbc8b-5xhbx 1/1 Running 0 23s
nginx-bf79bbc8b-bmzff 1/1 Running 0 23s
nginx-bf79bbc8b-l2vht 1/1 Running 0 23s
nginx-bf79bbc8b-q84jq 1/1 Running 0 23s
nginx-bf79bbc8b-tq2sx 1/1 Running 0 23s
nginx-bf79bbc8b-tqgxg 1/1 Running 0 23s
nginx-bf79bbc8b-wz5w5 1/1 Running 0 23s
tke-autoscaling-placeholder-b58fd9d5d-255r8 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-4vt8r 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-55jw7 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-7c95t 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-ph7vl 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-qjrsx 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-t5qdm 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-tgvmw 0/1 Pending 0 23s
tke-autoscaling-placeholder-b58fd9d5d-xmrmv 1/1 Running 0 94m
tke-autoscaling-placeholder-b58fd9d5d-zxtwp 0/1 Pending 0 23s
If node pool elastic scaling is configured, node expansion is also triggered. Although node expansion takes time, the buffer resources have already been handed over to the business Pods, so the business can scale out quickly and normal operations are not affected.
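In practice, the scale-out that triggers this preemption is often driven by a HorizontalPodAutoscaler. Below is a minimal sketch targeting the nginx Deployment above, assuming a cluster that supports the autoscaling/v2 API (v1.23+; older clusters can use autoscaling/v2beta2) with metrics-server installed:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx          # the high-priority Deployment from the example above
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale out when average CPU utilization exceeds 70%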
Summary
This article introduced tke-autoscaling-placeholder, a tool for achieving second-level scaling. It makes clever use of Pod priority and preemption by pre-deploying low-priority "placeholder" Pods whose requests reserve a buffer of resources. When traffic spikes and cluster resources are insufficient, high-priority business Pods preempt the resources held by these low-priority Pods while node expansion is triggered in the background, achieving second-level scaling even when resources are constrained, without affecting normal business operations.