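The Horizontal Pod Autoscaler (HPA) automatically increases or decreases the number of replicas based on metrics such as CPU, memory, or custom/external metrics. It needs a scale target such as a Deployment, a working metrics pipeline, and sensible resource requests on the containers, because utilization is computed as a percentage of the requested amount and is meaningless without them.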
Example:
kubectl autoscale deployment api --cpu-percent=70 --min=2 --max=10
HPA does not solve cold starts, database bottlenecks or queue backlog when the metric is wrong.
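To make the effect of the 70% target and the 2..10 bounds concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: desired = ceil(current replicas * current metric / target metric). The function name and parameters are illustrative assumptions, and the sketch deliberately ignores stabilization windows, missing metrics and not-yet-ready pods, so treat it as an approximation rather than the controller's actual behavior.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA replica calculation (hypothetical helper)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band (0.1 by default) the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the --min / --max bounds.
    return max(min_replicas, min(max_replicas, desired))


# Values mirror the command above: target 70% CPU, min 2, max 10.
print(desired_replicas(4, 90, 70, 2, 10))   # scale up
print(desired_replicas(4, 20, 70, 2, 10))   # scale down, limited by min
print(desired_replicas(8, 140, 70, 2, 10))  # scale up, limited by max
print(desired_replicas(4, 72, 70, 2, 10))   # inside tolerance, unchanged
```

Note that the calculation only looks at the chosen metric: if CPU is not what limits the service, the result will not track real load.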
For queue-based workloads, custom metrics such as queue length are often better than CPU.
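As a rough illustration of scaling on queue length, the sketch below assumes the queue depth is exposed to HPA as an external metric with an average-value target, in which case the desired count is approximately the total backlog divided by the target per pod. The function name, the target of 100 messages per pod and the bounds are hypothetical values chosen for the example, not recommendations.

```python
import math


def replicas_for_backlog(queue_length, target_per_pod, min_replicas, max_replicas):
    """Approximate replica count for an average-value target (hypothetical helper)."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    # Total backlog divided by the backlog each pod is expected to handle.
    desired = math.ceil(queue_length / target_per_pod)
    # Clamp to the configured bounds, as HPA does with minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Assumed target: 100 messages per pod, bounds 2..10.
print(replicas_for_backlog(450, 100, 2, 10))   # backlog drives scale-up
print(replicas_for_backlog(0, 100, 2, 10))     # empty queue falls back to min
print(replicas_for_backlog(5000, 100, 2, 10))  # large backlog is capped at max
```

Unlike CPU, this signal grows with the backlog even when consumers are mostly waiting on I/O, which is why it often tracks demand better for queue workers.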