The true power of Kubernetes comes with autoscaling implemented by the HPA, which is a dedicated controller backed by the HorizontalPodAutoscaler API object. At a high level, the goal of the HPA is to automatically scale the number of replicas in a Deployment or StatefulSet depending on the current CPU utilization or other custom metrics (including multiple metrics at once). The details of the algorithm that determines the target number of replicas based on metric values can be found at https://kubernetes.io/docs/tasks/run-application/horizontal-Pod-autoscale/#algorithm-details. HPAs are highly configurable and in this book, we will cover a standard scenario for when we would like to autoscale based on target CPU usage.
Our voting application ...