Autoscaling Microservices in Kubernetes: Vertical Scaling
The Vertical Pod Autoscaler
The VPA is actively developed by SIG Autoscaling, the special interest group in charge of the Kubernetes autoscaling components (HPA, VPA, Cluster Autoscaler). However, unlike the HPA, it is not part of the Kubernetes core components and needs to be installed separately.
There are different ways to install the VPA.
Using the GitHub repository:
# clone the repository
git clone https://github.com/kubernetes/autoscaler.git
# change directory
cd autoscaler/vertical-pod-autoscaler
# install the VPA
./hack/vpa-up.sh
Using your cloud provider: for example, if you are using GKE, you can enable the VPA on an existing cluster with the following command:
gcloud container clusters update CLUSTER_NAME \
--enable-vertical-pod-autoscaling
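If you are using DigitalOcean, note that the following command enables autoscaling of a node pool (the number of nodes, between the given minimum and maximum); it does not install the VPA: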
doctl kubernetes cluster node-pool update CLUSTER_NAME NODE_POOL_NAME \
--auto-scale \
--min-nodes 1 \
--max-nodes 10
Or use the web console of your cloud provider: Most cloud providers have a way to enable the VPA using their web console.
We will use the first method, the GitHub repository. Before doing that, check the compatibility matrix to pick a VPA version that supports your Kubernetes version. Here it is at the time of writing:
| VPA version | Kubernetes version |
|---|---|
| 1.5.x | 1.28+ (1.33+ when using InPlaceOrRecreate) |
| 1.4.x | 1.28+ (1.33+ when using InPlaceOrRecreate Alpha Feature Gate) |
| 1.3.x | 1.28+ |
| 1.2.x | 1.27+ |
| 1.1.x | 1.25+ |
| 1.0 | 1.25+ |
| 0.14 | 1.25+ |
| 0.13 | 1.25+ |
| 0.12 | 1.25+ |
| 0.11 | 1.22 - 1.24 |
| 0.10 | 1.22+ |
| 0.9 | 1.16+ |
| 0.8 | 1.13+ |
| 0.4 to 0.7 | 1.11+ |
| 0.3.x and lower | 1.7+ |
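As a quick sanity check, the matrix above can be encoded in a small shell helper. This is only a sketch under stated assumptions: the function names `vpa_k8s_range` and `vpa_supports` are hypothetical, only the rows from 0.8 upward are covered, Kubernetes versions are compared by minor number only, and the InPlaceOrRecreate caveat is ignored.

```shell
# Sketch: encode the compatibility matrix (rows 0.8 and newer only).
# Prints "MIN MAX" Kubernetes 1.x minor bounds for a VPA version ("-" = no upper bound).
vpa_k8s_range() {
  case "$1" in
    1.5.*|1.4.*|1.3.*)                                   echo "28 -" ;;
    1.2.*)                                               echo "27 -" ;;
    1.1.*|1.0|1.0.*|0.14|0.14.*|0.13|0.13.*|0.12|0.12.*) echo "25 -" ;;
    0.11|0.11.*)                                         echo "22 24" ;;
    0.10|0.10.*)                                         echo "22 -" ;;
    0.9|0.9.*)                                           echo "16 -" ;;
    0.8|0.8.*)                                           echo "13 -" ;;
    *)                                                   return 1 ;;
  esac
}

# Returns 0 if VPA version $1 supports Kubernetes 1.$2, 1 if not, 2 if the version is unknown.
vpa_supports() {
  local range min max
  range=$(vpa_k8s_range "$1") || return 2
  min=${range% *}
  max=${range#* }
  [ "$2" -ge "$min" ] || return 1
  [ "$max" = "-" ] || [ "$2" -le "$max" ]
}

# Example: is VPA 1.5.0 usable on Kubernetes 1.30?
if vpa_supports 1.5.0 30; then
  echo "VPA 1.5.0 supports Kubernetes 1.30"
fi
```

The cluster's minor version can be read from `kubectl version` and passed as the second argument.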
Let's clone and install the VPA:
cd /tmp
# clone the repository
git clone https://github.com/kubernetes/autoscaler
# change directory
cd autoscaler/vertical-pod-autoscaler
# Git checkout the version compatible with your Kubernetes cluster
git checkout vertical-pod-autoscaler-1.5.0
# install the VPA using this script
./hack/vpa-up.sh
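Before running the script above, it may help to confirm that the tools it relies on are available. The following is a minimal sketch: `require_tools` is a hypothetical helper name, and the tool list assumes the installation uses `git`, `kubectl`, and `openssl` (which the VPA scripts use to generate the webhook certificates).

```shell
# Sketch: report which of the given command-line tools are missing from PATH.
# require_tools is a hypothetical helper name.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the tools the installation steps rely on.
require_tools git kubectl openssl || echo "install the missing tools before running vpa-up.sh"
```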
The VPA is installed in the kube-system namespace. You can check the status of its components using the following commands:
kubectl get pods -n kube-system -l app=vpa-admission-controller; echo
kubectl get pods -n kube-system -l app=vpa-recommender; echo
kubectl get pods -n kube-system -l app=vpa-updater; echo
# Or
# kubectl get all -n kube-system | grep vpa
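Listing Pods shows their state at one moment. To block until all three components are ready, a loop over `kubectl rollout status` is one option. This is a sketch that assumes the Deployments carry the same names as the labels used above; `wait_for_vpa` and the `KUBECTL` override variable are hypothetical names introduced for illustration.

```shell
# Sketch: wait for each VPA Deployment to finish rolling out.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
wait_for_vpa() {
  local kubectl_cmd="${KUBECTL:-kubectl}" component
  for component in vpa-admission-controller vpa-recommender vpa-updater; do
    "$kubectl_cmd" rollout status "deployment/${component}" \
      -n kube-system --timeout=120s || return 1
  done
}

# Dry run: print the commands instead of contacting a cluster.
KUBECTL=echo wait_for_vpa
```

Calling `wait_for_vpa` without the override runs the real commands against the current kubectl context.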
There are three components that participate in the vertical scaling process:
- vpa-admission-controller
- vpa-recommender
- vpa-updater
vpa-admission-controller
The first component is a binary that registers itself as a Mutating Admission Webhook.
It intercepts all Pod creations. Whenever a new Pod is created, this component receives a request from the API server.
It checks whether a matching VPA configuration exists.
If no matching configuration is found, it takes no action.
If a matching configuration is available, it uses the current recommendation to set the resource requests in the Pod specification.
vpa-recommender
The second component, the VPA Recommender, is responsible for generating recommendations used by the VPA Admission Controller.
- It periodically queries the Metrics API to gather current resource usage information from Pods.
- Using this data, the VPA Recommender produces recommendations for optimal CPU and memory requests.
vpa-updater
The third component, the VPA Updater, applies the recommendations produced by the VPA Recommender to running Pods.
- It periodically polls the Kubernetes API server to detect changes in VPA configurations or Pod statuses. By default, when a Pod's requests drift too far from the current recommendation, it evicts the Pod so that it is recreated with updated requests set by the VPA Admission Controller. This keeps Pod resources tuned to the application's current demand and usage.
Vertical Pod Autoscaler (VPA)
Once the VPA is installed, our cluster can recommend and set resource requests for our Pods. To enable automatic computation of resource requirements, we need to create a VerticalPodAutoscaler resource in our namespace.
Here's an example of how to do it:
# We assume the Namespace and Deployment
# for stateless-flask are already created
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stateless-flask-vpa
  namespace: stateless-flask
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: stateless-flask
  updatePolicy:
    updateMode: "Auto"
EOF
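The manifest above uses the "Auto" update mode, which lets the VPA change running Pods. A more cautious starting point is the "Off" mode, in which the VPA only computes recommendations. The sketch below prints the same manifest with a configurable mode; `render_vpa` is a hypothetical helper, and the `<deployment>-vpa` naming simply follows the example above.

```shell
# Sketch: print a VerticalPodAutoscaler manifest for a Deployment.
# render_vpa is a hypothetical helper; arguments: namespace, deployment, update mode.
render_vpa() {
  local namespace="$1" deployment="$2" mode="${3:-Off}"
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ${deployment}-vpa
  namespace: ${namespace}
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: ${deployment}
  updatePolicy:
    updateMode: "${mode}"
EOF
}

# Recommendation-only mode: the VPA computes targets but does not touch Pods.
render_vpa stateless-flask stateless-flask Off
# To apply it: render_vpa stateless-flask stateless-flask Off | kubectl apply -f -
```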
Let's check the status of the VPA:
kubectl get vpa -n stateless-flask
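The table printed by that command is compact. To read the per-container targets directly, the recommendation can be pulled from the VPA status with a JSONPath query. This is a sketch: `show_vpa_targets` and the `KUBECTL` override are hypothetical names, and the status may be empty until the Recommender has gathered some usage data.

```shell
# Sketch: print the recommended (target) requests per container for a VPA.
# show_vpa_targets is a hypothetical helper; KUBECTL=echo previews the command.
show_vpa_targets() {
  local kubectl_cmd="${KUBECTL:-kubectl}" namespace="$1" name="$2"
  "$kubectl_cmd" get vpa "$name" -n "$namespace" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{": cpu="}{.target.cpu}{" memory="}{.target.memory}{"\n"}{end}'
}

# Dry run: print the kubectl command instead of contacting a cluster.
KUBECTL=echo show_vpa_targets stateless-flask stateless-flask-vpa
```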
Let's also configure our Deployment to have low initial resource requests and limits, and 2 replicas:
kubectl apply -f - <<EOF
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: stateless-flask
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-flask
  namespace: stateless-flask
spec:
  replicas: 2
  selector:
    matchLabels:
      app: stateless-flask
  template:
    metadata:
      labels:
        app: stateless-flask