The Vertical Pod Autoscaler

The VPA is actively developed by SIG Autoscaling, the Kubernetes special interest group responsible for the autoscaling components (HPA, VPA, Cluster Autoscaler). However, unlike the HPA, it is not part of core Kubernetes and must be installed separately.

There are different ways to install the VPA.

Using the GitHub repository:

# clone the repository
git clone https://github.com/kubernetes/autoscaler.git

# change directory
cd autoscaler/vertical-pod-autoscaler

# install the VPA
./hack/vpa-up.sh

Using your cloud provider: For example, if you are using GKE, you can enable the VPA on an existing cluster using the following command:

gcloud container clusters update CLUSTER_NAME \
  --enable-vertical-pod-autoscaling

Be careful not to confuse the VPA with node autoscaling. On DigitalOcean, for example, the following command enables autoscaling of a node pool (the Cluster Autoscaler), not the VPA:

doctl kubernetes cluster node-pool update CLUSTER_NAME NODE_POOL_NAME \
  --auto-scale \
  --min-nodes 1 \
  --max-nodes 10

Or use the web console of your cloud provider: Some managed Kubernetes services let you enable the VPA from their web console.

We will use the first method and install from the GitHub repository. Before doing that, check the compatibility matrix to pick a VPA version that matches your Kubernetes version. Here is the matrix at the time of writing:

VPA version	Kubernetes version
1.5.x	1.28+ (1.33+ when using `InPlaceOrRecreate`)
1.4.x	1.28+ (1.33+ when using `InPlaceOrRecreate` Alpha Feature Gate)
1.3.x	1.28+
1.2.x	1.27+
1.1.x	1.25+
1.0	1.25+
0.14	1.25+
0.13	1.25+
0.12	1.25+
0.11	1.22 - 1.24
0.10	1.22+
0.9	1.16+
0.8	1.13+
0.4 to 0.7	1.11+
0.3.x and lower	1.7+
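
To check a cluster against the matrix, a small helper can compare the server version with a minimum minor release. This is only a sketch: `k8s_at_least` is a hypothetical function name, and the commented usage assumes that `jq` is installed and that kubectl can reach the cluster.

```shell
# Hypothetical helper: succeeds (exit 0) when a Kubernetes version string
# such as "v1.30.2" is at least 1.<min_minor>.
k8s_at_least() {
  version="${1#v}"            # strip a leading "v"
  min_minor="$2"
  major="${version%%.*}"      # text before the first dot
  rest="${version#*.}"        # text after the first dot
  minor="${rest%%.*}"         # text before the next dot
  minor="${minor%%[!0-9]*}"   # drop suffixes such as "+"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge "$min_minor" ]; }
}

# Example usage (assumes kubectl access to a cluster and jq):
# server="$(kubectl version -o json | jq -r '.serverVersion.gitVersion')"
# k8s_at_least "$server" 28 && echo "OK for VPA 1.5.x"
```

The minimum minor version (28 here) comes from the 1.5.x row of the matrix; adjust it for the VPA release you plan to check out.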

Let's clone and install the VPA:

cd /tmp
# clone the repository
git clone https://github.com/kubernetes/autoscaler

# change directory
cd autoscaler/vertical-pod-autoscaler

# Git checkout the version compatible with your Kubernetes cluster
git checkout vertical-pod-autoscaler-1.5.0

# install the VPA using this script
./hack/vpa-up.sh

The script installs the VPA components in the kube-system namespace. You can check their status using the following commands:

kubectl get pods -n kube-system -l app=vpa-admission-controller; echo
kubectl get pods -n kube-system -l app=vpa-recommender; echo
kubectl get pods -n kube-system -l app=vpa-updater; echo
# Or
# kubectl get all -n kube-system | grep vpa

There are three components that participate in the vertical scaling process:

vpa-admission-controller
vpa-recommender
vpa-updater

vpa-admission-controller

The first component is a binary that registers itself as a Mutating Admission Webhook.

It intercepts all Pod creations. Whenever a new Pod is created, this component receives a request from the API server.
It checks whether a matching VPA configuration exists.
If no matching configuration is found, it takes no action.
If a matching configuration is available, it uses the current recommendation to set the resource requests in the Pod specification.
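
One way to see the admission controller's effect is to look at Pod annotations. The sketch below assumes the controller records its changes in a `vpaUpdates` annotation, as the upstream implementation does; verify this for your version. `vpa_show_updates` is a hypothetical helper name.

```shell
# Print each Pod name followed by its "vpaUpdates" annotation (if any).
# Pods that no VPA object matched are listed with an empty second column.
vpa_show_updates() {
  ns="$1"
  kubectl get pods -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.vpaUpdates}{"\n"}{end}'
}
```

For example, `vpa_show_updates stateless-flask` would list the Pods in that namespace once a VPA object targets them.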

vpa-recommender

The second component, the VPA Recommender, is responsible for generating recommendations used by the VPA Admission Controller.

It periodically queries the Metrics API to gather current resource usage information from Pods.
Using this data, the VPA Recommender produces recommendations for optimal CPU and memory requests.
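
Because the recommender depends on the Metrics API, it can help to confirm that the API is being served (typically by metrics-server) before expecting recommendations. A minimal sketch, with `metrics_api_pods` as a hypothetical helper name:

```shell
# Fetch raw Pod metrics for a namespace straight from the Metrics API.
# A failure here usually means no metrics provider is installed.
metrics_api_pods() {
  ns="$1"
  kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/${ns}/pods"
}
```

If `metrics_api_pods stateless-flask` returns a JSON list of Pod metrics, the recommender has data to work with.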

vpa-updater

The third component, the VPA Updater, keeps running Pods in line with the recommendations produced by the VPA Recommender.

It periodically checks the Pods managed by each VPA object. When a Pod's requests differ significantly from the current recommendation and the update mode allows it, the updater evicts the Pod (or, with `InPlaceOrRecreate`, first attempts an in-place resize). The replacement Pod then passes through the VPA Admission Controller, which sets the new requests. This helps keep Pod resources tuned to the application's current demand and usage.
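
In the update modes that recreate Pods, the updater applies a new recommendation by evicting the Pod so that it is recreated with updated requests. Those evictions can be observed through events. The sketch below assumes the updater emits events with the reason `EvictedByVPA`, as the upstream implementation does; `vpa_show_evictions` is a hypothetical helper name.

```shell
# List events in a namespace that record Pods evicted by the VPA Updater.
vpa_show_evictions() {
  ns="$1"
  kubectl get events -n "$ns" --field-selector reason=EvictedByVPA
}
```

An empty result from `vpa_show_evictions stateless-flask` simply means the updater has not evicted anything recently, since events expire after a while.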

Vertical Pod Autoscaler (VPA)

Once the VPA is installed, our cluster can recommend and set resource requests for our Pods. To enable automatic computation of resource requirements for a workload, we need to create a VerticalPodAutoscaler resource in its namespace.

Here's an example of how to do it:

# We assume the Namespace and Deployment
# for stateless-flask are already created
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: stateless-flask-vpa
  namespace: stateless-flask
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: stateless-flask
  updatePolicy:
    updateMode: "Auto"
EOF

Let's check the status of the VPA:

kubectl get vpa -n stateless-flask
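
Once the recommender has collected enough data, the recommendation appears in the status of the VPA object. A minimal sketch for printing only the target values per container; `vpa_target` is a hypothetical helper name, and the output stays empty until a recommendation exists.

```shell
# Print the recommended (target) CPU and memory for each container
# covered by a VPA object. Arguments: namespace, VPA name.
vpa_target() {
  ns="$1"
  name="$2"
  kubectl get vpa "$name" -n "$ns" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{": cpu="}{.target.cpu}{" memory="}{.target.memory}{"\n"}{end}'
}
```

For the resource created above, the call would be `vpa_target stateless-flask stateless-flask-vpa`.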

Let's also configure our Deployment to have low initial resource requests and limits, and 2 replicas:

kubectl apply -f - <<EOF
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: stateless-flask
# Deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-flask
  namespace: stateless-flask
spec:
  replicas: 2
  selector:
    matchLabels:
      app: stateless-flask
  template:
    metadata:
      labels:
        app:

Cloud-Native Microservices With Kubernetes - 2nd Edition

A Comprehensive Guide to Building, Scaling, Deploying, Observing, and Managing Highly-Available Microservices in Kubernetes
