Setting Up Cluster Autoscaler in Minikube for Development Testing

Introduction

Kubernetes Cluster Autoscaler automatically adjusts the number of nodes in your cluster based on pending workloads. While this typically requires a cloud provider in production, Minikube provides a way to simulate autoscaling for development and testing.

In this guide, we’ll configure Cluster Autoscaler on Minikube, simulate scaling behaviors, and observe how it increases node capacity when needed.

The Problem: Autoscaling in Development Environments

In production, Kubernetes clusters dynamically scale nodes to handle workload spikes.
In local development, Minikube runs a single node by default, making it challenging to test Cluster Autoscaler.
Solution: Use Minikube’s multi-node feature and the Cluster Autoscaler to simulate real-world autoscaling scenarios.

Step 1: Start Minikube with Multiple Nodes

Since Minikube can't provision new machines on demand, we start it with multiple nodes up front so the Cluster Autoscaler has capacity to work with.

minikube start --nodes 2

Verify the nodes are running:

kubectl get nodes
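If you want to grow the cluster later without restarting, Minikube also lets you add worker nodes by hand, which is useful for simulating a scale-up event manually:

```shell
# Add a worker node to the running cluster (Minikube assigns the name
# automatically, e.g. minikube-m03), then confirm it joined.
minikube node add
kubectl get nodes
```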

Step 2: Install Metrics Server

Cluster Autoscaler relies on resource metrics to make scaling decisions. Install the Metrics Server:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Verify that the Metrics Server is running:

kubectl get deployment metrics-server -n kube-system
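Once the deployment reports ready, a quick way to confirm metrics are actually flowing is `kubectl top`. (If the components.yaml manifest has kubelet TLS trouble inside Minikube, `minikube addons enable metrics-server` is a common alternative that configures this for you.)

```shell
# Resource metrics may take a minute or two to appear after installation.
kubectl top nodes
kubectl top pods -A
```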

Step 3: Deploy Cluster Autoscaler

Now, deploy the Cluster Autoscaler to monitor and scale nodes.

Cluster Autoscaler Deployment YAML

Create a file called cluster-autoscaler.yaml and add:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - name: cluster-autoscaler
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.0
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=minikube
            - --skip-nodes-with-local-storage=false
            - --skip-nodes-with-system-pods=false
          resources:
            requests:
              cpu: 100m
              memory: 300Mi
            limits:
              cpu: 500m
              memory: 500Mi
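Note that the Deployment above references a cluster-autoscaler service account, which doesn't exist by default. A minimal sketch of the ServiceAccount plus an RBAC binding is below; binding to cluster-admin is a shortcut acceptable only in a throwaway dev cluster, and production setups should use the scoped ClusterRole from the Cluster Autoscaler project's example manifests:

```yaml
# Dev-only shortcut: grants the autoscaler full cluster access.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system
```

Apply it before (or alongside) the Deployment so the pod can start with its service account in place.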

Apply the deployment:

kubectl apply -f cluster-autoscaler.yaml

Check logs to ensure it’s running:

kubectl logs -f deployment/cluster-autoscaler -n kube-system

Step 4: Create a Workload that Triggers Scaling

Now, deploy a workload that requires more resources than currently available, forcing the Cluster Autoscaler to scale up.

Resource-Intensive Deployment YAML

Create a file high-memory-app.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-memory-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: high-memory-app
  template:
    metadata:
      labels:
        app: high-memory-app
    spec:
      containers:
        - name: stress
          image: polinux/stress
          command: ["stress"]
          args: ["--vm", "2", "--vm-bytes", "500M", "--timeout", "60s"]
          resources:
            requests:
              memory: "600Mi"
              cpu: "250m"
            limits:
              memory: "800Mi"
              cpu: "500m"
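A quick back-of-envelope check explains why this workload should exceed a single node: each replica requests 600Mi of memory, so both replicas together need 1200Mi of allocatable memory — likely more than one small Minikube node can offer once system pods are accounted for (the exact allocatable figure depends on your Minikube VM size):

```shell
# Total memory requested by the Deployment (requests, not limits,
# are what the scheduler and autoscaler consider).
request_mi=600
replicas=2
echo "total request: $((request_mi * replicas))Mi"
```

Compare that figure against the Allocatable section of `kubectl describe nodes`.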

Apply the deployment:

kubectl apply -f high-memory-app.yaml

Check if the pods are pending:

kubectl get pods -o wide

If pods remain in the Pending state, the Cluster Autoscaler should detect them and trigger node scaling.
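To look specifically at Pending pods and see why the scheduler couldn't place them, a field selector and `kubectl describe` help (the pod name below is a placeholder for one of your actual pod names):

```shell
# List only pods stuck in Pending.
kubectl get pods --field-selector=status.phase=Pending
# Inspect scheduling events for one of them, e.g. "Insufficient memory".
kubectl describe pod <pending-pod-name>
```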

Observing Autoscaler in Action

Now, let’s check how the autoscaler responds:

kubectl get nodes
kubectl get pods -A
kubectl logs -f deployment/cluster-autoscaler -n kube-system

You should see the Cluster Autoscaler increasing the node count to accommodate the pending pods. Once the workload decreases, it should scale down unused nodes.
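The autoscaler also records Kubernetes events as it makes decisions; TriggeredScaleUp and ScaleDown are the event reasons to watch for (assuming the deployed version emits them, as recent releases do):

```shell
# Recent cluster events, oldest first -- look for TriggeredScaleUp entries.
kubectl get events -A --sort-by=.metadata.creationTimestamp | grep -i scale
```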

Why Does This Matter?

Understand autoscaler behavior before deploying to production
Validate custom scaling policies in a local development setup
Optimize resource allocation for cost and performance efficiency

Even though Minikube doesn’t create new cloud nodes dynamically, this method helps developers test scaling triggers and behaviors before running in real cloud environments.
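When you're done, remove the stress workload and watch utilization drop. By default the autoscaler considers a node unneeded only after roughly ten minutes of low utilization, so scale-down won't be instant:

```shell
# Delete the workload, then watch the nodes; scale-down takes several minutes.
kubectl delete -f high-memory-app.yaml
kubectl get nodes -w
```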

Conclusion: Build Smarter Autoscaling Strategies

Testing Cluster Autoscaler in Minikube provides valuable insights into Kubernetes scaling before moving to production. If you’re developing autoscaling-sensitive applications, mastering this setup ensures better efficiency, cost savings, and resilience.

Have you tested autoscaling in Minikube? Drop your thoughts in the comments!👇
