Managing Kubernetes Costs: Practical Techniques for Resource Optimization

Introduction

Kubernetes provides powerful scalability, but if not managed properly, cloud costs can quickly escalate. To maintain cost efficiency while ensuring performance, organizations must monitor and optimize their Kubernetes workloads effectively.

This guide covers practical techniques with YAML configurations and commands to help you reduce Kubernetes costs, right-size workloads, and optimize scaling.

Step 1: Monitor Costs and Resource Utilization

Enable Kubernetes Resource Requests and Limits

Setting resource requests and limits ensures that workloads use only the necessary CPU and memory.

Resource Requests and Limits Example (deployment.yaml)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: optimized-app
  namespace: cost-optimization
spec:
  replicas: 3
  selector:
    matchLabels:
      app: optimized-app
  template:
    metadata:
      labels:
        app: optimized-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"

Apply the deployment:

kubectl apply -f deployment.yaml

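With requests and limits in place, compare them against actual usage so you can right-size workloads over time. This requires metrics-server in the cluster (on Minikube: minikube addons enable metrics-server); the commands below are standard kubectl:

kubectl top pods -n cost-optimization
kubectl top nodes
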
Step 2: Optimize Kubernetes Workloads

Right-Size Pods and Nodes with Auto-Scaling

Enable Horizontal Pod Autoscaler (HPA)

Automatically scale pods based on CPU usage.

kubectl autoscale deployment optimized-app -n cost-optimization --cpu-percent=50 --min=2 --max=10

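The HPA relies on CPU metrics from metrics-server. Once metrics are available, confirm the autoscaler is tracking the deployment:

kubectl get hpa -n cost-optimization
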
Enable Vertical Pod Autoscaler (VPA)

Dynamically adjust pod resource requests based on observed usage. Note that the Vertical Pod Autoscaler components must be installed in the cluster before this resource takes effect.

VPA Configuration (vpa.yaml)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: optimized-app-vpa
  namespace: cost-optimization
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: optimized-app
  updatePolicy:
    updateMode: "Auto"

Apply VPA:

kubectl apply -f vpa.yaml

Step 3: Implement Cost-Effective Scaling Strategies

Enable Cluster Autoscaler

Ensure your cluster scales up/down based on workload demand.

Cluster Autoscaler adds and removes worker nodes through your cloud provider's node groups, so it is not applicable to a single-node local cluster such as Minikube.

For a production cluster (on AWS, GCP, or Azure), deploy Cluster Autoscaler or enable your provider's managed node auto-scaling, following your cloud provider's documentation.

Step 4: Optimize Storage and Networking Costs

Use Efficient Storage Classes

Select appropriate StorageClasses to avoid overpaying for unnecessary IOPS or performance levels.

Example of a cost-effective StorageClass (storage.yaml)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cost-efficient-storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp3  # Use gp3 instead of gp2 for cost savings

Apply the StorageClass:

kubectl apply -f storage.yaml

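Workloads consume this class through a PersistentVolumeClaim. A minimal example (the claim name and size below are illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: cost-optimization
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: cost-efficient-storage
  resources:
    requests:
      storage: 20Gi
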
Step 5: Remove Unused Resources

Identify and Delete Unused Resources

Find and remove unused PersistentVolumes, LoadBalancer Services, and idle Pods.

List unused Persistent Volumes:

kubectl get pv | grep "Released"

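To find Services that provision cloud load balancers, list them across all namespaces first and review them before deleting anything:

kubectl get svc -A | grep LoadBalancer
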
Delete Unused Load Balancers:

kubectl delete svc <service-name>

Conclusion

By implementing resource requests and limits, auto-scaling, cost-efficient storage, and monitoring unused resources, you can significantly reduce Kubernetes costs without sacrificing performance.

Start optimizing your Kubernetes costs today and take control of your cloud expenses!

What strategies do you use for Kubernetes cost optimization? Let’s discuss in the comments! 👇

Building a Complete Developer Platform on Kubernetes

Introduction

A Developer Platform on Kubernetes empowers developers with self-service deployments, while platform teams maintain security, governance, and operational stability.

In this guide, we will:

  • Set up a Kubernetes cluster
  • Create namespaces for teams
  • Implement RBAC for security
  • Enable GitOps-based deployments
  • Set up monitoring and logging

Step 1: Set Up a Kubernetes Cluster

For local testing, use Minikube or kind:

minikube start --cpus 4 --memory 8g

For production, use a managed Kubernetes service (such as EKS, GKE, or AKS) or a self-managed cluster.

Ensure kubectl and helm are installed:

kubectl version --client
helm version

Step 2: Create Namespaces for Teams

We create isolated namespaces for different teams to deploy applications.

Namespace YAML

apiVersion: v1
kind: Namespace
metadata:
  name: dev-team
---
apiVersion: v1
kind: Namespace
metadata:
  name: qa-team

Apply Namespace Configuration

kubectl apply -f namespaces.yaml

Step 3: Implement Role-Based Access Control (RBAC)

RBAC ensures developers have limited access to their namespaces.

RBAC YAML

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev-team
  name: developer-role
rules:
  # Core resources (Pods, Services) live in the "" API group
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "create", "update", "delete"]
  # Deployments live in the "apps" API group, so they need their own rule
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev-team
  name: developer-binding
subjects:
  - kind: User
    name: dev-user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-role
  apiGroup: rbac.authorization.k8s.io

Apply RBAC Configuration

kubectl apply -f rbac.yaml

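A quick way to verify the binding is to impersonate the user; the first check should return yes, the second no:

kubectl auth can-i create deployments --as dev-user -n dev-team
kubectl auth can-i create deployments --as dev-user -n qa-team
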
Step 4: Deploy an Application

Developers can now deploy applications in their namespace.

Deployment YAML

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  namespace: dev-team
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: sample-app
          image: nginx
          ports:
            - containerPort: 80

Apply Deployment

kubectl apply -f deployment.yaml

Step 5: Set Up GitOps for Continuous Deployment

We use ArgoCD for GitOps-based automated deployments.

Install ArgoCD

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

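Once ArgoCD is running, deployments are described declaratively as Application resources that point at a Git repository. A minimal sketch (the repository URL and path are placeholders for your own GitOps repo):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: dev-team-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<your-org>/<gitops-repo>.git  # placeholder
    targetRevision: main
    path: apps/dev-team
  destination:
    server: https://kubernetes.default.svc
    namespace: dev-team
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Apply it with kubectl apply -f application.yaml, and ArgoCD will keep the dev-team namespace in sync with the repository.
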
Step 6: Configure Monitoring & Logging

To enable observability, we set up Prometheus & Grafana.

Install Prometheus & Grafana using Helm

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

Access Grafana

kubectl port-forward svc/monitoring-grafana 3000:80 -n monitoring

Login credentials:

  • Username: admin
  • Password: Run this command to get it:
kubectl get secret --namespace monitoring monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 --decode

Conclusion

  • Developers get self-service access to deploy apps.
  • RBAC ensures security and access control.
  • GitOps with ArgoCD enables automated deployments.
  • Monitoring stack ensures observability.

By building this Internal Developer Platform on Kubernetes, we empower developers while maintaining operational control.

How do you manage developer workflows in Kubernetes? Drop your thoughts below! 👇

Blue-Green Deployments in Kubernetes: A Step-by-Step Guide

Introduction

In today’s fast-paced DevOps world, zero-downtime deployments are a necessity. Traditional deployments often result in downtime, failed updates, or rollback complexities. Enter Blue-Green Deployments, a strategy that eliminates these risks by maintaining two production environments—one active (Blue) and one staged (Green).

In this post, we’ll walk through how to implement a Blue-Green Deployment in Kubernetes while ensuring:
✅ Zero-downtime application updates
✅ Instant rollback in case of failure
✅ Seamless traffic switching between versions

To keep everything self-contained, we'll build the image inside Minikube's Docker daemon instead of pulling it from an external registry.

How Blue-Green Deployment Works

  • Blue (Active Version): The currently running stable application version.
  • Green (New Version): The newly deployed application version, tested before switching live traffic.
  • Traffic Switching: Once the Green version is verified, traffic is routed to it, making it the new Blue version.
  • Rollback Option: If issues arise, traffic can be switched back to the previous version instantly.

Setting Up Blue-Green Deployment in Kubernetes

Step 1: Build the Application Image Locally

We'll build the image directly against Minikube's Docker daemon, so no external registry is needed for this setup.

First, create a simple Flask-based web app to simulate different versions:

Create app.py

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from the Blue Version!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Create Dockerfile

FROM python:3.8-slim
WORKDIR /app
COPY app.py .
RUN pip install flask
CMD ["python", "app.py"]

Build and Load the Image into Minikube

eval $(minikube docker-env)
docker build -t myapp:blue .

Step 2: Deploy the Blue Version

Create blue-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue-app
  labels:
    app: myapp
    version: blue
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:blue
        ports:
        - containerPort: 5000

Apply the Deployment

kubectl apply -f blue-deployment.yaml

Create service.yaml to expose Blue version

apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
    version: blue
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
  type: LoadBalancer

Apply the Service

kubectl apply -f service.yaml
minikube service myapp-service

Now open the URL printed by the minikube service command, and you should see:

Hello from the Blue Version!

Step 3: Deploy the Green Version

Modify app.py to change the response message:

Modify app.py

@app.route("/")
def home():
    return "Hello from the Green Version!"

Rebuild and Load the Green Version

docker build -t myapp:green .

Create green-deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: green-app
  labels:
    app: myapp
    version: green
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:green
        ports:
        - containerPort: 5000

Apply Green Deployment

kubectl apply -f green-deployment.yaml

At this point, both Blue (old version) and Green (new version) exist, but traffic still flows to Blue.

Step 4: Switch Traffic to the Green Version

To shift traffic, modify the Service Selector to point to the Green version instead of Blue.

Update service.yaml

spec:
  selector:
    app: myapp
    version: green

Apply the Service Update

kubectl apply -f service.yaml

Now, accessing the same URL will show:

Hello from the Green Version!

Traffic has successfully switched from Blue to Green!

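You can confirm the switch from inside the cluster as well; the Service's endpoints should now list only the green pods:

kubectl get endpoints myapp-service
kubectl get pods -l app=myapp,version=green -o wide
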
Step 5: Rollback to Blue

If issues arise in the Green version, simply change the Service selector back:

Modify service.yaml again

spec:
  selector:
    app: myapp
    version: blue

Apply the Rollback

kubectl apply -f service.yaml

Now, traffic is back to the Blue Version instantly—without redeploying anything!

Conclusion

Blue-Green Deployments in Kubernetes offer a seamless way to update applications with zero downtime. The ability to instantly switch traffic between versions ensures quick rollbacks, reducing deployment risks.

✅ No downtime during deployments
✅ Easy rollback if issues arise
✅ No impact on end-users

How do you handle deployments in Kubernetes? Let me know in the comments!👇

Implementing Chaos Engineering in Kubernetes with Chaos Mesh

Introduction

In distributed systems, failures are inevitable. Chaos Engineering is a proactive approach to testing system resilience by introducing controlled disruptions. Chaos Mesh is an open-source Chaos Engineering platform specifically designed for Kubernetes, enabling the simulation of various faults like pod crashes, network delays, and CPU stress.

This blog post will walk you through:

  • Installing Chaos Mesh on your Kubernetes cluster
  • Deploying a sample application
  • Executing chaos experiments to test system resilience
  • Observing and understanding system behavior under stress

Installing Chaos Mesh

Chaos Mesh can be installed using Helm, a package manager for Kubernetes.

Prerequisites

  • A running Kubernetes cluster
  • Helm installed on your local machine

Step 1: Add the Chaos Mesh Helm Repository

Add the official Chaos Mesh Helm repository:

helm repo add chaos-mesh https://charts.chaos-mesh.org

Step 2: Create the Chaos Mesh Namespace

It’s recommended to install Chaos Mesh in a dedicated namespace:

kubectl create namespace chaos-mesh

Step 3: Install Chaos Mesh

Install Chaos Mesh using Helm:

helm install chaos-mesh chaos-mesh/chaos-mesh -n chaos-mesh

This command deploys Chaos Mesh components, including the controller manager and Chaos Dashboard, into your Kubernetes cluster.

Step 4: Verify the Installation

Check the status of the Chaos Mesh pods:

kubectl get pods -n chaos-mesh

All pods should be in the Running state.

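Optionally, open the Chaos Dashboard through a port-forward; the service name and port below are the chart defaults at the time of writing and may differ in your version:

kubectl port-forward -n chaos-mesh svc/chaos-dashboard 2333:2333

Then browse to http://localhost:2333.
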
For more detailed installation instructions and configurations, refer to the official Chaos Mesh documentation.

Deploying a Sample Application

To demonstrate Chaos Mesh’s capabilities, we’ll deploy a simple Nginx application.

Step 1: Create the Deployment YAML

Create a file named nginx-deployment.yaml with the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80

Step 2: Deploy the Application

Apply the deployment to your Kubernetes cluster:

kubectl apply -f nginx-deployment.yaml

Step 3: Verify the Deployment

Ensure that the Nginx pods are running:

kubectl get pods -l app=nginx

You should see three running Nginx pods.

Running Chaos Experiments

With Chaos Mesh installed and a sample application deployed, we can now introduce controlled faults to observe how the system responds.

Experiment 1: Pod Failure

This experiment will randomly terminate one of the Nginx pods.

Step 1: Create the PodChaos YAML

Create a file named pod-failure.yaml with the following content:

apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure
  namespace: chaos-mesh
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  duration: 30s
  scheduler:
    cron: "@every 1m"

Step 2: Apply the Chaos Experiment

Apply the experiment to your cluster:

kubectl apply -f pod-failure.yaml

Step 3: Monitor the Experiment

Observe the behavior of the Nginx pods:

kubectl get pods -l app=nginx -w

You should see pods being terminated and restarted as per the experiment’s configuration.

For more details on simulating pod faults, refer to the Chaos Mesh documentation.

Experiment 2: Network Delay

This experiment introduces a network latency of 200ms to the Nginx pods.

Step 1: Create the NetworkChaos YAML

Create a file named network-delay.yaml with the following content:

apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay
  namespace: chaos-mesh
spec:
  action: delay
  mode: all
  selector:
    namespaces:
      - default
    labelSelectors:
      app: nginx
  delay:
    latency: "200ms"
    correlation: "25"
    jitter: "50ms"
  duration: "30s"
  scheduler:
    cron: "@every 2m"

Step 2: Apply the Chaos Experiment

Apply the experiment to your cluster:

kubectl apply -f network-delay.yaml

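To see the injected delay, grab a pod IP and time a request against it from a throwaway busybox pod (the pod IP is a placeholder; expect roughly 200ms of extra latency while the experiment is active):

kubectl get pods -l app=nginx -o wide
kubectl run latency-test --image=busybox --rm -it --restart=Never -- sh -c 'time wget -q -O /dev/null http://<NGINX_POD_IP>'
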
Conclusion

Chaos Engineering is essential for improving the resilience of Kubernetes applications. With Chaos Mesh, you can simulate real-world failures in a controlled environment, ensuring that your applications can withstand unexpected disruptions.

By implementing Chaos Mesh and running experiments like pod failures and network delays, teams can proactively identify weaknesses and enhance system stability.

Start incorporating Chaos Engineering into your Kubernetes workflow today and build systems that are truly resilient!

What’s your experience with Chaos Engineering? Drop your thoughts below!👇

Building a Self-Healing Application on Kubernetes

Introduction

In traditional application deployments, when a service crashes, it often requires manual intervention to restart, debug, and restore functionality. This leads to downtime, frustrated users, and operational overhead.

What if your application could automatically recover from failures?

With Kubernetes’ self-healing mechanisms, we can ensure applications restart automatically when they fail—without human intervention.

The Solution: Kubernetes Self-healing Mechanisms

Kubernetes provides several built-in mechanisms to maintain high availability and auto-recovery:

  • Pod Restart Policies → Automatically restarts failed containers.
  • ReplicaSets → Ensures a specified number of pod replicas are always running.
  • Node Failure Recovery → Reschedules pods to healthy nodes if a node crashes.
  • Persistent Storage (Optional) → Ensures data persists even when pods restart.

Step 1: Creating a Simple Web Application

We’ll create a basic Python Flask application that runs inside a Kubernetes pod. This will simulate a real-world web service.

Create a Python Web App

Create a file named app.py:

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello, Kubernetes! Your app is self-healing."

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Step 2: Creating a Docker Image

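The build below assumes a Dockerfile sitting next to app.py; a minimal one, mirroring the Flask image used earlier in this series, looks like this:

FROM python:3.8-slim
WORKDIR /app
COPY app.py .
RUN pip install flask
CMD ["python", "app.py"]
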
Build the Docker Image Inside Minikube

Set Minikube’s Docker environment:

eval $(minikube docker-env)

Now, build the image inside Minikube:

docker build -t self-healing-app:v1 .

Verify that the image exists:

docker images | grep self-healing-app

Step 3: Deploying to Kubernetes

Now, let’s create Kubernetes resources.

Create a Deployment

Create a file deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: self-healing-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: self-healing-app
  template:
    metadata:
      labels:
        app: self-healing-app
    spec:
      restartPolicy: Always
      containers:
      - name: self-healing-container
        image: self-healing-app:v1

Apply the deployment:

kubectl apply -f deployment.yaml

Verify the running pods:

kubectl get pods

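Restarts cover containers that exit; to also recover from a process that hangs without crashing, you can add a liveness probe to the container in deployment.yaml. This is an optional addition, assuming the Flask app answers on / at port 5000:

        livenessProbe:
          httpGet:
            path: /
            port: 5000
          initialDelaySeconds: 5
          periodSeconds: 10
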
Step 4: Exposing the Application

To access the application, expose it as a Kubernetes Service.

Create service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: self-healing-service
spec:
  type: NodePort
  selector:
    app: self-healing-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000

Apply the service:

kubectl apply -f service.yaml

Find the external access URL:

minikube service self-healing-service --url

Access your app in a browser or via curl:

curl <MINIKUBE_SERVICE_URL>

Step 5: Simulating Failures

Kill a Running Pod

Run:

kubectl delete pod -l app=self-healing-app

Kubernetes will automatically recreate the pod within seconds.

Simulate a Node Failure

If you’re using a multi-node cluster, cordon and drain a node:

kubectl cordon <NODE_NAME>
kubectl drain <NODE_NAME> --ignore-daemonsets --force

Pods will automatically be rescheduled on healthy nodes.

Conclusion

By leveraging Kubernetes built-in self-healing features, we’ve created a system that:

  • Automatically recovers from failures without manual intervention.
  • Ensures high availability using multiple replicas.
  • Prevents downtime, keeping the application running smoothly.

This approach reduces operational overhead and enhances reliability. Let me know if you have any questions in the comments!👇

Practical Knative: Building Serverless Functions on Kubernetes

Introduction

Serverless computing has revolutionized the way we build and deploy applications, allowing developers to focus on writing code rather than managing infrastructure. Kubernetes, the powerful container orchestration platform, provides several serverless frameworks, and Knative is one of the most popular solutions.

In this guide, we will explore Knative, its architecture, and how to build serverless functions that react to various events in a Kubernetes cluster.

The Problem: Managing Event-Based Processing

Traditional applications require developers to set up and manage servers, configure scaling policies, and handle infrastructure complexities. This becomes challenging when dealing with event-driven architectures, such as:

  • Processing messages from Kafka or NATS
  • Responding to HTTP requests
  • Triggering functions on cronjobs
  • Automating workflows inside Kubernetes

Manually setting up and scaling these workloads is inefficient. Knative solves this by providing a robust, event-driven serverless solution on Kubernetes.

What is Knative?

Knative is a Kubernetes-native serverless framework that enables developers to deploy and scale serverless applications efficiently. It eliminates the need for external FaaS (Function-as-a-Service) platforms and integrates seamlessly with Kubernetes events, message queues, and HTTP triggers.

Why Use Knative?

✅ Built on Kubernetes with strong community support
✅ Supports multiple runtimes: Python, Node.js, Go, Java, and more
✅ Works with event sources like Kafka, NATS, HTTP, and Cron
✅ Scales down to zero when no requests are received
✅ Backed by major cloud providers like Google, Red Hat, and VMware

Installing Knative on Kubernetes

To start using Knative, first install its core components on your Kubernetes cluster.

Step 1: Install Knative Serving

Knative Serving is responsible for running serverless workloads. Install it using:

kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/latest/download/serving-core.yaml

Step 2: Install a Networking Layer

Knative requires a networking layer like Istio, Kourier, or Contour. To install Kourier:

kubectl apply -f https://github.com/knative/net-kourier/releases/latest/download/kourier.yaml
kubectl patch configmap/config-network --namespace knative-serving --type merge --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'

Verify that Knative is installed:

kubectl get pods -n knative-serving

Deploying a Serverless Function

Step 1: Writing a Function

Let’s create a simple Python function that responds to HTTP requests.

Create a file hello.py with the following content:

import os

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello, Serverless World!"

if __name__ == "__main__":
    # Knative Serving tells the container which port to listen on via the PORT env var (default 8080).
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))

Step 2: Creating a Container Image

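The build assumes a Dockerfile next to hello.py; a minimal sketch (the app reads the PORT environment variable that Knative injects):

FROM python:3.8-slim
WORKDIR /app
COPY hello.py .
RUN pip install flask
CMD ["python", "hello.py"]
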
Build and push the image to a container registry:

docker build -t <your-dockerhub-username>/hello-knative .
docker push <your-dockerhub-username>/hello-knative

Step 3: Deploying the Function with Knative

Create a Knative service:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello-knative
spec:
  template:
    spec:
      containers:
      - image: <your-dockerhub-username>/hello-knative

Apply the YAML file:

kubectl apply -f hello-knative.yaml

Step 4: Testing the Function

Retrieve the function URL:

kubectl get ksvc hello-knative

Invoke the function:

curl http://<SERVICE-URL>

You should see the output:

Hello, Serverless World!

Using Event Triggers

Kafka Trigger

Knative Eventing lets you trigger functions from Kafka topics; the example below assumes a Kafka cluster and the Knative Kafka broker/source components are already available. Install the Knative Eventing core:

kubectl apply -f https://github.com/knative/eventing/releases/latest/download/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/latest/download/eventing-core.yaml

Create a Kafka trigger:

apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: kafka-trigger
spec:
  broker: default
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: hello-knative

Apply the trigger:

kubectl apply -f kafka-trigger.yaml

Publish a message to Kafka:

echo '{"message": "Hello Kafka!"}' | kubectl -n knative-eventing exec -i kafka-producer -- kafka-console-producer.sh --broker-list my-cluster-kafka-bootstrap:9092 --topic my-topic

CronJob Trigger

Run a function every 5 minutes using a cron trigger:

apiVersion: sources.knative.dev/v1beta1
kind: PingSource
metadata:
  name: cron-trigger
spec:
  schedule: "*/5 * * * *"
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: hello-knative

Apply the trigger:

kubectl apply -f cron-trigger.yaml

Conclusion

Knative provides a powerful and scalable serverless framework for Kubernetes. By integrating with HTTP, Kafka, and CronJob triggers, it enables truly event-driven serverless architectures without managing infrastructure.

What’s your experience with Knative? Let’s discuss in the comments! 👇

Using Kubernetes Jobs and CronJobs for Batch Processing

Introduction

Kubernetes provides powerful constructs for running batch workloads efficiently. Jobs and CronJobs enable reliable, scheduled, and parallel execution of tasks, making them perfect for data processing, scheduled reports, and maintenance tasks.

In this blog, we’ll explore:
✅ What Jobs and CronJobs are
✅ How to create and manage them
✅ Real-world use cases

Step 1: Understanding Kubernetes Jobs

A Job ensures that a task runs to completion. It can run a single pod, multiple pods in parallel, or restart failed ones until successful. Jobs are useful when you need to process a batch of data once (e.g., database migrations, log processing).

Creating a Kubernetes Job

Let’s create a Job that runs a simple batch script inside a pod.

YAML for a Kubernetes Job

apiVersion: batch/v1
kind: Job
metadata:
  name: batch-job
spec:
  template:
    spec:
      containers:
      - name: batch-job
        image: busybox
        command: ["sh", "-c", "echo 'Processing data...'; sleep 10; echo 'Job completed.'"]
      restartPolicy: Never

Apply the Job:

kubectl apply -f batch-job.yaml

Check Job status:

kubectl get jobs
kubectl logs job/batch-job

Cleanup:

kubectl delete job batch-job

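A Job can also fan out across multiple pods, as mentioned above. A minimal parallel variant (hypothetical file parallel-job.yaml) uses completions and parallelism:

apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-batch-job
spec:
  completions: 5    # run the task 5 times in total
  parallelism: 2    # at most 2 pods at a time
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo 'Processing chunk...'; sleep 5"]
      restartPolicy: Never
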
Step 2: Using Kubernetes CronJobs for Scheduled Tasks

A CronJob runs Jobs at scheduled intervals, just like a traditional Linux cron job. It’s perfect for recurring data processing, backups, and report generation.

Creating a Kubernetes CronJob

Let’s schedule a Job that runs every minute and prints a timestamp.

YAML for a Kubernetes CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: scheduled-job
spec:
  schedule: "* * * * *"  # Runs every minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: scheduled-job
            image: busybox
            command: ["sh", "-c", "echo Scheduled job running at: $(date)"]
          restartPolicy: OnFailure

Apply the CronJob:

kubectl apply -f cronjob.yaml

Check the CronJob execution:

kubectl get cronjobs
kubectl get jobs

View logs from the latest Job:

kubectl logs job/<job-name>

Delete the CronJob:

kubectl delete cronjob scheduled-job

Step 3: Use Cases of Jobs and CronJobs

  • Data Processing: Running batch scripts to clean and analyze data
  • Database Backups: Taking periodic snapshots of databases
  • Report Generation: Automating daily/monthly analytics reports
  • File Transfers: Scheduled uploads/downloads of files
  • System Maintenance: Automating cleanup of logs, cache, and unused resources

Conclusion

Kubernetes Jobs and CronJobs simplify batch processing and scheduled tasks, ensuring reliable execution even in distributed environments. By leveraging them, you automate workflows, optimize resources, and enhance reliability.

Are you using Kubernetes Jobs and CronJobs in your projects? Share your experiences in the comments!👇

Implementing Admission Controllers: Enforcing Organizational Policies in Kubernetes

Introduction

In a Kubernetes environment, ensuring compliance with security and operational policies is critical. Admission controllers provide a mechanism to enforce organizational policies at the API level before resources are created, modified, or deleted.

In this post, we will build a simple admission controller in Minikube. Instead of using a custom image, we will leverage a lightweight existing image (busybox) to demonstrate the webhook concept.

Why Use Admission Controllers?

Admission controllers help organizations enforce policies such as:
✅ Blocking privileged containers
✅ Enforcing resource limits
✅ Validating labels and annotations
✅ Restricting image sources

By implementing an admission webhook, we can inspect and validate incoming requests before they are persisted in the Kubernetes cluster.

Step 1: Create the Webhook Deployment

We will use busybox as the container image instead of a custom-built admission webhook image.

Create webhook-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: admission-webhook
spec:
  replicas: 1
  selector:
    matchLabels:
      app: admission-webhook
  template:
    metadata:
      labels:
        app: admission-webhook
    spec:
      containers:
      - name: webhook
        image: busybox
        command: ["/bin/sh", "-c", "echo Webhook Running; sleep 3600"]
        ports:
        - containerPort: 443
        volumeMounts:
        - name: certs
          mountPath: "/certs"
          readOnly: true
      volumes:
      - name: certs
        secret:
          secretName: admission-webhook-secret

Key Changes:

  • Using busybox instead of a custom image
  • The container prints “Webhook Running” and sleeps for 1 hour
  • Mounting a secret to hold TLS certificates

Step 2: Generate TLS Certificates

Kubernetes requires admission webhooks to communicate securely. We need to generate TLS certificates for our webhook server.

Run the following commands in Minikube:

openssl req -x509 -newkey rsa:4096 -keyout tls.key -out tls.crt -days 365 -nodes -subj "/CN=admission-webhook.default.svc"
kubectl create secret tls admission-webhook-secret --cert=tls.crt --key=tls.key

This creates a self-signed certificate and stores it in a Kubernetes Secret.

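The webhook configuration in the next step points the API server at a Service named admission-webhook in the default namespace, which the manifests above do not create. A minimal Service fronting the deployment (an addition to the original setup) would be:

apiVersion: v1
kind: Service
metadata:
  name: admission-webhook
  namespace: default
spec:
  selector:
    app: admission-webhook
  ports:
    - port: 443
      targetPort: 443
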
Step 3: Define the MutatingWebhookConfiguration

Now, let’s create a Kubernetes webhook configuration that tells the API server when to invoke our webhook.

Create webhook-configuration.yaml:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: admission-webhook
webhooks:
  - name: webhook.default.svc
    clientConfig:
      service:
        name: admission-webhook
        namespace: default
        path: "/mutate"
      caBundle: ""
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    admissionReviewVersions: ["v1"]
    sideEffects: None

Key Details:

  • The webhook applies to all Pods created in the cluster
  • The webhook will be called whenever a Pod is created
  • The caBundle field must be filled in with the base64-encoded certificate before the webhook takes effect (see the command below)

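Populate caBundle with the base64-encoded certificate generated in Step 2 (a single line, no wrapping), and paste the output into webhook-configuration.yaml:

cat tls.crt | base64 | tr -d '\n'
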
Apply the webhook configuration:

kubectl apply -f webhook-configuration.yaml

Step 4: Test the Webhook

Let’s check if the webhook is being triggered when a new Pod is created.

Create a test Pod:

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test
    image: busybox
    command: ["sleep", "3600"]

Apply the Pod:

kubectl apply -f test-pod.yaml

Because the busybox container only simulates the webhook and does not actually serve the /mutate endpoint, the API server's call to it will fail; with the default failurePolicy of Fail, the Pod creation is rejected with a webhook error. Either way, the result confirms that the admission controller is intercepting the request.

Step 5: Debugging and Logs

To check the logs of the webhook, run:

kubectl logs -l app=admission-webhook

If the webhook is not working as expected, ensure:

  • The webhook deployment is running (kubectl get pods)
  • The secret exists (kubectl get secret admission-webhook-secret)
  • The webhook configuration is applied (kubectl get mutatingwebhookconfigurations)

Conclusion

We successfully set up a custom Kubernetes admission controller. Instead of a custom-built webhook image, we used a minimal container (busybox) to simulate webhook functionality.

Key Takeaways:

  • Admission controllers enforce security policies before resources are created
  • Webhooks provide dynamic validation and policy enforcement
  • Minikube can be used to test webhooks without pushing images to remote registries

What’s your experience with admission controllers? Let’s discuss!👇

Building Kubernetes Operators: Automating Application Management

Introduction

Managing stateful applications in Kubernetes manually can be complex. Automating application lifecycle tasks—like deployment, scaling, and failover—requires encoding operational knowledge into software. This is where Kubernetes Operators come in.

Operators extend Kubernetes by using Custom Resource Definitions (CRDs) and controllers to manage applications just like native Kubernetes resources. In this guide, we’ll build an Operator for a PostgreSQL database, automating its lifecycle management.

Step 1: Setting Up the Operator SDK

To create a Kubernetes Operator, we use the Operator SDK, which simplifies scaffolding and controller development.

Install Operator SDK

If you haven’t installed Operator SDK, follow these steps:

export ARCH=$(case $(uname -m) in x86_64) echo amd64 ;; aarch64) echo arm64 ;; *) uname -m ;; esac)
curl -LO "https://github.com/operator-framework/operator-sdk/releases/latest/download/operator-sdk_linux_${ARCH}"
chmod +x operator-sdk_linux_${ARCH}
sudo mv operator-sdk_linux_${ARCH} /usr/local/bin/operator-sdk

Verify installation:

operator-sdk version

Step 2: Initializing the Operator Project

We start by initializing our Operator project:

operator-sdk init --domain mycompany.com --repo github.com/mycompany/postgres-operator --skip-go-version-check

This command:
✅ Sets up the project structure
✅ Configures Go modules
✅ Generates required manifests

Step 3: Creating the PostgreSQL API and Controller

Now, let’s create a Custom Resource Definition (CRD) and a controller:

operator-sdk create api --group database --version v1alpha1 --kind PostgreSQL --resource --controller

This generates:
  • api/v1alpha1/postgresql_types.go → Defines the PostgreSQL resource structure
  • controllers/postgresql_controller.go → Implements the logic to manage PostgreSQL instances

Step 4: Defining the Custom Resource (CRD)

Edit api/v1alpha1/postgresql_types.go to define the PostgreSQLSpec and PostgreSQLStatus:

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// PostgreSQLSpec defines the desired state
type PostgreSQLSpec struct {
	Replicas int    `json:"replicas"`
	Image    string `json:"image"`
	Storage  string `json:"storage"`
}

// PostgreSQLStatus defines the observed state
type PostgreSQLStatus struct {
	ReadyReplicas int `json:"readyReplicas"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// PostgreSQL is the Schema for the PostgreSQL API
type PostgreSQL struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PostgreSQLSpec   `json:"spec,omitempty"`
	Status PostgreSQLStatus `json:"status,omitempty"`
}

// +kubebuilder:object:root=true

// PostgreSQLList contains a list of PostgreSQL
type PostgreSQLList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []PostgreSQL `json:"items"`
}

func init() {
	// SchemeBuilder is defined in the generated groupversion_info.go of this package.
	SchemeBuilder.Register(&PostgreSQL{}, &PostgreSQLList{})
}

Register this CRD:

make manifests
make install

Step 5: Implementing the Controller Logic

Edit controllers/postgresql_controller.go to define how the Operator manages PostgreSQL:

package controllers

import (
	"context"

	databasev1alpha1 "github.com/mycompany/postgres-operator/api/v1alpha1"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type PostgreSQLReconciler struct {
	client.Client
}

func (r *PostgreSQLReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var postgres databasev1alpha1.PostgreSQL
	if err := r.Get(ctx, req.NamespacedName, &postgres); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	deployment := &appsv1.Deployment{}
	if err := r.Get(ctx, req.NamespacedName, deployment); err != nil {
		// Deployment not found: create one that matches the custom resource spec.
		labels := map[string]string{"app": postgres.Name}
		deployment = &appsv1.Deployment{
			ObjectMeta: metav1.ObjectMeta{
				Name:      postgres.Name,
				Namespace: postgres.Namespace,
			},
			Spec: appsv1.DeploymentSpec{
				Replicas: int32Ptr(int32(postgres.Spec.Replicas)),
				// A Deployment requires a selector that matches the pod template labels.
				Selector: &metav1.LabelSelector{MatchLabels: labels},
				Template: corev1.PodTemplateSpec{
					ObjectMeta: metav1.ObjectMeta{Labels: labels},
					Spec: corev1.PodSpec{
						Containers: []corev1.Container{{
							Name:  "postgres",
							Image: postgres.Spec.Image,
						}},
					},
				},
			},
		}
		if err := r.Create(ctx, deployment); err != nil {
			return ctrl.Result{}, err
		}
	}

	return ctrl.Result{}, nil
}

func int32Ptr(i int32) *int32 {
	return &i
}

func (r *PostgreSQLReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&databasev1alpha1.PostgreSQL{}).
		Complete(r)
}

Step 6: Deploying the Operator

Build and push the Operator container:

make docker-build docker-push IMG=mycompany/postgres-operator:latest

Apply the Operator to the cluster:

make deploy IMG=mycompany/postgres-operator:latest

Step 7: Creating a PostgreSQL Custom Resource

Once the Operator is deployed, create a PostgreSQL instance:

apiVersion: database.mycompany.com/v1alpha1
kind: PostgreSQL
metadata:
  name: my-db
spec:
  replicas: 2
  image: postgres:13
  storage: 10Gi

Apply it:

kubectl apply -f postgresql-cr.yaml

Verify the Operator has created a Deployment:

kubectl get deployments

Step 8: Testing the Operator

Check if the PostgreSQL pods are running:

kubectl get pods

Describe the Custom Resource:

kubectl describe postgresql my-db

Delete the PostgreSQL instance:

kubectl delete postgresql my-db

Conclusion

We successfully built a Kubernetes Operator to manage PostgreSQL instances automatically. By encoding operational knowledge into software, Operators:
✅ Simplify complex application management
✅ Enable self-healing and auto-scaling
✅ Enhance Kubernetes-native automation

Operators are essential for managing stateful applications efficiently in Kubernetes.

What application would you like to automate with an Operator? Drop your thoughts in the comments!👇

Custom Resource Definitions: Extending Kubernetes the Right Way

Introduction

Kubernetes is powerful, but what if its built-in objects like Pods, Services, and Deployments aren’t enough for your application’s needs? That’s where Custom Resource Definitions (CRDs) come in!

In this post, I’ll walk you through:
✅ Why CRDs are needed
✅ How to create a CRD from scratch
✅ Implementing a custom controller
✅ Deploying and managing custom resources

Why Extend Kubernetes?

Kubernetes comes with a standard set of APIs (like apps/v1 for Deployments), but many applications require domain-specific concepts that Kubernetes doesn’t provide natively.

For example:
  • A database team might want a Database object instead of manually managing StatefulSets.
  • A security team might want a FirewallRule object to enforce policies at the cluster level.

With CRDs, you can define custom objects tailored to your use case and make them first-class citizens in Kubernetes!

Step 1: Creating a Custom Resource Definition (CRD)

A CRD allows Kubernetes to recognize new object types. Let’s create a CRD for a PostgreSQL database instance.

Save the following YAML as postgresql-crd.yaml:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresqls.mycompany.com
spec:
  group: mycompany.com
  names:
    kind: PostgreSQL
    plural: postgresqls
    singular: postgresql
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                databaseName:
                  type: string
                storageSize:
                  type: string
                replicas:
                  type: integer

Apply the CRD to Kubernetes

kubectl apply -f postgresql-crd.yaml

Now, Kubernetes knows about the PostgreSQL resource!

Step 2: Creating a Custom Resource Instance

Let’s create an actual PostgreSQL instance using our CRD.

Save the following YAML as postgresql-instance.yaml:

apiVersion: mycompany.com/v1alpha1
kind: PostgreSQL
metadata:
  name: my-database
spec:
  databaseName: mydb
  storageSize: "10Gi"
  replicas: 2

Apply the Custom Resource

kubectl apply -f postgresql-instance.yaml

Kubernetes now understands PostgreSQL objects, but it won’t do anything with them yet. That’s where controllers come in!

Step 3: Building a Kubernetes Controller

A controller watches for changes in custom resources and performs necessary actions.

Here’s a basic Go-based controller using controller-runtime:

package controllers

import (
	"context"
	"fmt"

	mycompanyv1alpha1 "github.com/mycompany/postgres-operator/api/v1alpha1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type PostgreSQLReconciler struct {
	client.Client
}

func (r *PostgreSQLReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	fmt.Println("Reconciling PostgreSQL instance:", req.NamespacedName)

	// Fetch the PostgreSQL instance
	var pgInstance mycompanyv1alpha1.PostgreSQL
	if err := r.Get(ctx, req.NamespacedName, &pgInstance); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Implement database provisioning logic here (e.g., create a StatefulSet and Service)

	return ctrl.Result{}, nil
}

func (r *PostgreSQLReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&mycompanyv1alpha1.PostgreSQL{}).
		Complete(r)
}

Deploying the Controller

To deploy this, we use Kubebuilder and the Operator SDK:

operator-sdk init --domain mycompany.com --repo github.com/mycompany/postgres-operator
operator-sdk create api --group mycompany --version v1alpha1 --kind PostgreSQL --resource --controller
make manifests
make install
make run

Your Kubernetes Operator is now watching for PostgreSQL objects and taking action!

Step 4: Deploying and Testing the Operator

Apply the CRD and PostgreSQL resource:

kubectl apply -f postgresql-crd.yaml
kubectl apply -f postgresql-instance.yaml

Check if the custom resource is recognized:

kubectl get postgresqls.mycompany.com

Check the controller logs to see it processing the custom resource:

kubectl logs -l control-plane=controller-manager

If everything works, your PostgreSQL resource is being managed automatically!

Conclusion: Why Use CRDs?

  • Encapsulate Business Logic: No need to manually configure every deployment—just define a custom resource, and the operator handles it.
  • Standard Kubernetes API: Developers can use kubectl to interact with custom resources just like native Kubernetes objects.
  • Automated Workflows: Kubernetes Operators can provision, update, and heal application components automatically.

By implementing Custom Resource Definitions and Operators, you extend Kubernetes the right way—without hacking it!

What are some use cases where CRDs and Operators helped your team? Let’s discuss in the comments!👇