Sejoon Kim - Visual Studio Code

My Articles

Technical insights and tutorials about DevOps, Kubernetes, AWS, and infrastructure automation practices.

14.01.2026

CrowdSec WAF on Kubernetes

Needed a WAF for public APIs. Chose CrowdSec (open-source). Hit three integration issues: PostgreSQL namespace, client IP preservation, log collection. Documenting the fixes.

24.12.2025

Reducing Docker Image Sizes by 70%

Our Docker images were 800MB+. Builds were slow and pulling images took forever. Spent a day optimizing and got images down to 200MB using multi-stage builds and Alpine base images.

17.12.2025

Adding Nodes to Kubernetes Cluster When Traffic Grew

Traffic increased 40% over 3 months. Nodes were running at 75% CPU. Ordered 2 new Hetzner servers and added them to the cluster. Took about 4 hours from ordering to nodes serving traffic.

10.12.2025

DNS Lookups Were Timing Out Randomly in Kubernetes

Applications occasionally failed DNS lookups with 5-second timeouts. Checked CoreDNS logs, CPU usage, and the network; everything looked fine. Turned out to be conntrack table exhaustion on worker nodes.

03.12.2025

Automating SSH Key Rotation on Hetzner Servers

Security audit said our SSH keys hadn't been rotated in 18 months. Wrote a script to rotate keys across all Hetzner servers and update them in Azure Key Vault. Took 3 hours to build, runs in 5 minutes.

26.11.2025

Enforcing Pod Security Standards Broke Half Our Deployments

Enabled Pod Security Standards in Kubernetes. Immediately broke 6 out of 12 applications because they were running as root or using privileged containers. Spent 2 days fixing them all.

19.11.2025

Moving Terraform State from Local Files to Azure Storage

We'd been storing Terraform state in Git (bad idea). Moved it to Azure Blob Storage with state locking. Migration took 30 minutes. Should have done this from the start.

12.11.2025

Debugging Random 502 Errors from NGINX Ingress

Users reported occasional 502 errors. Logs showed NGINX couldn't reach backend pods. Took a day to find the issue: pod readiness probes were too aggressive and were marking healthy pods as not ready.

05.11.2025

Adding Trivy Scans to Our CI Pipeline

Integrated Trivy into GitLab CI to scan container images for vulnerabilities before deployment. Found 47 high-severity issues we didn't know about. Some were fixable, some weren't.

29.10.2025

Automating PostgreSQL Backups to Azure Blob Storage

Set up daily PostgreSQL backups from our Kubernetes cluster to Azure Blob Storage. Using pg_dump in a CronJob with lifecycle policies for retention. Cost is about €8/month for 30 days of backups.

22.10.2025

external-secrets Wasn't Syncing from Azure Key Vault

Secrets in Azure Key Vault were updated but pods kept using old values. Took 2 hours to figure out the sync interval setting and force a refresh. Notes on how external-secrets actually works.

15.10.2025

Hit Let's Encrypt Rate Limit While Testing cert-manager

Made a mistake while testing cert-manager configuration. Issued 20 certificates for the same domain in an hour. Got rate limited for a week. Notes on the staging environment and rate limits.

08.10.2025

Downsizing Hetzner Servers We Don't Need

Looked at actual CPU and memory usage across our Kubernetes nodes. Found we were paying for servers we barely used. Saved €120/month by switching to smaller machines.

01.10.2025

Upgrading a Self-Managed Kubernetes Cluster Without Managed Services

Moving from Kubernetes 1.28 to 1.29 on bare metal Hetzner servers. There was no managed control plane with an 'upgrade' button to click, so we had to do it manually. Notes on what actually happened.

24.09.2025

Hetzner Network Issues and Why We Keep Backups Elsewhere

Hetzner's network had problems in their Falkenstein datacenter. Our services stayed up because we split workloads across regions and keep critical data in Azure.

17.09.2025

Kubernetes StatefulSet: A Deep Dive

Understanding StatefulSet internals, ordered pod management, persistent storage, and real-world use cases for stateful applications in Kubernetes.

09.09.2025

Migration Diary Part 2: Moving Logs from Grafana Cloud to Kubernetes

Setting up Loki and Alloy for log aggregation in our Kubernetes cluster. Learning what all those Loki components actually do.

07.09.2025

Migration Diary Part 1: Moving Metrics from Grafana Cloud to Kubernetes

Moving our monitoring from Grafana Cloud to self-hosted Prometheus and Grafana on Kubernetes. Turns out most apps already had metrics support, just needed to enable it.

05.09.2025

Creating a Least-Privilege Monitoring User in Zalando Postgres Operator

How I solved the challenge of creating a monitoring-only user with minimal permissions in a GitOps-managed Postgres cluster.

03.09.2025

Zero-Downtime Helm App Upgrade in Production

How to upgrade a Helm-managed application in production with zero downtime using GitOps and the Kubernetes RollingUpdate strategy.

29.08.2025

Why I'm Obsessed with Uptime: The Real Cost of Downtime

My journey into understanding why every millisecond matters in DevOps, and what the research taught me about building reliable systems.

24.08.2025

Managing Secrets in Kubernetes with External Secrets Operator

A comprehensive guide to implementing External Secrets Operator for secure secret management in Kubernetes clusters.