CPU Limits vs Memory Limits: When 'Survival' Means Different Things

In my previous post, I said: “CPU limits are about performance. Memory limits are about survival.” I stand by that statement, but I oversimplified what “survival” actually means depending on what kind of application you’re running. Let me break this down.

Stateless Applications: The “Easy” Case

When I wrote about memory limits being “about survival,” I was thinking primarily of stateless services. You know, the typical microservices: APIs, web servers, queue consumers. ...

March 19, 2026 · awbuana

The Resource Request You Think Is Saving Money Is Actually Breaking Your App

I thought I was being clever. When we migrated our services to Google Kubernetes Engine with its autoscaling profile set to optimize utilization, I looked at our resource specs and saw an opportunity. Our pods were requesting 100m CPU but had limits set to 1000m. Ten times the headroom! Surely we could tighten that up and save some money. So I did what seemed logical: I kept the limits high (just in case of traffic spikes) but dropped the requests even lower. 50m here, 25m there. The cluster was happy. Our costs went down. I patted myself on the back for being such a savvy engineer. ...

March 19, 2026 · awbuana

Why Your Pod Died (OOMKilled): The Difference Between CPU and Memory Limits

I used to think CPU and memory were the same. Not literally, of course. I knew one was for processing and one was for… well, memory. But when it came to Kubernetes resource limits, I treated them identically. Set a request, set a limit, let the scheduler do its thing. If the app needs more, it uses more, right? Wrong. Very, very wrong. And I learned this lesson at 2 AM on a Tuesday, when our primary API service went from “healthy” to CrashLoopBackOff in about 30 seconds. No warning. No graceful degradation. Just… dead. Then alive. Then dead again. ...

March 19, 2026 · awbuana