Fix Kubernetes OOMKilled Error 137 in Java Spring Boot Apps

Seeing your Java Spring Boot pod stuck in a CrashLoopBackOff with a Kubernetes OOMKilled error is frustrating. This issue, often signaled by Exit Code 137, occurs when the container exceeds its allocated memory limit. Usually, it happens because the JVM does not realize it is restricted by a cgroup limit, causing it to grab more RAM than the cluster allows. You can stop these crashes by synchronizing your JVM heap settings with your Kubernetes resource limits using modern container-aware flags.

TL;DR — To resolve Exit Code 137, set resources.limits.memory in your Kubernetes manifest and add the JVM flag -XX:MaxRAMPercentage=75.0. This leaves 25% of the container's memory for non-heap overhead such as Metaspace, thread stacks, and native buffers.

Identifying the Symptoms of OOMKilled 137

💡 Analogy: Imagine a weightlifter (the JVM) practicing in a room with a very low ceiling (the Kubernetes Container Limit). If the weightlifter tries to lift a barbell too high, they hit the ceiling and the gym manager (the Linux Out-of-Memory Killer) immediately kicks them out of the building.

The first step in resolving this is confirming that the crash is indeed a memory issue. When a pod crashes due to resource exhaustion, Kubernetes provides a specific status. You can find this by running kubectl describe pod [POD_NAME]. Look for the Last State section in the output. If you see Reason: OOMKilled and Exit Code: 137, the Linux kernel terminated your process because it requested more memory than the cgroup limit permitted.
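These checks can be scripted; a minimal sketch (the pod name is hypothetical — substitute your own):

```shell
POD=spring-boot-app-7d4f8b9c5-x2k9q

# Show the crash context, including the Last State block:
kubectl describe pod "$POD" | grep -A 5 "Last State"

# Or pull just the termination reason and exit code via JSONPath:
kubectl get pod "$POD" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'
```

If the second command prints OOMKilled and 137, you are looking at a kernel-level memory kill, not an application crash.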

Unlike a standard java.lang.OutOfMemoryError which is handled inside the JVM and usually results in an Exit Code 1, Exit Code 137 is an external "SIGKILL." The operating system forcibly stops the process. This means your application won't have time to write a final log or a heap dump unless you have configured specific sidecars or volume mounts to capture them. You might notice the application running smoothly for hours, then suddenly disappearing during a traffic spike or a heavy batch processing task.

Why Java Apps Trigger OOMKilled Errors

JVM Container Unawareness

In older versions of Java (specifically Java 8 prior to update 191), the JVM was not "container aware." At startup, it looks at the host's total physical memory to calculate its default heap size, typically one quarter of physical RAM. In a Kubernetes environment, the host might have 64GB of RAM while your pod has only a 2GB limit: the JVM sees 64GB and sets a default max heap of around 16GB, far beyond the container's ceiling. When the heap expands during runtime, it crosses the 2GB limit, and the kernel kills the pod.
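You can check what the JVM actually detects from inside the container. A sketch, assuming a Linux JDK 11+ image with a shell available:

```shell
# Prints "Operating System Metrics" on recent Linux JDKs, including the
# detected memory limit — it should match the pod limit, not host RAM:
java -XshowSettings:system -version

# More verbose cgroup-detection trace (JDK 11+ unified logging):
java -Xlog:os+container=trace -version
```

If the reported memory limit equals the host's total RAM, container support is not active and default heap sizing will be wrong.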

Non-Heap Memory Consumption

A common mistake is setting the JVM heap (-Xmx) to be exactly equal to the Kubernetes memory limit. For example, setting -Xmx2G and resources.limits.memory: 2Gi is a recipe for disaster. A Java process requires more than just heap memory. It needs RAM for the Metaspace (class metadata), Code Cache (JIT compiled code), Thread Stacks (usually 1MB per thread), and Direct Buffers (used for NIO). If you allocate 100% of your limit to the heap, these other areas will inevitably push the total process size over the limit, triggering the Kubernetes OOMKilled signal.
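As a rough budget for a hypothetical 2Gi limit (the split is illustrative, not exact), the arithmetic looks like this:

```shell
LIMIT_MIB=2048                          # resources.limits.memory: 2Gi
HEAP_MIB=$(( LIMIT_MIB * 75 / 100 ))    # -XX:MaxRAMPercentage=75.0
OVERHEAD_MIB=$(( LIMIT_MIB - HEAP_MIB ))
echo "heap=${HEAP_MIB}MiB overhead=${OVERHEAD_MIB}MiB"
# heap=1536MiB overhead=512MiB
```

The remaining 512MiB must cover Metaspace, thread stacks (roughly 1MiB per thread), the code cache, and direct buffers — which is why allocating 100% of the limit to the heap cannot work.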

Native Memory Leaks

Sometimes, the issue isn't misconfiguration but a leak in native code. If you are using libraries that rely heavily on JNI (Java Native Interface) or frameworks like Netty for high-performance networking, memory might be allocated outside the garbage-collected heap. If this memory isn't released properly, the resident set size (RSS) of the container will grow until the system intervenes with an Exit Code 137. This is harder to debug because standard Java profilers may not show the growth in the native memory region.
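Native growth can be measured with the JDK's Native Memory Tracking (NMT). A sketch, assuming you can restart the pod with the extra flag and that the app runs as PID 1 (common in containers):

```shell
# 1. Start the JVM with NMT enabled (adds a small runtime overhead):
java -XX:NativeMemoryTracking=summary -jar app.jar &

# 2. Snapshot native usage of the running process:
jcmd 1 VM.native_memory summary

# 3. Take a baseline, wait for the leak to progress, then diff to see
#    which category (Thread, Internal, Class, ...) is growing:
jcmd 1 VM.native_memory baseline
jcmd 1 VM.native_memory summary.diff
```

Note that NMT only tracks the JVM's own native allocations; memory allocated directly by JNI libraries may still require OS-level tools such as pmap to pin down.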

How to Fix Exit Code 137 in Spring Boot

Use Modern JVM Container Flags

If you are using Java 11, 17, or 21, stop using fixed heap sizes like -Xmx. Instead, use percentage-based sizing. This makes your deployment more portable. By using -XX:MaxRAMPercentage, the JVM calculates the heap size based on the container limit rather than the host's physical RAM. For Spring Boot applications, a value between 70.0 and 80.0 is usually optimal.

# Example of a container-aware JVM configuration for Kubernetes.
# Setting the initial percentage equal to the max commits the full
# heap at startup, avoiding resize-related spikes later.
java -XX:MaxRAMPercentage=75.0 -XX:InitialRAMPercentage=75.0 -jar app.jar

Configure Kubernetes Resource Limits Correctly

You must explicitly define requests and limits in your Deployment YAML. For CPU, a limit above the request allows for bursts. For memory, it is best practice to set the request equal to the limit: pods using more memory than they requested are the first candidates for eviction under node memory pressure, so keeping the two equal makes scheduling and eviction behavior predictable. (Kubernetes never moves a running pod to another node; an overcommitted node evicts the pod and the scheduler places a replacement.)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-boot-app
spec:
  selector:
    matchLabels:
      app: spring-boot-app
  template:
    metadata:
      labels:
        app: spring-boot-app
    spec:
      containers:
      - name: app
        image: my-spring-app:latest
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        env:
        # JAVA_OPTS is only honored if the image's entrypoint expands it.
        - name: JAVA_OPTS
          value: "-XX:MaxRAMPercentage=75.0 -XX:ActiveProcessorCount=2"
⚠️ Common Mistake: Setting MaxRAMPercentage=100. This almost guarantees an OOMKilled error: once the heap fills to the limit, nothing is left for Metaspace, thread stacks, or the code cache, and the next native allocation pushes the process over. Keep the heap at or below roughly 80% of the limit.

Optimize Garbage Collection for Containers

The Garbage Collector (GC) choice can impact memory spikes. For containers with less than 4GB of RAM, the Serial GC is often more memory-efficient than the default G1GC. While it has longer pause times, it uses significantly less native memory overhead. If your pod is small and you are struggling with Exit Code 137, try adding -XX:+UseSerialGC to your startup arguments.
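You can confirm which collector the JVM picked inside the container — on very small pods, JVM ergonomics may already select Serial GC on its own:

```shell
# Print the ergonomically selected GC flags; the one marked "true"
# is the active collector for this container's CPU/memory limits:
java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -E 'Use(Serial|G1|Parallel|Z)GC'
```

If UseSerialGC shows false on a small pod, add -XX:+UseSerialGC explicitly and re-check.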

Verifying the Memory Fix

After applying the changes, verify that the JVM is respecting the container limit by executing a shell inside the running pod: kubectl exec -it [POD_NAME] -- java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep MaxHeapSize. Note that this launches a fresh JVM, so pass the same flags your application uses; otherwise you will see the default heap calculation rather than your configured one. The output shows the exact byte value the JVM calculated for its maximum heap; it should be roughly 75% of your Kubernetes memory limit.
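A sketch of this verification (pod name hypothetical; the probe JVM must be given the same flags as the app so the calculation matches):

```shell
POD=spring-boot-app-7d4f8b9c5-x2k9q

# Throwaway JVM with the same sizing flag as the application:
kubectl exec -it "$POD" -- sh -c 'java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep -w MaxHeapSize'

# Or query the *running* application JVM directly (PID 1 in most images):
kubectl exec -it "$POD" -- jcmd 1 VM.flags
```

The jcmd variant avoids the probe-JVM mismatch entirely, since it reads the flags of the live process.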

Additionally, monitor the pod's memory usage in real-time using kubectl top pod [POD_NAME]. In a healthy Spring Boot application, the memory usage should climb during startup, plateau during steady-state operation, and drop slightly after Garbage Collection cycles. If you see the memory usage constantly creeping up toward the limit without ever dropping, you likely have a memory leak rather than a configuration issue. For deeper insights, you can integrate Micrometer with Prometheus to visualize heap vs. non-heap usage in a Grafana dashboard.

Preventing Future Outages

To prevent Kubernetes OOMKilled issues from reaching production, run the Vertical Pod Autoscaler (VPA) in recommendation-only mode (updateMode: "Off"). VPA watches your application's actual memory consumption over time and suggests resource values, removing the guesswork from capacity planning. Additionally, set terminationMessagePolicy: FallbackToLogsOnError in your container spec so Kubernetes captures the last lines of the application log when the container dies, which can reveal whether the OOM kill was preceded by a specific error.

📌 Key Takeaways
  • Exit Code 137 means the OS killed the process for exceeding container memory limits.
  • Java 11+ users should use -XX:MaxRAMPercentage=75.0 instead of -Xmx.
  • Leave a 20-25% "safety buffer" between your Heap size and the K8s memory limit.
  • Use kubectl describe pod to confirm the OOMKilled status.
  • Consider UseSerialGC for small pods (under 4GB RAM) to reduce overhead.

Frequently Asked Questions

Q. What is the difference between Exit Code 137 and 143?

A. Exit Code 137 is a SIGKILL (forceful termination by the OS, often due to OOM). Exit Code 143 is a SIGTERM (graceful termination request). If you see 143, Kubernetes is likely stopping your pod during a deployment or scaling event, not because of an error.

Q. Why does my Java app use more memory than the -Xmx value?

A. The -Xmx flag only limits the Heap. The total memory (RSS) includes Metaspace, stack memory (1MB per thread), Code Cache, and native buffers used by the OS and libraries. Total usage is usually 20-30% higher than the heap.

Q. Can I use -Xmx and MaxRAMPercentage together?

A. No, you should choose one. If both are provided, -Xmx usually takes precedence, which defeats the purpose of container-aware percentage sizing. For Kubernetes, MaxRAMPercentage is the recommended modern approach.

For further reading on memory management, see the official Oracle GC Tuning Guide.
