CRITICAL | Kubernetes | Container Orchestration

Pods stuck in CrashLoopBackOff after deployment

Tags: kubernetes, pods, crashloopbackoff, oomkilled, debugging
Symptoms
  • Pod status shows CrashLoopBackOff in kubectl get pods
  • Restart counter keeps incrementing every few seconds
  • Application endpoints return 502 from the ingress
  • Readiness probe never succeeds
Root Cause
  • Container process exits with a non-zero status code shortly after start
  • Typical causes: missing environment variables, bad config map mount path, failing database connection, OOMKilled due to low memory limits
  • The kubelet backs off exponentially between restarts (10s, 20s, 40s, ... capped at 5 minutes), which is the state reported as CrashLoopBackOff
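A quick way to tell these causes apart is the last termination state the kubelet records for the container (a sketch; `<pod>` is a placeholder for your pod name, and the first container is assumed):

```shell
# Exit code and reason of the previous container run.
# 137 with reason "OOMKilled" points at the memory limit;
# other non-zero codes point at an application-level failure.
kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```

Exit code 137 is 128 + 9, i.e. the container was killed with SIGKILL, which is what the kernel OOM killer sends.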
Diagnosis
  • Inspect pod events: kubectl describe pod <pod>
  • Stream the previous container logs: kubectl logs <pod> --previous
  • Check events for OOM kills and failed probes: kubectl get events --field-selector involvedObject.name=<pod>
  • Validate the image runs locally: docker run --rm <image>
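The in-cluster diagnosis steps above can be wrapped into a small one-shot triage script (a sketch, assuming `kubectl` is on PATH; the script name, pod, and namespace arguments are placeholders):

```shell
#!/usr/bin/env bash
# crashloop-triage.sh — collect the evidence listed above in one pass.
# Usage: ./crashloop-triage.sh <pod> [namespace]
set -euo pipefail
POD="$1"
NS="${2:-default}"

# Recent pod events (probe failures, image pull errors, OOM kills).
kubectl describe pod "$POD" -n "$NS" | tail -n 20

# Logs from the crashed (previous) container; may be absent on first run.
kubectl logs "$POD" -n "$NS" --previous --tail=50 || true

# All events for this pod, oldest first.
kubectl get events -n "$NS" \
  --field-selector "involvedObject.name=$POD" \
  --sort-by=.lastTimestamp
```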
Fix
  • Fix the underlying error surfaced in `kubectl logs --previous`
  • Increase memory limits if the pod was OOMKilled:

      resources:
        requests:
          memory: 256Mi
          cpu: 100m
        limits:
          memory: 512Mi
          cpu: 500m
  • Restart the rollout once fixed: `kubectl rollout restart deployment/<name>`
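Instead of editing the manifest by hand, the request/limit values from the snippet above can also be applied in place (a sketch; `deployment/<name>` is a placeholder, and `kubectl set resources` itself triggers a rolling update):

```shell
# Apply the requests and limits shown above directly to the deployment.
kubectl set resources deployment/<name> \
  --requests=cpu=100m,memory=256Mi \
  --limits=cpu=500m,memory=512Mi

# Wait for the rollout to converge before declaring the fix done.
kubectl rollout status deployment/<name> --timeout=120s
```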
Prevention
  • Add liveness and readiness probes that reflect real application health
  • Run `kubectl apply --dry-run=server` in CI to catch schema errors
  • Ship structured logs and alert when the restart count exceeds 3 within 5 minutes
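The probe bullet above might look like this in a container spec (a sketch; the `/healthz` and `/ready` paths and port 8080 are assumptions about the application, not values from this runbook):

```yaml
# Assumed endpoints: /healthz checks the process is alive,
# /ready checks its dependencies (e.g. the database) are reachable.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10   # give the app time to boot before the first check
  periodSeconds: 10
  failureThreshold: 3       # 3 consecutive failures trigger a restart
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5          # failing readiness removes the pod from the Service
```

Keeping the two probes distinct matters here: a readiness failure only stops traffic, while a liveness failure restarts the container and can itself cause a crash loop if the probe is stricter than the app's real health.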