Troubleshooting Commands
Common commands for troubleshooting clusters and tools (e.g., Kustomize).
Debugging Basics
When things go wrong in Kubernetes, your first step is usually to inspect the state of your Pods and Events. The kubectl command line tool provides several subcommands for this purpose.
get: Lists resources to check their status (e.g., is a Pod Running or in CrashLoopBackOff?).
describe: Shows detailed information and recent events (useful for finding scheduling errors or image pull errors).
logs: Prints the standard output/error of the container.
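As a small illustration of the get step, the sketch below narrows a Pod listing down to the Pods that need attention. The function name unhealthy_pods is hypothetical (it is not part of kubectl), and the sketch assumes the default column layout of kubectl get pods -n <namespace>: NAME, READY, STATUS, RESTARTS, AGE.

```shell
# unhealthy_pods is a hypothetical helper, not part of kubectl.
# Reads `kubectl get pods -n <namespace>` output on stdin and prints the
# name and status of every Pod that is neither Running nor Completed.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (assumes a working kubectl context):
#   kubectl get pods -n <namespace> | unhealthy_pods
```

Each Pod it prints is a natural candidate for describe and logs.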
Official Kubernetes Troubleshooting Guide
Cluster Health
# Cluster Info
kubectl cluster-info
# Check API Health
curl -k "https://<CPL-IP>:6443/livez?verbose"
# Check Kubelet Health (from node)
curl http://127.0.0.1:10248/healthz
Cleanup Commands
Dangerous commands for cleaning up resources. Use with caution!
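Before running a destructive command, it can help to preview what would be deleted. The sketch below prints the delete commands instead of running them. The function name preview_pod_cleanup is hypothetical, and it assumes the default columns of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). It matches the STATUS column exactly, so it is narrower than a substring search: statuses such as Init:Error or CreateContainerConfigError are not listed.

```shell
# preview_pod_cleanup is a hypothetical helper, not part of kubectl.
# Reads `kubectl get pods --all-namespaces` output on stdin, matches the
# STATUS column (4th field) exactly, and prints the delete commands
# instead of executing them.
preview_pod_cleanup() {
  awk 'NR > 1 && $4 ~ /^(Evicted|Error|ContainerStatusUnknown)$/ {
    printf "kubectl delete pod -n %s %s\n", $1, $2
  }'
}

# Usage (assumes a working kubectl context):
#   kubectl get pods --all-namespaces | preview_pod_cleanup
```

Once the printed list looks right, the same output can be reviewed and run by hand.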
Delete Evicted/Error Pods
# Delete all pods whose listing line contains Evicted, Error, or ContainerStatusUnknown
kubectl get pods --all-namespaces | grep -E "Evicted|Error|ContainerStatusUnknown" | awk '{print $1, $2}' | xargs -r -n2 sh -c 'kubectl delete pod -n "$0" "$1"'
Delete Failed Jobs
# Deletes Jobs in the current namespace whose status.failed count is exactly 1
kubectl get jobs.batch -o jsonpath='{range .items[?(@.status.failed==1)]}{.metadata.name}{"\n"}{end}' | xargs -r kubectl delete job
Overview Commands
List All Container Images
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec['initContainers', 'containers'][*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
Get Pod Priorities
kubectl get pods -A -o custom-columns="NAME:.metadata.name,NAMESPACE:.metadata.namespace,PRIORITY:.spec.priorityClassName"
Connectivity Check (Curling)
Bash one-liner that prints a timestamp and the HTTP status code once per second, followed by FAIL when curl cannot complete the request:
while true; do echo -n "$(date '+%Y-%m-%d %H:%M:%S') - "; curl -s -o /dev/null -w "%{http_code}\n" --connect-timeout 1 --max-time 2 https://example.com || echo "FAIL"; sleep 1; done
# Get pod status
kubectl get pods -n <namespace>
# Describe pod to see events and errors
kubectl describe pod <pod-name> -n <namespace>
# View logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous # Check logs of crashed container
Networking
# Check service endpoints
kubectl get endpoints -n <namespace>
# Port forward for local access
kubectl port-forward <pod-name> 8080:80 -n <namespace>
Helm
# List releases
helm list -A
# Get values of a release
helm get values <release-name> -n <namespace>
# Dry-run install/upgrade to see rendered manifests
helm upgrade --install <release-name> <chart-path> -f values.yaml --dry-run
Kustomize
# Build manifests locally to verify output
kubectl kustomize <path-to-kustomization-directory>
# Apply with prune (be careful!)
kubectl apply -k <path-to-kustomization-directory> --prune --all
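Because --prune can delete resources, one cautious pattern is to check for pending changes with kubectl diff before applying. The sketch below wraps that check. The function name kustomize_has_changes is hypothetical, and it relies on the documented kubectl diff exit codes (0 for no differences, 1 for differences, greater than 1 for an error). The diff covers objects in the rendered manifests and should not be relied on to list objects that pruning would remove.

```shell
# kustomize_has_changes is a hypothetical helper, not part of kubectl.
# Runs `kubectl diff -k <dir>` and interprets its exit code:
#   0 = no differences, 1 = differences found, >1 = error.
# Returns 0 when changes are pending, 1 when there are none, 2 on error.
kustomize_has_changes() {
  dir="$1"
  kubectl diff -k "$dir"
  case $? in
    0) echo "no changes"; return 1 ;;
    1) echo "changes pending"; return 0 ;;
    *) echo "diff failed" >&2; return 2 ;;
  esac
}

# Usage (assumes a working kubectl context):
#   kustomize_has_changes <path-to-kustomization-directory> && echo "review the diff before applying"
```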