k8s PODs rescheduling in case of failure of nodes

Wednesday, May 27, 2026

Question

Consider a deployment with 10 pods spread across 5 nodes. Assume the spread is even, so each node runs two pods. Now suppose two nodes suddenly go down (maybe their network switch has failed). How will k8s recreate those 4 pods on the remaining 3 nodes? What logic will it use to distribute 4 pods across 3 nodes?

Scenario Summary

Desired replicas: 10 pods
Nodes: 5 nodes (2 pods per node)
2 nodes suddenly go down → 4 pods are lost
Remaining: 3 nodes
Kubernetes needs to create 4 new pods on the remaining 3 nodes

How Kubernetes Handles This

Kubernetes does not try to keep the exact previous distribution. Instead, it follows this process:

1. Detection Phase

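The Node Controller marks the 2 nodes NotReady once their heartbeats have been missing for node-monitor-grace-period (default 40s; newer releases may use a slightly longer default) and taints them with node.kubernetes.io/unreachable or node.kubernetes.io/not-ready.
Pods on those nodes are not evicted immediately. By default each pod tolerates these taints for 300 seconds (tolerationSeconds). After that it is marked for deletion and shows as Terminating. It may stay in that state until the node returns or is removed, because the kubelet cannot confirm the shutdown.
The ReplicaSet controller does not count pods that are being deleted toward the desired 10 replicas, so at that point it creates 4 new Pods.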
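How long pods stay bound to an unreachable node before they are evicted is controlled by per-pod tolerations. The fragment below is a minimal sketch, assuming the default wait of 300 seconds is too slow for this workload. The 60-second value is an illustrative assumption, not a recommendation.

```yaml
# Fragment of a Deployment's pod template (goes under spec.template.spec).
# tolerationSeconds: 60 is an assumed value chosen for illustration.
tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 60    # evict 60s after the node reports NotReady
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 60    # evict 60s after the node becomes unreachable
```

Shorter values bring replacements up sooner, but they also cause more churn during brief network blips, so the value is a trade-off.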

2. Scheduling Logic for the 4 New Pods

The kube-scheduler places each new pod in two phases: it first filters out nodes that cannot run the pod, then scores the remaining nodes and picks the highest-scoring one.

1. Feasibility (Can the pod be scheduled here?)

Enough CPU/Memory requests available?
Node is Ready?
No taints that the pod does not tolerate?

2. Scoring (Which node is best?)

Least Requested (prefers nodes with more free resources)
Balanced Resource Allocation
Node Affinity / Anti-Affinity (if defined)
Topology Spread Constraints (very important here)
Inter-Pod Affinity

Default behavior (without Topology Spread):

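When a workload defines no constraints of its own, the scheduler applies built-in soft spreading defaults, so it tends to spread pods of the same ReplicaSet across nodes, but the result is not strictly even.
Likely distribution of the 4 new pods on 3 nodes: 2 + 1 + 1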

Final state example:

Original remaining pods: 6 (2 per remaining node)
New pods: 4

Final distribution: probably 4, 3, 3 (the most even split of 10 pods over 3 nodes) or 4, 4, 2

Best Way to Control Distribution: Use Topology Spread Constraints

Here’s the recommended way to make Kubernetes behave more predictably:

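apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 10
  selector:                                 # required; must match the template labels
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                          # Very important: at most 1 pod difference between nodes
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway   # soft preference, not a hard rule
        labelSelector:
          matchLabels:
            app: my-app
      # Optional: Also spread across zones if you have them
      - maxSkew: 2
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:                      # needed so the constraint counts this app's pods
          matchLabels:
            app: my-app
      containers:
      - name: my-app
        image: my-app:latest                # placeholder image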

With maxSkew: 1 in the above case:

After rescheduling the 4 pods, Kubernetes will favor the most even distribution possible: 4, 3, 3.
Because whenUnsatisfiable is ScheduleAnyway, this is a scoring preference rather than a guarantee.
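If the even spread has to be enforced rather than preferred, the same constraint can be made hard. This is a minimal sketch that reuses the app: my-app label from the Deployment above. Whether a hard constraint is appropriate depends on the workload.

```yaml
# Fragment of the pod template (goes under spec.template.spec).
# Same constraint as above, but enforced during filtering instead of scoring.
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: DoNotSchedule   # pod stays Pending rather than exceed maxSkew
  labelSelector:
    matchLabels:
      app: my-app
```

With DoNotSchedule, a pod that would push the skew above maxSkew stays Pending instead of landing on a crowded node, so this trades availability for evenness. Failed nodes that are still registered in the cluster may continue to count as empty topology domains, which could keep replacements Pending. This is worth testing on your cluster version before relying on it.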

Summary of Logic Kubernetes Uses

Factor	Influence on Distribution
Resource Requests	Strong (avoids overloading nodes)
Topology Spread Constraints	Strongest control for even spread
Least Requested Priority	Prefers less loaded nodes
Node Affinity	Can bias toward specific nodes
Default Scheduler	Good spread, but not perfect

Practical Tips

Define topologySpreadConstraints for production workloads, typically with maxSkew: 1.
Set reasonable resources.requests so nodes don’t get overloaded.
You can monitor rescheduling with:

kubectl get pods -o wide -w

kubectl get events --sort-by=.lastTimestamp
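For the resources.requests tip above, a minimal sketch of what the container section might look like. The CPU and memory figures are illustrative assumptions and should be replaced with values measured for the actual application.

```yaml
# Fragment of the pod template (goes under spec.template.spec).
# The request and limit values are assumed for illustration only.
containers:
- name: my-app
  image: my-app:latest        # placeholder image
  resources:
    requests:
      cpu: 250m               # the scheduler reserves this much CPU on the node
      memory: 256Mi           # the scheduler reserves this much memory on the node
    limits:
      memory: 512Mi           # container is killed if it exceeds this
```

The scheduler filters and scores nodes based on requests, not actual usage, so pods without requests can be packed onto a node that is already busy.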

Next Question

What happens if, because of certain rules, k8s cannot place those 4 pods at all?

This scenario matters in practice. The ReplicaSet still creates the 4 replacement Pod objects, but the scheduler cannot bind them to any node.

What Happens When Scheduling Fails?

When Kubernetes cannot find suitable nodes for the pending pods, the following occurs:

Pods enter Pending state
- The ReplicaSet still wants 10 pods, but only 6 are running.
- The 4 new pods stay in Pending status indefinitely (until the situation changes).
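No automatic "force scheduling"
- By default, Kubernetes will not evict existing pods to make space. Preemption only happens when the pending pods have a higher priority (PriorityClass) than pods already running.
- It will not violate hard constraints (like required node affinity, untolerated taints, or Topology Spread with whenUnsatisfiable: DoNotSchedule).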
The cluster remains in an under-capacity state
- You will have only 6 running pods instead of 10.
- Your application will run with reduced capacity.
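The "no force scheduling" behavior above holds when all pods share the same priority, which is the default. If the pending pods carry a higher priority, the scheduler may preempt lower-priority pods to make room. This is a minimal sketch, assuming a hypothetical PriorityClass named critical-app with an illustrative value. Neither appears in the original scenario.

```yaml
# Hypothetical PriorityClass; the name and value are illustrative assumptions.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-app
value: 100000                 # higher number means higher priority
globalDefault: false          # only pods that name this class get it
description: "Pods that may preempt lower-priority pods when unschedulable."
---
# Fragment of the Deployment's pod template (goes under spec.template.spec).
priorityClassName: critical-app
```

Preemption only helps when lower-priority pods occupy the capacity. It does not override hard constraints such as required node affinity or untolerated taints.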

RS Chandras Tech Blog | AI, ML, Agentic AI
