Question
Consider a deployment where I have deployed, say, 10 pods across 5 nodes. Assume the distribution is even, so each node gets two pods. Now suppose two nodes suddenly go down (maybe their network switch has failed). How will k8s create those 4 pods on the remaining 3 nodes? What logic will it use to distribute 4 pods across 3 nodes?
Scenario Summary
- Desired replicas: 10 pods
- Nodes: 5 nodes (2 pods per node)
- 2 nodes suddenly go down → 4 pods are lost
- Remaining: 3 nodes
- Kubernetes needs to create 4 new pods on the remaining 3 nodes
How Kubernetes Handles This
Kubernetes does not try to keep the exact previous distribution. Instead, it follows this process:
1. Detection Phase
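- The Node Controller marks the 2 nodes as NotReady once they have missed heartbeats for node-monitor-grace-period (default 40s; recent releases raise it to 50s).
- It then taints those nodes as unreachable. Pods on the failed nodes may show as Unknown, and after the default 300s toleration expires they are marked for deletion and show as Terminating.
- The ReplicaSet controller notices that the number of active pods is below the desired replicas (10), so it creates 4 new Pods.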
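The ReplicaSet step can be pictured as a simple count. The sketch below is a simplified model, not the controller's actual code: the `Pod` class and `pods_to_create` function are hypothetical names, and it assumes the controller counts every pod that has not yet been marked for deletion.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    node: str
    terminating: bool = False  # True once the pod has been marked for deletion


def pods_to_create(desired: int, pods: list) -> int:
    """Return how many replacement pods are needed to reach `desired`."""
    active = [p for p in pods if not p.terminating]
    return max(0, desired - len(active))


# 10 pods spread evenly over node-1 .. node-5 (two per node).
pods = [Pod(f"web-{i}", f"node-{i // 2 + 1}") for i in range(10)]

# Two nodes fail; their pods are eventually marked for deletion.
failed_nodes = {"node-4", "node-5"}
for pod in pods:
    if pod.node in failed_nodes:
        pod.terminating = True

missing = pods_to_create(10, pods)
print(missing)  # 4
```

The point of the model is that the controller only cares about the count; it has no say in where the replacements land. That part belongs to the scheduler.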
2. Scheduling Logic for the 4 New Pods
The kube-scheduler decides where to place each new pod in two phases, filtering first and scoring second:
1. Feasibility (Can the pod be scheduled here?)
- Enough CPU/Memory requests available?
- Node is Ready?
- No conflicting taints without tolerations?
2. Scoring (Which node is best?)
- Least Requested (prefers nodes with more free resources)
- Balanced Resource Allocation
- Node Affinity / Anti-Affinity (if defined)
- Topology Spread Constraints (very important here)
- Inter-Pod Affinity
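The two phases above can be sketched as a filter followed by a score. This is an illustrative model only: `Node`, `PodSpec`, `feasible`, `score`, and `pick_node` are hypothetical names, only CPU is considered, and the score is a least-requested style heuristic rather than the scheduler's real weighted sum of plugins.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    ready: bool
    cpu_capacity_m: int   # allocatable CPU in millicores
    cpu_requested_m: int  # sum of CPU requests of pods already on the node
    taints: set = field(default_factory=set)


@dataclass
class PodSpec:
    cpu_request_m: int
    tolerations: set = field(default_factory=set)


def feasible(node: Node, pod: PodSpec) -> bool:
    """Filter phase: the node must be Ready, have room, and carry no untolerated taint."""
    fits = node.cpu_requested_m + pod.cpu_request_m <= node.cpu_capacity_m
    tolerated = node.taints <= pod.tolerations
    return node.ready and fits and tolerated


def score(node: Node, pod: PodSpec) -> float:
    """Score phase: fraction of CPU still free after placing the pod."""
    used = node.cpu_requested_m + pod.cpu_request_m
    return (node.cpu_capacity_m - used) / node.cpu_capacity_m


def pick_node(nodes: list, pod: PodSpec):
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None  # no feasible node: the pod would stay Pending
    return max(candidates, key=lambda n: score(n, pod))


nodes = [
    Node("node-1", ready=True, cpu_capacity_m=4000, cpu_requested_m=1000),
    Node("node-2", ready=True, cpu_capacity_m=4000, cpu_requested_m=3000),
    Node("node-3", ready=True, cpu_capacity_m=4000, cpu_requested_m=500,
         taints={"dedicated=batch:NoSchedule"}),
    Node("node-4", ready=False, cpu_capacity_m=4000, cpu_requested_m=0),
]
pod = PodSpec(cpu_request_m=500)

print(pick_node(nodes, pod).name)  # node-1
```

node-3 and node-4 are removed by the filter (untolerated taint, not Ready), so only node-1 and node-2 are scored, and node-1 wins because it has more free CPU.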
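Default behavior (without explicit Topology Spread Constraints):
- The scheduler still prefers to spread pods of the same workload across nodes, but this is only a soft scoring preference, so the result is not guaranteed to be even.
- Likely distribution of the 4 new pods on 3 nodes: 2 + 1 + 1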
Final state example:
- Original remaining pods: 6 (2 per remaining node)
- New pods: 4
- Final distribution: most likely 4, 3, 3; a less even result such as 4, 4, 2 is possible when the nodes differ in free resources
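To see where both outcomes come from, here is a rough simulation. It assumes spreading is the only scoring signal (the node with the fewest pods wins, ties broken by name) and gives each node a hypothetical pod capacity; `place` is an illustrative helper, not a Kubernetes API.

```python
def place(counts: dict, new_pods: int, capacity: dict) -> dict:
    """Place pods one at a time on the least-loaded node that still has room."""
    counts = dict(counts)  # work on a copy
    for _ in range(new_pods):
        candidates = [n for n in counts if counts[n] < capacity[n]]
        if not candidates:
            break  # nothing fits; remaining pods would stay Pending
        target = min(candidates, key=lambda n: (counts[n], n))
        counts[target] += 1
    return counts


surviving = {"node-1": 2, "node-2": 2, "node-3": 2}

# Plenty of room everywhere: the spread preference decides.
roomy = place(surviving, 4, {"node-1": 10, "node-2": 10, "node-3": 10})
print(roomy)  # {'node-1': 4, 'node-2': 3, 'node-3': 3}

# node-3 is already full: the new pods pile onto the other two nodes.
tight = place(surviving, 4, {"node-1": 10, "node-2": 10, "node-3": 2})
print(tight)  # {'node-1': 4, 'node-2': 4, 'node-3': 2}
```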
Best Way to Control Distribution: Use Topology Spread Constraints
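The recommended way to make Kubernetes behave more predictably is to add a topologySpreadConstraints entry to the Deployment's pod template, with topologyKey: kubernetes.io/hostname, a labelSelector that matches the Deployment's pods, and whenUnsatisfiable set to either ScheduleAnyway (soft) or DoNotSchedule (hard).
With maxSkew: 1 in this scenario:
- After rescheduling the 4 pods, Kubernetes aims for the most even possible distribution → 4, 3, 3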
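The maxSkew rule can be checked by hand. The sketch below models a hard constraint (whenUnsatisfiable: DoNotSchedule) with each node as its own topology domain: a node is allowed only if its pod count after placement exceeds the current minimum by at most maxSkew. `allowed_nodes` is a hypothetical helper, and picking the first allowed node is an arbitrary tie-break.

```python
def allowed_nodes(counts: dict, max_skew: int = 1) -> list:
    """Nodes where one more pod keeps (count + 1 - current minimum) <= max_skew."""
    lowest = min(counts.values())
    return [n for n, c in counts.items() if c + 1 - lowest <= max_skew]


counts = {"node-1": 2, "node-2": 2, "node-3": 2}
history = []
for _ in range(4):
    target = allowed_nodes(counts)[0]  # take the first allowed node
    counts[target] += 1
    history.append(dict(counts))

for step in history:
    print(step)
# {'node-1': 3, 'node-2': 2, 'node-3': 2}
# {'node-1': 3, 'node-2': 3, 'node-3': 2}
# {'node-1': 3, 'node-2': 3, 'node-3': 3}
# {'node-1': 4, 'node-2': 3, 'node-3': 3}
```

After the first placement, node-1 is temporarily off limits (4 - 2 = 2 would exceed the skew), which is what forces the 4, 3, 3 result regardless of tie-breaking.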
Summary of Logic Kubernetes Uses
| Factor | Influence on Distribution |
|---|---|
| Resource Requests | Strong (avoids overloading nodes) |
| Topology Spread Constraints | Strongest control for even spread |
| Least Requested Priority | Prefers less loaded nodes |
| Node Affinity | Can bias toward specific nodes |
| Default Scheduler | Good spread, but not perfect |
Practical Tips
- Always define topologySpreadConstraints in production, especially with maxSkew: 1.
- Set reasonable resources.requests so nodes don’t get overloaded.
- You can monitor rescheduling by running kubectl get pods -o wide --watch and checking which node each pod lands on.
Next Question
What happens if, due to certain rules, it is not possible for k8s to create those 4 pods at all?
This is a very important scenario. Here's exactly what happens when Kubernetes cannot schedule those 4 pods due to constraints.
What Happens When Scheduling Fails?
When Kubernetes cannot find suitable nodes for the pending pods, the following occurs:
- Pods enter Pending state
  - The ReplicaSet still wants 10 pods, but only 6 are running.
  - The 4 new pods stay in Pending status indefinitely (until the situation changes). The scheduler keeps retrying and records a FailedScheduling event with the reason.
- No automatic "force scheduling"
  - Kubernetes will not kill existing pods to make space, unless the pending pods have a higher priority and preemption applies.
  - It will not violate hard constraints (like required node affinity, taints, or very strict Topology Spread).
- The cluster remains in an under-capacity state
  - You will have only 6 running pods instead of 10.
  - Your application will run with reduced capacity.
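The "Pending until the situation changes" behavior can be modeled as repeated scheduling passes. This is a toy sketch: `schedule_round` is a hypothetical function, and "free slots" stands in for whatever combination of resources and constraints makes a node feasible.

```python
def schedule_round(pending: list, free_slots: dict):
    """One scheduling pass. Returns (placements, still_pending)."""
    free = dict(free_slots)  # work on a copy
    placements, still_pending = {}, []
    for pod in pending:
        target = next((n for n, slots in free.items() if slots > 0), None)
        if target is None:
            still_pending.append(pod)  # no feasible node: stays Pending
        else:
            placements[pod] = target
            free[target] -= 1
    return placements, still_pending


pending = ["web-a", "web-b", "web-c", "web-d"]

# The 3 surviving nodes have no room left: nothing is placed, nothing is evicted.
placed, pending = schedule_round(pending, {"node-1": 0, "node-2": 0, "node-3": 0})
print(len(placed), len(pending))  # 0 4

# The situation changes (a new node joins): the same pods now schedule.
placed, pending = schedule_round(
    pending, {"node-1": 0, "node-2": 0, "node-3": 0, "node-6": 4}
)
print(len(placed), len(pending))  # 4 0
```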