Distributing Pods across nodes according to node capacity

Question

I have a frontend Service. I have a good server machine with 64 GB RAM and two desktops with 16 GB RAM each. I want to deploy 15 pods of my frontend service in such a way that my server box will run 9 pods, while the desktops will run 3 pods each. How do I instruct k8s to do this?

This is a classic pod scheduling requirement where you want uneven distribution across nodes.

Step-by-Step Implementation

Step 1: Label your nodes

Run these commands:

# Label the powerful server
kubectl label node <server-node-name> node-type=powerful --overwrite

# Label the two desktops
kubectl label node <desktop1-node-name> node-type=normal --overwrite
kubectl label node <desktop2-node-name> node-type=normal --overwrite

You can check node names with:

kubectl get nodes -o wide
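
To confirm the labels before deploying, a small helper like the one below can summarize them. It is only a sketch: the function name summarize_node_types is made up for this post, and it assumes the default column layout of kubectl get nodes --no-headers -L node-type, where a node without the label shows an empty last column.

```shell
#!/usr/bin/env bash
# Summarize how many nodes carry each node-type label.
# Input: output of `kubectl get nodes --no-headers -L node-type`
# (assumed columns: NAME STATUS ROLES AGE VERSION NODE-TYPE; the last
# column is empty when the label is missing, so those rows have 5 fields).
summarize_node_types() {
  awk '
    NF >= 6 { count[$6]++ }
    NF == 5 { count["unlabeled"]++ }
    END { for (t in count) printf "%s %d\n", t, count[t] }
  ' | sort
}

# Usage against a live cluster:
#   kubectl get nodes --no-headers -L node-type | summarize_node_types
```

For the setup in the question, the summary should list one powerful node and two normal nodes, with no unlabeled entry.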

Step 2: Create Two Deployments

Deployment 1 – For the powerful server (9 pods):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-powerful
spec:
  replicas: 9
  selector:
    matchLabels:
      app: frontend
      tier: powerful   # keeps this selector from overlapping with frontend-normal
  template:
    metadata:
      labels:
        app: frontend
        tier: powerful
    spec:
      nodeSelector:
        node-type: powerful
      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "2Gi"

Deployment 2 – For the two desktops (6 pods in total, aiming for 3 each; nodeSelector alone does not guarantee an even split):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-normal
spec:
  replicas: 6
  selector:
    matchLabels:
      app: frontend
      tier: normal   # keeps this selector from overlapping with frontend-powerful
  template:
    metadata:
      labels:
        app: frontend
        tier: normal
    spec:
      nodeSelector:
        node-type: normal
      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "2Gi"

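Pods from both Deployments carry the label app: frontend, so a single Service that selects app: frontend exposes all 15 pods. The Deployments' own selectors should not overlap, though: Kubernetes warns that controllers with overlapping selectors can interfere with each other, so each Deployment needs an extra distinguishing label (for example tier: powerful and tier: normal) in its selector and pod template.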
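
A single Service for both Deployments could look like the sketch below. The Service name and the port numbers (80 and 8080) are placeholders, since the setup in the question does not specify them, and render_frontend_service is just a helper name for this example.

```shell
#!/usr/bin/env bash
# Print a Service manifest that selects every pod labeled app=frontend,
# regardless of which Deployment created it.
# ASSUMPTIONS: the Service name, port 80 and targetPort 8080 are placeholders.
render_frontend_service() {
  cat <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 8080
EOF
}

# Usage against a live cluster:
#   render_frontend_service | kubectl apply -f -
```

Because the selector uses only app: frontend, it keeps matching all pods even when the Deployments carry additional labels.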

Alternative Approaches

Method                        Pros                           Cons                                  Recommendation
Multiple Deployments (above)  Simple, exact control (9 & 6)  Two Deployments to manage             Best for your case
Node Affinity (Preferred)     Single Deployment              Harder to guarantee exact 9/3/3       Good but less precise
Topology Spread Constraints   Built-in spreading             Designed for even spread, not uneven  Not ideal here
nodeName (hardcoding)         Very direct                    Brittle, not scalable                 Avoid

Important Tips

Resource requests are very important here, because the scheduler places pods based on their requests, not on their actual usage. Your desktops have only 16 GB of RAM, so set realistic requests to keep the scheduler from overcommitting them.
You can combine nodeSelector with Node Affinity for more flexibility (soft preference).
Monitor with:

kubectl get pods -o wide
kubectl describe node <node-name>
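
The arithmetic behind the resource tip can be sketched in a few lines of shell. The helper below (memory_budget is a name invented for this example) multiplies the per-pod request and limit by the pod count and compares the request total with the node's RAM. It assumes the full RAM is available to pods, which is optimistic because the OS and the kubelet reserve part of it.

```shell
#!/usr/bin/env bash
# Sum memory requests and limits for N identical pods on one node and
# compare the request total with the node's RAM.
# Args: pods, request (Mi), limit (Mi), node RAM (Gi)
# ASSUMPTION: node RAM is used as-is; real allocatable memory is lower.
memory_budget() {
  local pods=$1 req_mi=$2 lim_mi=$3 ram_gi=$4
  local ram_mi=$(( ram_gi * 1024 ))
  local req_total=$(( pods * req_mi ))
  local lim_total=$(( pods * lim_mi ))
  local verdict="fits"
  (( req_total > ram_mi )) && verdict="requests exceed RAM"
  echo "requests=${req_total}Mi limits=${lim_total}Mi ram=${ram_mi}Mi ${verdict}"
}

memory_budget 9 1024 2048 64   # server: 9 pods at 1Gi request / 2Gi limit
memory_budget 3 1024 2048 16   # each desktop: 3 pods at 1Gi request / 2Gi limit
```

With the values from the manifests above, the server needs 9216Mi of requests out of 65536Mi and each desktop needs 3072Mi out of 16384Mi, so both fit; the limit totals (18432Mi and 6144Mi) also stay below the RAM of each machine.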

Second Approach: Node Affinity

Here's the Node Affinity version (single Deployment approach):

Step 1: Label your nodes (same as before)

kubectl label node <server-node-name>   node-type=powerful --overwrite
kubectl label node <desktop1-node-name> node-type=normal --overwrite
kubectl label node <desktop2-node-name> node-type=normal --overwrite

Step 2: Single Deployment with Node Affinity

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 15
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values:
                - powerful

          - weight: 30
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values:
                - normal

      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "2Gi"

How this works:

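The powerful server has a much higher weight (100) → the Kubernetes scheduler scores it higher and prefers it when placing pods.
The two desktops have a lower weight (30) → they can still get pods, but they are less favored.
These weights are preferences, not ratios: they feed into node scoring together with other factors such as free resources, so no particular split is guaranteed. You may see something close to 9 to 11 pods on the server and 2 to 3 pods on each desktop, but the result depends on the current cluster state and can differ substantially.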
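
As a rough mental model (a simplification, not the scheduler's actual code), the node affinity part of the score can be thought of as the sum of the matching weights per node, rescaled so that the best node gets 100. The sketch below assumes that model; normalize_affinity_scores is a hypothetical helper name, and the final placement also depends on other scoring factors that are not modeled here.

```shell
#!/usr/bin/env bash
# Simplified model of the node affinity score: each node's raw score is the
# sum of the weights of the preferred terms it matches, and raw scores are
# then scaled so the best node gets 100 (integer division).
# Input lines: "<node-name> <raw-weight-sum>"
normalize_affinity_scores() {
  awk '
    { name[NR] = $1; raw[NR] = $2 + 0; if (raw[NR] > max) max = raw[NR] }
    END {
      for (i = 1; i <= NR; i++)
        printf "%s %d\n", name[i], (max > 0 ? int(100 * raw[i] / max) : 0)
    }
  '
}

# Node names are placeholders; the weights are the ones from the manifest above.
printf '%s\n' 'server 100' 'desktop1 30' 'desktop2 30' | normalize_affinity_scores
```

Under this model only the relative size of the weights matters, which is why the weights bias placement toward the server without fixing a 9/3/3 ratio.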

Third Approach: Node Affinity + Topology Spread Constraints

Here's the advanced version combining Node Affinity + Topology Spread Constraints in a single Deployment.

Advanced YAML (Node Affinity + Topology Spread)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 15
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 90
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values:
                - powerful

          - weight: 20
            preference:
              matchExpressions:
              - key: node-type
                operator: In
                values:
                - normal

      topologySpreadConstraints:
      - maxSkew: 4
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: frontend

      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "1"
            memory: "2Gi"

Final Recommendation

Even though the advanced version is powerful, for your exact requirement (9 on server, 3 on each desktop), the two separate Deployments approach remains the most reliable and production-friendly solution.

It gives:

Exact control over pod count
Predictable scheduling behavior
Cleaner troubleshooting
Independent scaling if needed later
Safer resource management for weaker nodes

1. How to monitor the actual distribution after applying this?

Use these commands to see how pods are distributed across your nodes:

Best Command (Recommended):

kubectl get pods -o wide -l app=frontend
  

This shows which node each pod is running on.

More Detailed Distribution Summary:

# --no-headers keeps the NODE column title out of the counts
kubectl get pods -l app=frontend -o wide --no-headers | awk '{print $7}' | sort | uniq -c
  

Even better one-liner for nice summary:

echo "=== Pod Distribution ===" && \
kubectl get pods -l app=frontend --no-headers -o custom-columns="NODE:.spec.nodeName" | \
sort | uniq -c | sort -nr
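
To turn that summary into a pass/fail check, the per-node counts can be compared with the intended targets. The sketch below is one way to do it; check_distribution is a name chosen for this example, and the node names in the usage line (server, desktop1, desktop2) are placeholders for your real node names.

```shell
#!/usr/bin/env bash
# Compare actual pods-per-node counts with a target.
# stdin: one node name per line (one line per pod)
# args:  target pairs in the form node=count
check_distribution() {
  sort | uniq -c | awk -v targets="$*" '
    BEGIN {
      n = split(targets, pair, " ")
      for (i = 1; i <= n; i++) { split(pair[i], kv, "="); want[kv[1]] = kv[2] }
    }
    { have[$2] = $1 }
    END {
      status = 0
      for (node in want) {
        got = (node in have) ? have[node] : 0
        if (got + 0 != want[node] + 0) {
          printf "MISMATCH %s: want %d, got %d\n", node, want[node], got
          status = 1
        }
      }
      if (status == 0) print "OK"
      exit status
    }
  '
}

# Usage against a live cluster (node names are placeholders):
#   kubectl get pods -l app=frontend --no-headers \
#     -o custom-columns="NODE:.spec.nodeName" |
#     check_distribution server=9 desktop1=3 desktop2=3
```

The function prints OK and returns 0 when every listed node has its target count, and otherwise prints one MISMATCH line per node and returns 1. Nodes that are not listed as targets are ignored.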
  

Watch in real-time:

watch -n 3 "kubectl get pods -l app=frontend --no-headers -o custom-columns='POD:.metadata.name,NODE:.spec.nodeName,STATUS:.status.phase' | sort -k2"
  

2. Commands to test and fine-tune weights / skew

Here are useful commands to experiment:

Check current scheduling decisions:

kubectl describe pod <pod-name> | grep -A 10 "Node:"
  

Fine-tuning commands:

A. Increase preference for powerful node (increase weight):

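Node affinity weights must be between 1 and 100, so the weight of 90 can only be raised to 100. To strengthen the preference further, lower the weight of the normal nodes instead (for example, from 20 to 10).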

B. Make distribution stricter or looser (adjust maxSkew):

Change maxSkew: 4 → maxSkew: 2 (more even distribution)
Change maxSkew: 4 → maxSkew: 6 (allows a more uneven distribution; the 9/3/3 target itself has a skew of 9 - 3 = 6)
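
Skew here means the difference between the pod count on the most loaded and on the least loaded eligible node. A tiny helper (skew is a name chosen for this sketch) makes it easy to see which maxSkew value a given split needs:

```shell
#!/usr/bin/env bash
# Skew of a per-node pod count list: largest count minus smallest count.
# args: one pod count per eligible node
skew() {
  local min=$1 max=$1 c
  for c in "$@"; do
    (( c < min )) && min=$c
    (( c > max )) && max=$c
  done
  echo $(( max - min ))
}

skew 9 3 3   # target split from the question
skew 5 5 5   # perfectly even split
```

The first call prints 6 and the second prints 0, so a 9/3/3 split only satisfies a spread constraint whose maxSkew is 6 or more.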

C. Test changes quickly:

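# Scale down and up to force rescheduling of every pod (causes downtime)
kubectl scale deployment frontend --replicas=0
kubectl scale deployment frontend --replicas=15

# Alternative that avoids taking all pods down at once: a rolling restart
kubectl rollout restart deployment frontend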
    

D. Check scheduler events:

kubectl get events --sort-by=.metadata.creationTimestamp | tail -20
    

Pro Tip: Start with:

weight: 100 for powerful node
weight: 20 for normal nodes
maxSkew: 3

Then adjust based on the monitoring output.

3. Best of Both Worlds: Two Deployments + Topology Spread

This is my strong recommendation for your use case (9 on server, 3 on each desktop).

Deployment 1: Powerful Server (9 pods)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-powerful
spec:
  replicas: 9
  selector:
    matchLabels:
      app: frontend
      tier: powerful
  template:
    metadata:
      labels:
        app: frontend
        tier: powerful
    spec:
      nodeSelector:
        node-type: powerful
      
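      # With only one node labeled node-type=powerful, this constraint has no
      # effect; it starts to matter if more powerful nodes are added later.
      topologySpreadConstraints:
      - maxSkew: 2
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: frontend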

      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "600m"
            memory: "1229Mi"   # about 1.2Gi; "1.2Gi" itself is a fractional number of bytes
          limits:
            cpu: "1200m"
            memory: "2.5Gi"
  

Deployment 2: Normal Desktops (Total 6 pods)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-normal
spec:
  replicas: 6
  selector:
    matchLabels:
      app: frontend
      tier: normal
  template:
    metadata:
      labels:
        app: frontend
        tier: normal
    spec:
      nodeSelector:
        node-type: normal
      
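      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        # ScheduleAnyway is a soft rule; use DoNotSchedule to enforce the 3 + 3 split
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: frontend
            tier: normal   # count only this Deployment's pods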

      containers:
      - name: frontend
        image: your-frontend-image:latest
        resources:
          requests:
            cpu: "400m"
            memory: "800Mi"
          limits:
            cpu: "800m"
            memory: "1.5Gi"
  

Why This is the Best Approach:

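Exact control per node group: you get exactly 9 pods on the server and 6 across the desktops.
Topology Spread steers the two desktops away from an unbalanced split (e.g., 5 + 1). With whenUnsatisfiable: ScheduleAnyway this is only a preference; set it to DoNotSchedule if the 3 + 3 split must be enforced.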
Different resource requests (higher on powerful node).
One Service can still target all pods using app: frontend.

RS Chandras Tech Blog | AI, ML, Agentic AI

Wednesday, May 27, 2026