- Bug
- Resolution: Unresolved
- Minor
- None
- 4.12.z, 4.13.z, 4.14.z, 4.15.z, 4.16.0
- None
- Low
- No
- False
Description of problem:
When a pod template in a deployment specifies a matching `nodeName` together with a never-matching `nodeSelector`, an unlimited number of pods is created.
Version-Release number of selected component (if applicable):
How reproducible:
Always
Steps to reproduce:
1. Create a deployment using the following YAML, where `crc-74q6p-master-0` is a valid worker node. Notice the never-matching node selector: the correct label key would be `kubernetes.io/hostname` (a corrected sketch follows the manifest).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: hello-node
  name: hello-node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-node
  template:
    metadata:
      labels:
        app: hello-node
    spec:
      nodeName: crc-74q6p-master-0
      containers:
      - image: ubi9/ubi
        name: test-container
        args:
        - bash
        - -c
        - |
          sleep 8000000
      nodeSelector:
        hostname: crc-74q6p-master-0
```
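For comparison, a minimal sketch of the corrected pod spec fragment, assuming the node carries the standard `kubernetes.io/hostname` label that every kubelet sets on its node; with this key the selector actually matches:

```yaml
# Corrected fragment (sketch): only the nodeSelector key differs from the
# reproducer above; kubernetes.io/hostname is the standard well-known label.
spec:
  nodeName: crc-74q6p-master-0
  containers:
  - image: ubi9/ubi
    name: test-container
    args:
    - bash
    - -c
    - |
      sleep 8000000
  nodeSelector:
    kubernetes.io/hostname: crc-74q6p-master-0
```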
Actual results:
Thousands of pods are created with an error status:

```
$ oc get pods | head
hello-node-66df88d668-224ft   0/1   NodeAffinity   0   3m46s
hello-node-66df88d668-22h68   0/1   NodeAffinity   0   4m48s
hello-node-66df88d668-22llx   0/1   NodeAffinity   0   68s
hello-node-66df88d668-246w8   0/1   NodeAffinity   0   118s
```
Expected results:
A single pod is created in Pending status.
Additional info:
This behaviour seems to happen only if the non-matching selector uses the `hostname` syntax shown in the example; a selector such as `nodeSelector: {foo: bar}` does not exhibit this behaviour. Presumably, because `nodeName` is set, the pod bypasses the scheduler and is rejected by the kubelet at admission time (see the `Predicate NodeAffinity failed` events below), after which the ReplicaSet controller creates a replacement for the Failed pod, indefinitely.
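For reference, a sketch of the non-reproducing variant (only the selector differs from the reproducer; `foo: bar` is an arbitrary label that exists on no node):

```yaml
# Variant fragment (sketch): per the observation above, this never-matching
# selector does NOT trigger the pod flood.
spec:
  nodeName: crc-74q6p-master-0
  nodeSelector:
    foo: bar
```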
This is the output of `oc describe` for a pod in error:
```
Name:             hello-node-66df88d668-zz7jg
Namespace:        too-many-pods
Priority:         0
Service Account:  default
Node:             crc-74q6p-master-0/
Start Time:       Wed, 19 Jul 2023 14:12:27 +0200
Labels:           app=hello-node
                  pod-template-hash=66df88d668
Annotations:      openshift.io/scc: restricted-v2
                  seccomp.security.alpha.kubernetes.io/pod: runtime/default
Status:           Failed
Reason:           NodeAffinity
Message:          Pod Predicate NodeAffinity failed
IP:
IPs:              <none>
Controlled By:    ReplicaSet/hello-node-66df88d668
Containers:
  test-container:
    Image:      ubi9/ubi
    Port:       <none>
    Host Port:  <none>
    Args:
      bash
      -c
      sleep 8000000
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-d68dm (ro)
Volumes:
  kube-api-access-d68dm:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              hostname=crc-74q6p-master-0
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason        Age  From     Message
  ----     ------        ---- ----     -------
  Warning  NodeAffinity  11m  kubelet  Predicate NodeAffinity failed
```
This is the output of `journalctl -fu kubelet` on the worker node:
```
Jul 19 12:24:42 crc-74q6p-master-0 kubenswrapper[2381]: I0719 12:24:42.956829    2381 kubelet.go:2119] "SyncLoop ADD" source="api" pods=[too-many-pods/hello-node-66df88d668-jpxgj]
Jul 19 12:24:42 crc-74q6p-master-0 kubenswrapper[2381]: I0719 12:24:42.960187    2381 topology_manager.go:205] "Topology Admit Handler"
Jul 19 12:24:42 crc-74q6p-master-0 kubenswrapper[2381]: I0719 12:24:42.965754    2381 predicate.go:129] "Predicate failed on Pod" pod="too-many-pods/hello-node-66df88d668-jpxgj" err="Predicate NodeAffinity failed"
```
is related to:
- OCPBUGS-5807 ReplicaSet controller continuously creating pods failing due to SysctlForbidden (New)
- OCPBUGS-42257 DaemonSet is reporting incorrect number of ready pods, causing pod flooding on specific OpenShift Container Platform 4 - Node (New)