- Bug
- Resolution: Unresolved
- Major
- None
- 4.16
- Important
- None
- False
Description of problem:
During the upgrade to OpenShift Container Platform 4.16, or shortly after it, progress stalls because the DNS Cluster Operator remains in the Progressing state. The dns-default DaemonSet reports an incorrect number of ready pods, which causes pods to be created without limit on one specific OpenShift Container Platform 4 node.

$ omc get ds -n openshift-dns
NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
dns-default     9         9         8       9            8           kubernetes.io/os=linux   203d
node-resolver   9         9         9       9            9           kubernetes.io/os=linux   203d

The DaemonSet reports 8 pods available. Checking the pods themselves shows that 9 pods are in fact running and therefore available:

$ omc get pod -n openshift-dns | grep dns-default | grep -i running | wc -l
9

$ omc get pod -n openshift-dns | grep dns-default | grep -i running
dns-default-6xcfx   2/2   Running   2   2d
dns-default-7765n   2/2   Running   2   2d
dns-default-bt2ql   2/2   Running   0   2d
dns-default-ctw9h   2/2   Running   0   2d
dns-default-kjqq7   2/2   Running   4   2d
dns-default-ph7gr   2/2   Running   0   2d
dns-default-sdcss   2/2   Running   2   2d
dns-default-wgh54   2/2   Running   0   2d
dns-default-xnpqp   2/2   Running   2   2d

At the same time, 1377 dns-default pods are in the ContainerStatusUnknown state; every pod in the excerpt below is on worker-0.example.com:

$ omc get pod -n openshift-dns | grep dns-default | grep -i unknown | wc -l
1377

$ omc get pod -n openshift-dns -o wide | grep dns-default | grep -i unknown
dns-default-2285j   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-22j4p   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-22zgs   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-28l9d   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-28v69   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2928q   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-29ck5   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-29tgh   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-29wbq   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-29wcv   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-29wx9   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2b8qf   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2bfj5   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2bl56   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2bt2d   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2bw64   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2bw79   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2cfkf   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2cgbv   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2dczz   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2dsp8   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2ffrs   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2g6fk   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2gqnq   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2grqk   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2h9zn   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2k2tt   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2k66k   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2k6kn   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2k9dr   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2kjlt   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
dns-default-2mhdd   0/2   ContainerStatusUnknown   0   2d   <none>   worker-0.example.com   <none>   <none>
[...]
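The grep and wc checks above can be condensed into one per-status, per-node tally, which makes it easier to see whether the unknown pods really are confined to a single node. This is only a sketch: the helper name summarize_dns_pods is hypothetical, and it assumes the wide listing ends with the NODE, NOMINATED NODE and READINESS GATES columns, as in the output shown in this report.

```shell
# Hypothetical helper (not part of omc or oc): tally dns-default pods
# by STATUS and NODE from `get pod -o wide` output read on stdin.
# Assumes STATUS is the 3rd column and NODE is the 3rd column from the end.
summarize_dns_pods() {
  awk '$1 ~ /^dns-default-/ { count[$3 " " $(NF-2)]++ }
       END { for (k in count) print count[k], k }' | sort -rn
}

# Usage against a must-gather, mirroring the commands in this report:
#   omc get pod -n openshift-dns -o wide | summarize_dns_pods
```

Each output line has the form "COUNT STATUS NODE", sorted with the largest count first, so a flooded node should appear at the top.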
Version-Release number of selected component (if applicable):
OpenShift Container Platform 4.16.11
How reproducible:
Random
Steps to Reproduce:
1. N/A
Actual results:
As shown in the problem description. The DaemonSet reports a wrong state, which causes a massive number of pods to be created, flooding one specific OpenShift Container Platform 4 node.
Expected results:
The DaemonSet reports the correct state, which prevents pod flooding and keeps the DNS Cluster Operator from remaining in the Progressing state.
Additional info:
- relates to
- OCPBUGS-5807 ReplicaSet controller continuously creating pods failing due to SysctlForbidden
- New
- OCPBUGS-16379 When a pod template in a deployment is specified with a matching `nodename` and a never-matching `nodeSelector` an unlimited number of pods are created
- New
- links to