  1. OpenShift Bugs
  2. OCPBUGS-42257

DaemonSet reports an incorrect number of ready pods, causing pod flooding on a specific OpenShift Container Platform 4 node

    • Important
    • None
    • False

      Description of problem:

      During an upgrade to OpenShift Container Platform 4.16, the upgrade eventually gets stuck, or the cluster gets stuck shortly afterwards, because the DNS Cluster Operator remains in the Progressing state. A closer look shows that the dns-default DaemonSet reports an incorrect number of ready pods, which causes a seemingly endless number of pods to be created on a specific OpenShift Container Platform 4 node.
      
      $ omc get ds -n openshift-dns
      NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
      dns-default     9         9         8       9            8           kubernetes.io/os=linux   203d
      node-resolver   9         9         9       9            9           kubernetes.io/os=linux   203d
      
      The DaemonSet reports 8 pods available. Checking the pods themselves shows that 9 pods are in fact running and therefore available:
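
The mismatch between DESIRED and READY can also be picked out of the table mechanically. The following is a minimal sketch, not part of the original report: `ds_ready_mismatch` is a hypothetical helper name, and it assumes the column order shown in the `get ds` output above (NAME, DESIRED, CURRENT, READY, ...).

```shell
# Hypothetical helper (not from the original report): reads the table printed
# by `get ds` on stdin and prints every DaemonSet whose READY count differs
# from its DESIRED count.
# Assumption: columns are NAME DESIRED CURRENT READY ..., as in the output above.
ds_ready_mismatch() {
  # NR > 1 skips the header row; $2 is DESIRED, $4 is READY.
  awk 'NR > 1 && $2 != $4 { printf "%s desired=%s ready=%s\n", $1, $2, $4 }'
}

# Intended usage against a must-gather or a live cluster:
#   omc get ds -n openshift-dns | ds_ready_mismatch
#   oc  get ds -n openshift-dns | ds_ready_mismatch
```

Fed the table shown above, this should print only the dns-default line, since node-resolver reports 9 desired and 9 ready.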
      
      $ omc get pod -n openshift-dns | grep dns-default | grep -i running | wc -l
      9
      
      $ omc get pod -n openshift-dns | grep dns-default | grep -i running 
      dns-default-6xcfx     2/2     Running                  2          2d
      dns-default-7765n     2/2     Running                  2          2d
      dns-default-bt2ql     2/2     Running                  0          2d
      dns-default-ctw9h     2/2     Running                  0          2d
      dns-default-kjqq7     2/2     Running                  4          2d
      dns-default-ph7gr     2/2     Running                  0          2d
      dns-default-sdcss     2/2     Running                  2          2d
      dns-default-wgh54     2/2     Running                  0          2d
      dns-default-xnpqp     2/2     Running                  2          2d
      
      $ omc get pod -n openshift-dns | grep dns-default | grep -i unknown | wc -l
      1377
      
      $ omc get pod -n openshift-dns -o wide | grep dns-default | grep -i unknown 
      dns-default-2285j     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-22j4p     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-22zgs     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-28l9d     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-28v69     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2928q     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-29ck5     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-29tgh     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-29wbq     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-29wcv     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-29wx9     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2b8qf     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2bfj5     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2bl56     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2bt2d     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2bw64     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2bw79     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2cfkf     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2cgbv     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2dczz     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2dsp8     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2ffrs     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2g6fk     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2gqnq     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2grqk     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2h9zn     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2k2tt     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2k66k     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2k6kn     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2k9dr     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2kjlt     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      dns-default-2mhdd     0/2     ContainerStatusUnknown   0          2d    <none>          worker-0.example.com   <none>           <none>
      [...]
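
Because the listing is too long to read in full, a tally per status and node is a quicker way to confirm that the flood is confined to a single node. The sketch below is an illustration only: `pods_by_status_and_node` is a hypothetical helper name, and it assumes the `-o wide` column layout seen above, where STATUS is the third column, NODE is the seventh, and RESTARTS is a plain number without an "(… ago)" suffix.

```shell
# Hypothetical helper (not from the original report): reads `get pod -o wide`
# output on stdin and prints "<count> <status> <node>" per combination,
# highest count first.
# Assumptions: STATUS is column 3 and NODE is column 7, as in the listing above;
# a RESTARTS value such as "2 (3h ago)" would shift the columns and break this.
pods_by_status_and_node() {
  # Optional first argument: pod name prefix to match (default: dns-default-).
  awk -v prefix="${1:-dns-default-}" '
    index($1, prefix) == 1 { count[$3 " " $7]++ }
    END { for (key in count) print count[key], key }
  ' | sort -k1,1nr -k2
}

# Intended usage:
#   omc get pod -n openshift-dns -o wide | pods_by_status_and_node
```

If the ContainerStatusUnknown pods are all scheduled to one node, as the excerpt above suggests, a single line with a very large count for that node would be expected.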
      
      

      Version-Release number of selected component (if applicable):

      OpenShift Container Platform 4.16.11
      

      How reproducible:

      Random
      

      Steps to Reproduce:

      1. N/A
      

      Actual results:

      As shown in the problem description, the DaemonSet reports an incorrect state, which causes a massive number of pods to be created and floods a specific OpenShift Container Platform 4 node.
      

      Expected results:

      The DaemonSet reports the correct state, which prevents pod flooding and keeps the Cluster Operator from remaining stuck in the Progressing state.
      

      Additional info:

      
      

              fkrepins@redhat.com Filip Krepinsky
              rhn-support-sreber Simon Reber
              ying zhou ying zhou