OpenShift Bugs / OCPBUGS-76559

DiskPressure on Control-plane because crio is not cleaning up properly

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • 4.18
    • Logging
    • Important

      Description of problem:

          Control plane nodes are raising DiskPressure alerts because CRI-O is not cleaning up images properly.
          

      Multiple pods are stuck in non-running states (CreateContainerError, CrashLoopBackOff, Init:ImagePullBackOff):

      ./oc get pods -A -o wide | egrep -iv "running|completed"
      NAMESPACE                                          NAME                                                              READY   STATUS                      RESTARTS           AGE     IP              NODE           NOMINATED NODE   READINESS GATES
      cert-manager                                       cert-manager-5cb99fc789-zpjhk                                     0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      cert-manager                                       cert-manager-cainjector-5545bd876-fdrr8                           0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      cert-manager                                       cert-manager-operator-controller-manager-74cb94f686-7xw8f         0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      cert-manager                                       cert-manager-webhook-6888856db4-rzs4p                             0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      gitops-application                                 gitops-application-application-controller-0                       0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      gitops-application                                 gitops-application-applicationset-controller-6f766b5d46-tw5bc     0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      gitops-application                                 gitops-application-redis-6df54b7699-lk4nx                         0/1     CrashLoopBackOff            1903 (3m30s ago)   6d20h       otstilwt0201   <none>           <none>
      gitops-application                                 gitops-application-repo-server-76896c59dd-4dzv9                   0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      gitops-application                                 gitops-application-server-69b8f767f9-hl48v                        0/1     CreateContainerError        0                  6d20h       otstilwt0201   <none>           <none>
      hitachi                                            hspc-csi-controller-5d978b465f-84m9f                              4/6     CrashLoopBackOff            1525 (42s ago)     2d19h       otsticpt0301   <none>           <none>
      hitachi                                            hspc-csi-node-q6pfm                                               0/2     CrashLoopBackOff            1657 (47s ago)     27d       otstilwt0101   <none>           <none>
      klimamonitoring                                    elasticsearch-klimamonitoring-es-default-0                        0/1     Init:ImagePullBackOff       0                  6d20h       otstilwt0201   <none>           <none>
      klimamonitoring                                    elasticsearch-klimamonitoring-es-default-1                        0/1     Init:ImagePullBackOff       0                  6d20h       otstilwt0201   <none>           <none>
      klimamonitoring                                    elasticsearch-klimamonitoring-es-default-2                        0/1     Init:ImagePullBackOff       0                  6d20h       otstilwt0201   <none>           <none>
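
To see which failure states dominate, the listing above can be grouped by its STATUS column. The following is a minimal sketch, not part of the original report: `count_by_status` is a hypothetical helper name, and it assumes STATUS is the fourth whitespace-separated field, which holds for `oc get pods -A` output with or without `-o wide`.

```shell
#!/bin/sh
# Hypothetical helper: count non-running pods per STATUS.
# Reads `oc get pods -A --no-headers` output on stdin.
# Assumption: STATUS is the 4th whitespace-separated field.
count_by_status() {
  awk '$4 != "Running" && $4 != "Completed" { n[$4]++ }
       END { for (s in n) print n[s], s }' |
    sort -k1,1nr -k2,2
}

# Usage on a live cluster (assumes a logged-in oc client):
#   oc get pods -A --no-headers | count_by_status
```

Applied to the listing in this report, most of the affected pods are in CreateContainerError, with the rest split between CrashLoopBackOff and Init:ImagePullBackOff.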
      

       
      The overlay directory contains a very large number of entries:
       

      [root@]# pwd
      /var/lib/containers/storage/overlay
      [root@]# ls -ltrh | wc -l
      109254
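
Note that `ls -l` prints a leading `total` line, so piping it to `wc -l` counts one line more than the number of listed entries. A minimal sketch of a more direct count follows; `count_entries` is a hypothetical helper name, and the path in the usage comment is the one quoted in the report. Unlike plain `ls -l`, `find` also counts hidden entries.

```shell
#!/bin/sh
# Hypothetical helper: count the direct children of a directory.
# Uses find's -mindepth/-maxdepth options (available in GNU and BSD find).
count_entries() {
  find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Usage on an affected node (path taken from the report):
#   count_entries /var/lib/containers/storage/overlay
#   df -h /var/lib/containers/storage
```

Pairing the entry count with `df -h` on the same path shows whether the number of layers or their size is what is filling the disk.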

      Running crictl --timeout=120s rmi --prune does not remove any images or free any disk space.
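
One way to quantify "does not free disk space" is to compare filesystem usage before and after the prune. The sketch below is an illustration only: `used_kb` and `measure_freed` are hypothetical helper names, the storage path is assumed from the overlay directory quoted earlier, and the `--timeout=120s` spelling is a reading of the command in the report.

```shell
#!/bin/sh
# Hypothetical helper: used space in KiB on the filesystem holding a path.
# df -P forces the POSIX single-line format; field 3 is "Used".
used_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Hypothetical helper: run a command and print how many KiB it freed
# on the filesystem holding the given path (negative if usage grew).
measure_freed() {
  dir=$1
  shift
  before=$(used_kb "$dir")
  "$@" >&2              # keep the command's own output off stdout
  after=$(used_kb "$dir")
  echo $((before - after))
}

# Usage on an affected node (assumed path and flag spelling):
#   measure_freed /var/lib/containers/storage crictl --timeout=120s rmi --prune
```

A result at or near zero would be consistent with the behavior described here, although other writers on the same filesystem can shift the number between the two measurements.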

      The only workaround was to apply the KCS article https://access.redhat.com/solutions/5350721, but the issue recurs even after applying it.
       

              jcantril@redhat.com Jeffrey Cantrill
              rhn-support-shrsharm Shreya Sharma
              Min Li
              Votes: 1
              Watchers: 6