OpenShift Virtualization / CNV-62943

[4.20] new readinessProbe and livenessProbe in KMP are too low and cause crashloop


    • Important
    • Customer Reported
    • None

      Description of problem:

      In new 4.99 builds, KMP is now deployed with a readinessProbe and a livenessProbe in the manager container:
      
            readinessProbe:
              httpGet:
                path: /readyz
                port: webhook-server
                scheme: HTTPS
                httpHeaders:
                  - name: Content-Type
                    value: application/json
              initialDelaySeconds: 10
              timeoutSeconds: 1
              periodSeconds: 10
              successThreshold: 1
              failureThreshold: 3
            name: manager
            command:
              - /manager
            livenessProbe:
              httpGet:
                path: /healthz
                port: webhook-server
                scheme: HTTPS
                httpHeaders:
                  - name: Content-Type
                    value: application/json
              initialDelaySeconds: 15
              timeoutSeconds: 1
              periodSeconds: 20
              successThreshold: 1
              failureThreshold: 3
      
      It turns out that on "real" clusters these timers are too short: KMP does not have time to reach the ready state, so the liveness probe restarts the KMP pod prematurely, causing a CrashLoopBackOff.
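
      As a rough illustration of the restart window these settings imply, the livenessProbe timing fields from the manifest above are annotated below. The arithmetic assumes standard kubelet probe behavior and ignores scheduling jitter, so the figure is approximate rather than measured on an affected cluster.

      ```yaml
      livenessProbe:
        initialDelaySeconds: 15   # first probe roughly 15s after the container starts
        periodSeconds: 20         # then one probe about every 20s
        timeoutSeconds: 1         # a probe fails if there is no response within 1s
        failureThreshold: 3       # 3 consecutive failures trigger a container restart
      # Approximate restart deadline if /healthz never answers:
      #   15 + (3 - 1) * 20 = 55s after container start, plus the 1s probe timeout
      ```

      If KMP needs longer than this to start serving on the webhook-server port, every start attempt is killed before it completes, which matches the reported crash loop.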

      Version-Release number of selected component (if applicable):
      v4.99.0.rhel9-2317

      How reproducible:

      100%, on clusters with real workloads (not CI/test clusters)

      Steps to Reproduce:

      1. Install or upgrade CNV to the specified version
      2. Check the status of the KMP pod
      

      Actual results:

      The KMP pod enters CrashLoopBackOff

      Expected results:

      KMP pod should be ready

      Additional info:

      Caused by:
      https://github.com/k8snetworkplumbingwg/kubemacpool/pull/540
      
      This issue causes KMP not to function and blocks CNV upgrade.
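
      One possible direction, offered only as a sketch and not taken from this report or the linked PR, is to guard startup with a Kubernetes startupProbe, which holds off the liveness and readiness probes until it first succeeds. The periodSeconds and failureThreshold values below are hypothetical placeholders, not tested recommendations. The path, port, and scheme are copied from the existing livenessProbe.

      ```yaml
      startupProbe:
        httpGet:
          path: /healthz
          port: webhook-server
          scheme: HTTPS
        periodSeconds: 10       # hypothetical value
        failureThreshold: 30    # hypothetical value; allows up to about 30 * 10 = 300s for startup
      ```

      Raising initialDelaySeconds on the existing probes would be a simpler alternative, at the cost of delaying failure detection on every start.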

       

              ralavi@redhat.com Ram Lavi
              ocohen@redhat.com Oren Cohen
              Yoss Segev Yoss Segev