OpenShift Bugs / OCPBUGS-11487

Extend leaseDurationSeconds for KCM in SNO


    • Moderate
    • No
    • False
      8/17: seeking review of PR; KNIECO-7492
      8/1: Vitaly Grinberg proposed a fix, which needs review by the component owners.

      Description of problem:

      KCM (kube-controller-manager) restarts when its leader election fails because kube-apiserver has restarted. This happens because the leaseDurationSeconds for KCM, which is 15 seconds, is too short for the SNO (single-node OpenShift) scenario.
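The timing argument above can be sketched as a small model. This is a rough approximation, not the actual leader-election code: it assumes the leader retries its lease renewal every retry period and gives up once the renew deadline passes. Only the 15-second lease duration comes from this report; the 10-second renew deadline and 2-second retry period are assumed values, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderElectionTiming:
    """Leader-election timings in seconds (simplified model, not client-go itself)."""

    lease_duration: int
    renew_deadline: int
    retry_period: int

    def downtime_tolerance(self) -> int:
        """Estimate how long the API server can be down before the leader gives up.

        The last renewal attempt happens at the largest multiple of the retry
        period that fits in the renew deadline; an outage must end at least one
        retry period before that for a renewal to succeed.
        """
        last_retry = (self.renew_deadline // self.retry_period) * self.retry_period
        return last_retry - self.retry_period

    def survives(self, outage: int) -> bool:
        """Return True if the leader is expected to keep its lease through the outage."""
        return outage <= self.downtime_tolerance()


# 15s is from the report; 10s and 2s are ASSUMED renew-deadline and retry-period values.
KCM_CURRENT = LeaderElectionTiming(lease_duration=15, renew_deadline=10, retry_period=2)
```

Under these assumptions `KCM_CURRENT.downtime_tolerance()` is 8 seconds, so any kube-apiserver restart that takes longer than that would cost KCM its lease. On SNO there is no second API server instance to absorb the restart.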

      Version-Release number of selected component (if applicable):

      OCP 4.12

      How reproducible:

      100%

      Steps to Reproduce:

      1. Kill kube-apiserver in SNO:
      $ oc exec -it -n openshift-kube-apiserver kube-apiserver-XXXXXX -c kube-apiserver -- /bin/sh -c "kill 1"
      2. Watch the KCM pods.
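One way to make the "watch" step less manual is to compare container restart counts before and after killing kube-apiserver. The sketch below parses the JSON that `oc get pods -o json` prints. The namespace in the comment (`openshift-kube-controller-manager`) is an assumption not stated in this report, and the helper names are hypothetical.

```python
import json
from typing import Dict, List


def restart_counts(pod_list_json: str) -> Dict[str, int]:
    """Map each pod name to the sum of its containers' restart counts.

    Expects the output of a command such as (namespace is an assumption):
        oc get pods -n openshift-kube-controller-manager -o json
    """
    pods = json.loads(pod_list_json).get("items", [])
    counts: Dict[str, int] = {}
    for pod in pods:
        name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[name] = sum(s.get("restartCount", 0) for s in statuses)
    return counts


def restarted_pods(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """Return the pods whose restart count increased between two snapshots."""
    return sorted(name for name, n in after.items() if n > before.get(name, 0))
```

Take one snapshot before step 1 and another a minute or so after it. A non-empty result from `restarted_pods` would indicate that KCM did not survive the kube-apiserver restart.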
      

      Actual results:

      KCM crashes and restarts.

      Expected results:

      KCM should survive the kube-apiserver restart.

      Additional info:

      For other components, the leaseDurationSeconds nearly meets this criterion:
      https://github.com/openshift/library-go/blob/6ac65c5454f9effede61a6e52e7fdb06a27fc26e/pkg/config/leaderelection/leaderelection.go#L148
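When extending the lease duration, the other two leader-election timings usually have to move with it. The sketch below checks a proposed set of values against the ordering constraints that, to my understanding, Kubernetes leader election enforces: the lease duration must exceed the renew deadline, and the renew deadline must exceed the jittered retry period. The jitter factor of 1.2 and the function name are assumptions, and the sample values in the usage note are illustrative rather than taken from the linked file.

```python
from typing import List

# ASSUMED to mirror the jitter factor applied to the retry period by the leader-election client.
JITTER_FACTOR = 1.2


def validate_timing(lease_duration: float, renew_deadline: float, retry_period: float) -> List[str]:
    """Return a list of problems with the proposed timings (all in seconds); empty if none."""
    problems: List[str] = []
    if min(lease_duration, renew_deadline, retry_period) <= 0:
        problems.append("all timings must be positive")
    if lease_duration <= renew_deadline:
        problems.append("leaseDuration must be greater than renewDeadline")
    if renew_deadline <= JITTER_FACTOR * retry_period:
        problems.append("renewDeadline must be greater than retryPeriod * jitter factor")
    return problems
```

For example, `validate_timing(270, 240, 60)` returns an empty list, while raising only the lease duration's neighbours inconsistently, as in `validate_timing(60, 240, 60)`, reports that the lease duration is not greater than the renew deadline.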

            fkrepins@redhat.com Filip Krepinsky
            rhn-support-cchen Chen Chen
            ying zhou ying zhou