OpenShift Bugs / OCPBUGS-3279

Service-ca controller exits immediately with an error on SIGTERM


    • Type: Bug
    • Resolution: Done
    • Priority: Undefined
    • 4.12.z
    • 4.13.0
    • service-ca
    • None
    • None
    • False

      This is a clone of issue OCPBUGS-3195. The following is the description of the original issue:

      Description of problem:

      The service-ca controller's start func seems to return an error ("stopped" in the log below) as soon as its context is cancelled, which seems to happen the moment the first signal is received: https://github.com/openshift/service-ca-operator/blob/42088528ef8a6a4b8c99b0f558246b8025584056/pkg/controller/starter.go#L24
      
      That apparently triggers os.Exit(1) immediately: https://github.com/openshift/service-ca-operator/blob/42088528ef8a6a4b8c99b0f55824[…]om/openshift/library-go/pkg/controller/controllercmd/builder.go
      
      The leader lock is not released until the periodic renew tick breaks out of its loop: https://github.com/openshift/service-ca-operator/blob/42088528ef8a6a4b8c99b0f55824[…]/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go
      
      So it seems unlikely that the call to le.release() is reached before the other goroutine calls os.Exit(1).
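
The race described above can be illustrated with a small, self-contained Go sketch. It is a simplified model, not the service-ca or client-go code: the `lease` type, `holdLease`, `startController`, and `run` are hypothetical names, and the renew loop is assumed to notice cancellation only on its next tick. Returning nil on cancellation and waiting for the release is shown as one possible shape for a fix, not necessarily what was merged.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// lease is a hypothetical stand-in for the leader-election lock.
type lease struct {
	mu       sync.Mutex
	released bool
}

func (l *lease) release() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.released = true
}

func (l *lease) isReleased() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.released
}

// holdLease mimics a periodic renew loop (assumption: cancellation is only
// observed on the next tick). It releases the lease and closes done on exit.
func holdLease(ctx context.Context, l *lease, tick time.Duration, done chan<- struct{}) {
	defer close(done)
	ticker := time.NewTicker(tick)
	defer ticker.Stop()
	for {
		<-ticker.C
		if ctx.Err() != nil {
			l.release()
			return
		}
	}
}

// startController blocks until ctx is cancelled. With failOnCancel set it
// returns an error right away, as the reported behaviour appears to do.
func startController(ctx context.Context, failOnCancel bool) error {
	<-ctx.Done()
	if failOnCancel {
		return errors.New("stopped")
	}
	return nil
}

// run reports whether the lease had been released at the point where the
// process would exit.
func run(failOnCancel bool) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	l := &lease{}
	leaseDone := make(chan struct{})
	go holdLease(ctx, l, 200*time.Millisecond, leaseDone)

	cancel() // stands in for the first SIGTERM

	if err := startController(ctx, failOnCancel); err != nil {
		// The real caller would call os.Exit(1) here; the renew loop has
		// most likely not reached its next tick yet.
		return l.isReleased()
	}
	<-leaseDone // graceful path: wait for the renew loop to release the lease
	return l.isReleased()
}

func main() {
	fmt.Println("error on cancel, lease released before exit:", run(true))
	fmt.Println("nil on cancel, lease released before exit:", run(false))
}
```

In this model the error path reaches the exit point long before the 200 ms tick fires, so the lease is still held and a replacement would have to wait for it to expire. The graceful path blocks until the renew loop has released it.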

      Version-Release number of selected component (if applicable):

      4.13.0

      How reproducible:

      ~always

      Steps to Reproduce:

      1. oc delete -n openshift-service-ca pod <service-ca pod>
      

      Actual results:

      the old pod logs show:

      W1103 09:59:14.370594       1 builder.go:106] graceful termination failed, controllers failed with error: stopped

      When a new pod comes up to replace it, it has to wait a while (about two and a half minutes in the example below) before acquiring the leader lease:

      I1103 16:46:00.166173       1 leaderelection.go:248] attempting to acquire leader lease openshift-service-ca/service-ca-controller-lock...
       .... waiting ....
      I1103 16:48:30.004187       1 leaderelection.go:258] successfully acquired lease openshift-service-ca/service-ca-controller-lock
      

      Expected results:

      new pod can acquire the leader lease without waiting for the old pod's lease to expire

      Additional info:

       

              slaznick@redhat.com Stanislav Láznička (Inactive)
              openshift-crt-jira-prow OpenShift Prow Bot
              Giriyamma Karagere Ramaswamy (Inactive)