  1. Red Hat 3scale API Management
  2. THREESCALE-8205

Apicast Operator fails due to leader election lost


    • RHOAM Sprint 42, RHOAM Sprint 43, RHOAM Sprint 44

      The APIcast operator fails at random intervals and is subsequently restarted, logging an error like this:

      E0221 12:52:25.963270       1 leaderelection.go:361] Failed to update lock: resource name may not be empty
      I0221 12:52:26.589096       1 leaderelection.go:278] failed to renew lease apicast-operator/988b4062.3scale.net: timed out waiting for the condition
      {"level":"info","ts":1645447946.986577,"logger":"controller-runtime.manager.controller.apicast","msg":"Stopping workers","reconciler group":"apps.3scale.net","reconciler kind":"APIcast"}
      {"level":"error","ts":1645447947.510864,"logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/deps/gomod/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nmain.main\n\t/remote-source/app/main.go:102\nruntime.main\n\t/opt/rh/go-toolset-1.13/root/usr/lib/go-toolset-1.13-golang/src/runtime/proc.go:203"}
      

      It happens at random intervals (possibly a race condition). OpenShift restarts the pod, so it is not a blocking issue, but it causes instability while the operator restarts.
      We have seen more than 400 restarts in just over 14 days. It might be an infrastructure problem on our side, but since we do not see this behaviour with any other operator, including the 3scale operator, I do not think it is.

              phala@redhat.com Petr Hála (Inactive)