- Bug
- Resolution: Unresolved
- Critical
- None
- 2.11.1 GA, 2.12.0 GA, 2.13.2 GA
- 5
- False
- False
- Not Started
- Not Started
- Not Started
- Not Started
- Not Started
- Not Started
- RHOAM Sprint 42, RHOAM Sprint 43, RHOAM Sprint 44
The APIcast operator fails at random intervals and is then restarted, logging an error like this:
E0221 12:52:25.963270 1 leaderelection.go:361] Failed to update lock: resource name may not be empty
I0221 12:52:26.589096 1 leaderelection.go:278] failed to renew lease apicast-operator/988b4062.3scale.net: timed out waiting for the condition
{"level":"info","ts":1645447946.986577,"logger":"controller-runtime.manager.controller.apicast","msg":"Stopping workers","reconciler group":"apps.3scale.net","reconciler kind":"APIcast"}
{"level":"error","ts":1645447947.510864,"logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/deps/gomod/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nmain.main\n\t/remote-source/app/main.go:102\nruntime.main\n\t/opt/rh/go-toolset-1.13/root/usr/lib/go-toolset-1.13-golang/src/runtime/proc.go:203"}
The failure happens at random intervals (it might be a race condition). OpenShift restarts the pod, so it is not a blocking issue; however, it causes instability while the operator restarts.
We have seen more than 400 restarts in just over 14 days. It might be an infrastructure problem on our side, but since we are not seeing this behaviour with any other operator, including the 3scale operator, I do not think it is.
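For context on the error above: "leader election lost" means the manager could not renew its lease (apicast-operator/988b4062.3scale.net) before the renew deadline expired, at which point controller-runtime stops the workers and the process exits, and OpenShift restarts the pod. The sketch below is only an illustration of how a controller-runtime based operator typically wires up leader election and where the lease timings could be relaxed while investigating; it is not the apicast-operator's actual code, the timing values are assumptions chosen for illustration, and only the LeaderElectionID matches the lease name in the log. Exact option names and the Start signature can vary between controller-runtime versions.

package main

import (
	"os"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	ctrl.SetLogger(zap.New())

	// Illustrative lease timings, more tolerant of a slow or briefly
	// unreachable API server than typical defaults (values are assumptions).
	leaseDuration := 60 * time.Second // how long non-leaders wait before trying to acquire the lease
	renewDeadline := 45 * time.Second // how long the leader keeps retrying renewal before giving up
	retryPeriod := 10 * time.Second   // how often leader election actions are retried

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:   true,
		LeaderElectionID: "988b4062.3scale.net", // lease name seen in the log above
		LeaseDuration:    &leaseDuration,
		RenewDeadline:    &renewDeadline,
		RetryPeriod:      &retryPeriod,
	})
	if err != nil {
		ctrl.Log.Error(err, "unable to create manager")
		os.Exit(1)
	}

	// Reconcilers would be registered here; when renewal fails past the
	// renew deadline, Start returns "leader election lost" as in the log.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		ctrl.Log.Error(err, "problem running manager")
		os.Exit(1)
	}
}

If the restarts are caused by transient API server latency rather than a genuine race, longer lease/renew windows like these would make the leader less likely to give up its lease, at the cost of a slower failover when the pod really does die.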