Feature
Resolution: Unresolved
Critical
1.36.0
None
False
False
Test and Release 1.37
While the Operator has worked with no issues in all known installations, we have received a report from Amadeus saying that in their installation the operator pod was restarting with the following messages:
E0710 05:21:26.456724 1 leaderelection.go:429] Failed to update lock optimistically: Put "https://10.226.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-serverless-logic/leases/1be5e57d.kie.org": context deadline exceeded, falling back to slow path
E0710 05:21:26.456840 1 leaderelection.go:436] error retrieving resource lock openshift-serverless-logic/1be5e57d.kie.org: client rate limiter Wait returned an error: context deadline exceeded
I0710 05:21:26.456859 1 leaderelection.go:297] failed to renew lease openshift-serverless-logic/1be5e57d.kie.org: timed out waiting for the condition
See: https://access.redhat.com/support/cases/#/case/04193832
After investigation, these errors might happen in busy clusters because the default values of the cluster leader election lease configuration are too low.
This PR sets these values to production-ready defaults and also provides the ability to configure them via flags, which should help with fine-tuning particular customer installations if needed.
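For reference, below is a minimal sketch of how lease settings can be exposed as flags and wired into a controller-runtime manager. The flag names and duration values are illustrative assumptions, not the exact names or defaults introduced by this PR; only the leader election ID (1be5e57d.kie.org) comes from the log above.

```go
package main

import (
	"flag"
	"os"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Hypothetical flag names and default durations; the PR may use
	// different names and values. Longer lease/renew windows make the
	// leader more tolerant of slow API server responses in busy clusters.
	var leaseDuration, renewDeadline, retryPeriod time.Duration
	flag.DurationVar(&leaseDuration, "leader-elect-lease-duration", 60*time.Second,
		"How long non-leader candidates wait before trying to acquire leadership.")
	flag.DurationVar(&renewDeadline, "leader-elect-renew-deadline", 40*time.Second,
		"How long the acting leader retries refreshing its lease before giving up.")
	flag.DurationVar(&retryPeriod, "leader-elect-retry-period", 5*time.Second,
		"How long leader election clients wait between retries of actions.")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:   true,
		LeaderElectionID: "1be5e57d.kie.org",
		LeaseDuration:    &leaseDuration,
		RenewDeadline:    &renewDeadline,
		RetryPeriod:      &retryPeriod,
	})
	if err != nil {
		os.Exit(1)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```

In an installed operator these flags would typically be passed as container args on the operator Deployment, so cluster administrators can tune them without rebuilding the image.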