- Bug
- Resolution: Done-Errata
- Critical
- 4.19.0
- Quality / Stability / Reliability
- False
- None
- Critical
- Yes
- None
- Approved
- None
- In Progress
- Release Note Not Required
- None
- None
- None
- None
- None
Description of problem:
During Azure Stack installs, both Terraform-based and CAPZ-based (WIP), the ingress operator goes degraded:
The "default" ingress controller reports Available=False: IngressControllerUnavailable: One or more status conditions indicate unavailable: LoadBalancerReady=False (SyncLoadBalancerFailed: The service-controller component is reporting SyncLoadBalancerFailed events like: Error syncing load balancer: failed to ensure load balancer: EnsureLoadBalancer failed due to fail to lock azure resources. This may because another component is trying to update azure resources, e.g., load balancers. This will be automatically retried by cloud provider exponentially: leases.coordination.k8s.io "aks-managed-resource-locker" is forbidden: User "system:serviceaccount:kube-system:azure-cloud-provider" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"...
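For triage, the forbidden error above can be broken down into the exact permission that is missing. The following is a minimal sketch, not the actual fix: the regular expression is an assumption based only on the single message quoted in this report, and the helper names (`parse_forbidden`, `service_account`, `role_sketch`) and the role name `lease-access-sketch` are hypothetical. It only lists the verb the message names; the cloud provider's locking code may well need additional verbs on the lease.

```python
import json
import re

# Shape of the "forbidden" message, inferred from the one error quoted in this
# report (an assumption, not a documented format).
FORBIDDEN_RE = re.compile(
    r'(?P<qualified>[\w.-]+) "(?P<name>[^"]+)" is forbidden: '
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<kind>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(message):
    """Return the fields of the first forbidden error in message, or None."""
    match = FORBIDDEN_RE.search(message)
    return match.groupdict() if match else None


def service_account(user):
    """Split 'system:serviceaccount:<namespace>:<name>' into (namespace, name)."""
    parts = user.split(":")
    if len(parts) == 4 and parts[:2] == ["system", "serviceaccount"]:
        return parts[2], parts[3]
    return None


def role_sketch(denial, role_name="lease-access-sketch"):
    """Build a Role-shaped dict covering only the denied verb (hypothetical name)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": denial["namespace"]},
        "rules": [
            {
                "apiGroups": [denial["group"]],
                "resources": [denial["kind"]],
                "resourceNames": [denial["name"]],
                "verbs": [denial["verb"]],
            }
        ],
    }


if __name__ == "__main__":
    # The tail of the error message from this report.
    msg = (
        'leases.coordination.k8s.io "aks-managed-resource-locker" is forbidden: '
        'User "system:serviceaccount:kube-system:azure-cloud-provider" cannot get '
        'resource "leases" in API group "coordination.k8s.io" in the namespace '
        '"kube-system"'
    )
    denial = parse_forbidden(msg)
    print(service_account(denial["user"]))
    print(json.dumps(role_sketch(denial), indent=2))
```

The printed structure is only a starting point for comparing against the RBAC that the cluster actually ships for the `azure-cloud-provider` service account.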
Version-Release number of selected component (if applicable):
4.19ec3
How reproducible:
Always
Steps to Reproduce:
1. Use the WIP cloud-provider-azure fix: https://github.com/openshift/cloud-provider-azure/pull/141
2. Install to Azure Stack.
3. Check the ingress cluster operator (co).
Actual results:
The operator is degraded with the RBAC error above.
Expected results:
The operator is available.
Additional info:
This upstream change looks related:
https://github.com/kubernetes-sigs/cloud-provider-azure/pull/7344
As mentioned in the steps to reproduce, I have a WIP fix for https://issues.redhat.com/browse/OCPBUGS-51090, which then results in this bug. If RBAC is related to node labels, these two bugs could be related. I think they are separate bugs, but it is worth mentioning.
Our Azure Stack environment is inaccessible to CI right now due to a new security policy from the provider. The provider is meeting today to put together a plan to move forward. I am happy to facilitate or test in any way; I just don't know how to fix this bug.
Must-gather attached.
- links to
- RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update