- Bug
- Resolution: Done-Errata
- Critical
- 4.14, 4.15
This is a clone of issue OCPBUGS-42933. The following is the description of the original issue:
—
This is a clone of issue OCPBUGS-42812. The following is the description of the original issue:
—
This is a clone of issue OCPBUGS-42514. The following is the description of the original issue:
—
Description of problem:
When the OpenShift image registry is configured to use a custom Azure storage account in a different resource group, following the official documentation [1], the image-registry cluster operator (CO) becomes degraded and the upgrade from version 4.14.x to 4.15.x fails. The image registry operator reports misconfiguration errors related to Azure storage credentials, which blocks the upgrade and causes instability in the control plane.
[1] Configuring registry storage in Azure user infrastructure
Version-Release number of selected component (if applicable):
4.14.33, 4.15.33
How reproducible:
- Set up ARO:
  - Deploy an ARO or OpenShift cluster on Azure, version 4.14.x.
- Configure Image Registry:
  - Follow the official documentation [1] to configure the image registry to use a custom Azure storage account located in a different resource group.
  - Ensure that the image-registry-private-configuration-user secret is created in the openshift-image-registry namespace.
  - Do not modify the installer-cloud-credentials secret.
  - Check the image registry CO status.
- Initiate Upgrade:
  - Attempt to upgrade the cluster to OpenShift version 4.15.x.
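The "check the image registry CO status" step above can be scripted. The following is a minimal sketch, assuming the JSON comes from `oc get clusteroperator image-registry -o json` and that "healthy" means Available=True and Degraded=False. The function names and the sample document are hypothetical and only illustrate the shape of `status.conditions`.

```python
import json


def summarize_conditions(clusteroperator_json: str) -> dict:
    """Map each condition type to a (status, message) tuple."""
    obj = json.loads(clusteroperator_json)
    conditions = obj.get("status", {}).get("conditions", [])
    return {c["type"]: (c["status"], c.get("message", "")) for c in conditions}


def is_healthy(summary: dict) -> bool:
    """Assumed rule of thumb: Available must be True and Degraded must be False."""
    available = summary.get("Available", ("Unknown", ""))[0]
    degraded = summary.get("Degraded", ("Unknown", ""))[0]
    return available == "True" and degraded == "False"


# Hypothetical, trimmed-down ClusterOperator document for illustration only.
SAMPLE = json.dumps({
    "status": {
        "conditions": [
            {"type": "Available", "status": "True", "message": ""},
            {"type": "Progressing", "status": "True",
             "message": "Unable to apply resources: unable to sync storage configuration"},
            {"type": "Degraded", "status": "True", "message": ""},
        ]
    }
})

summary = summarize_conditions(SAMPLE)
print("healthy:", is_healthy(summary))
print("Progressing message:", summary["Progressing"][1])
```

Running the same check before and after the upgrade attempt makes it easier to see which condition carries the storage configuration error.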
Steps to Reproduce:
1. With the image-registry-private-configuration-user secret in place and the installer-cloud-credentials secret unmodified, the image-registry CO reports the following error:
NodeCADaemonProgressing: The daemon set node-ca is deployed Progressing: Unable to apply resources: unable to sync storage configuration: client misconfigured, missing 'TenantID', 'ClientID', 'ClientSecret', 'FederatedTokenFile', 'Creds', 'SubscriptionID' option(s)
The operator also generates a new secret, image-registry-private-configuration, with the same content as image-registry-private-configuration-user:
$ oc get secret image-registry-private-configuration -o yaml
apiVersion: v1
data:
  REGISTRY_STORAGE_AZURE_ACCOUNTKEY: xxxxxxxxxxxxxxxxx
kind: Secret
metadata:
  annotations:
    imageregistry.operator.openshift.io/checksum: sha256:524fab8dd71302f1a9ade9b152b3f9576edb2b670752e1bae1cb49b4de992eee
  creationTimestamp: "2024-09-26T19:52:17Z"
  name: image-registry-private-configuration
  namespace: openshift-image-registry
  resourceVersion: "126426"
  uid: e2064353-2511-4666-bd43-29dd020573fe
type: Opaque
2. Delete the secret image-registry-private-configuration-user.
The secret image-registry-private-configuration still exists with the same content, but the image-registry CO now reports a new error:
NodeCADaemonProgressing: The daemon set node-ca is deployed Progressing: Unable to apply resources: unable to sync storage configuration: failed to get keys for the storage account arojudesa: storage.AccountsClient#ListKeys: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="ResourceNotFound" Message="The Resource 'Microsoft.Storage/storageAccounts/arojudesa' under resource group 'aro-ufjvmbl1' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"
3. Apply the workaround: manually change the azure_resourcegroup key in the installer-cloud-credentials secret to the resource group of the custom storage account.
$ oc get secret installer-cloud-credentials -o yaml
apiVersion: v1
data:
  azure_client_id: xxxxxxxxxxxxxxxxx
  azure_client_secret: xxxxxxxxxxxxxxxxx
  azure_region: xxxxxxxxxxxxxxxxx
  azure_resource_prefix: xxxxxxxxxxxxxxxxx
  azure_resourcegroup: xxxxxxxxxxxxxxxxx   <<<<<-----THIS
  azure_subscription_id: xxxxxxxxxxxxxxxxx
  azure_tenant_id: xxxxxxxxxxxxxxxxx
kind: Secret
metadata:
  annotations:
    cloudcredential.openshift.io/credentials-request: openshift-cloud-credential-operator/openshift-image-registry-azure
  creationTimestamp: "2024-09-26T16:49:57Z"
  labels:
    cloudcredential.openshift.io/credentials-request: "true"
  name: installer-cloud-credentials
  namespace: openshift-image-registry
  resourceVersion: "133921"
  uid: d1268e2c-1825-49f0-aa44-d0e1cbcda383
type: Opaque
The image-registry CO then reports healthy, which allows the upgrade to continue.
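The workaround in step 3 edits a value under the secret's `data` field, which Kubernetes stores base64-encoded. The sketch below shows one way to build a JSON merge patch for the azure_resourcegroup key. The helper name and the resource group name "custom-storage-rg" are hypothetical, and this is an illustration of the manual workaround, not a supported fix.

```python
import base64
import json


def build_resourcegroup_patch(resource_group: str) -> str:
    """Return a JSON merge patch that sets azure_resourcegroup in a Secret.

    Secret `data` values are base64-encoded, so the plain resource group
    name is encoded before it is placed in the patch.
    """
    encoded = base64.b64encode(resource_group.encode("utf-8")).decode("ascii")
    return json.dumps({"data": {"azure_resourcegroup": encoded}})


# "custom-storage-rg" is a placeholder for the storage account's resource group.
print(build_resourcegroup_patch("custom-storage-rg"))
```

The printed patch could then be applied with `oc patch secret installer-cloud-credentials -n openshift-image-registry --type merge -p '<patch>'`. As noted under Additional info, this manual change has to be repeated after every upgrade.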
Actual results:
The image registry still seems to use service principal authentication for the Azure storage account.
Expected results:
We expect REGISTRY_STORAGE_AZURE_ACCOUNTKEY to be the only thing the image registry operator needs for storage account authentication when the customer provides it.
- The image registry continues to function using the custom Azure storage account in the different resource group.
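The expected behavior can be sketched as a simple precedence rule: a user-supplied account key wins, and service principal fields are required only when no key is present. This is a hypothetical Python illustration, not the operator's actual code (the operator is written in Go); the function name and the list of required keys are assumptions based on the secrets shown in this report.

```python
from typing import Mapping

# Service principal keys as they appear in installer-cloud-credentials
# (assumed here to be the required set, for illustration only).
SERVICE_PRINCIPAL_KEYS = (
    "azure_client_id",
    "azure_client_secret",
    "azure_tenant_id",
    "azure_subscription_id",
)


def choose_auth_mode(user_secret: Mapping[str, str],
                     cloud_credentials: Mapping[str, str]) -> str:
    """Pick the storage authentication mode (hypothetical logic)."""
    # A user-provided account key is sufficient on its own.
    if user_secret.get("REGISTRY_STORAGE_AZURE_ACCOUNTKEY"):
        return "account-key"
    # Otherwise every service principal field must be present.
    missing = [k for k in SERVICE_PRINCIPAL_KEYS if not cloud_credentials.get(k)]
    if missing:
        raise ValueError(f"client misconfigured, missing {missing}")
    return "service-principal"


print(choose_auth_mode({"REGISTRY_STORAGE_AZURE_ACCOUNTKEY": "xxxx"}, {}))
```

Under this rule, the first error in Steps to Reproduce would not occur, because the account key alone would be enough to reach the storage account.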
Additional info:
- Reproducibility: The issue is consistently reproducible by following the official documentation to configure the image registry with a custom storage account in a different resource group and then attempting an upgrade.
- Related Issues:
  - Similar problems have been reported in previous incidents, suggesting a systemic issue with the image registry operator's handling of Azure storage credentials.
- Critical Customer Impact: Customers are required to perform manual interventions after every upgrade for each cluster, which is not sustainable and leads to operational overhead.
Slack : https://redhat-internal.slack.com/archives/CCV9YF9PD/p1727379313014789
- blocks: OCPBUGS-42935 Errors when the image registry is configured to use a custom Azure storage account located in a different resource group blocked the upgrade (Closed)
- clones: OCPBUGS-42933 Errors when the image registry is configured to use a custom Azure storage account located in a different resource group blocked the upgrade (Closed)
- is blocked by: OCPBUGS-42933 Errors when the image registry is configured to use a custom Azure storage account located in a different resource group blocked the upgrade (Closed)
- is cloned by: OCPBUGS-42935 Errors when the image registry is configured to use a custom Azure storage account located in a different resource group blocked the upgrade (Closed)
- links to: RHBA-2024:8425 OpenShift Container Platform 4.15.z bug fix update