OCPBUGS-43656

Image registry operator becomes degraded when setting management state to Removed when networkAccess is set to Internal


    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Major
    • 4.15.z
    • 4.18.0
    • Image Registry
    • No
    • Rejected
    • False
    • Release note text:
      Previously, when the Image Registry Operator was configured in Azure with networkAccess:Internal, you could not successfully set managementState to Removed in the Operator configuration. This issue was caused by an authorization error that occurred when the Operator started to delete the storage. With this release, the Operator successfully deletes the storage account, which automatically deletes the storage container. The managementState status in the Operator configuration is updated to the Removed state.
      ====
      Previously, when the image registry operator was configured with "networkAccess: Internal" in Azure, it would not be possible to successfully set "managementState" to "Removed" in the operator configuration due to an authorization error when the operator tried to delete the storage container. This update makes the operator continue with the deletion of the storage account, which automatically deletes the storage container, resulting in a successful change into "Removed" state.
    • Bug Fix
    • Done

      This is a clone of issue OCPBUGS-43555. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-43350. The following is the description of the original issue:

      This is a clone of issue OCPBUGS-42732. The following is the description of the original issue:

      Description of problem:

          The operator fails to remove its resources when managementState is set to Removed while networkAccess is set to Internal.
          It looks like the authorization error changes from bloberror.AuthorizationPermissionMismatch to bloberror.AuthorizationFailure after the storage account becomes private (networkAccess: Internal).
          This is caused by unexpected behavior either in the Azure SDK or in the Azure API itself.
          The easiest way to solve it is to also handle bloberror.AuthorizationFailure here: https://github.com/openshift/cluster-image-registry-operator/blob/master/pkg/storage/azure/azure.go?plain=1#L1145
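
The proposed handling can be sketched as follows. This is a minimal illustration in Python, not the operator's actual Go code: `StorageError`, `remove_storage`, and the two callables are hypothetical stand-ins, and only the two error code names come from this report. It assumes, as the release note states, that deleting the storage account also deletes the container.

```python
# Error codes named in this report. In the operator they are constants from the
# Azure SDK for Go (bloberror package); plain strings are used here instead.
TOLERATED_CODES = {"AuthorizationPermissionMismatch", "AuthorizationFailure"}


class StorageError(Exception):
    """Hypothetical stand-in for an Azure blob error that carries an error code."""

    def __init__(self, code, message=""):
        super().__init__(message or code)
        self.code = code


def remove_storage(delete_container, delete_account):
    """Delete the container, then the storage account.

    An authorization error on the container delete is tolerated, because
    deleting the account removes the container as well. Any other error
    is re-raised and the account is left untouched.
    """
    try:
        delete_container()
    except StorageError as err:
        if err.code not in TOLERATED_CODES:
            raise
    delete_account()
```

With only `AuthorizationPermissionMismatch` in the tolerated set, a private storage account that answers with `AuthorizationFailure` would abort the removal before the account delete is attempted, which matches the failure described here.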
      
          The error condition is the following:
      
      status:
        conditions:
        - lastTransitionTime: "2024-09-27T09:04:20Z"
          message: "Unable to delete storage container: DELETE https://imageregistrywxj927q6bpj.blob.core.windows.net/wxj-927d-jv8fc-image-registry-rwccleepmieiyukdxbhasjyvklsshhee\n--------------------------------------------------------------------------------\nRESPONSE
            403: 403 This request is not authorized to perform this operation.\nERROR CODE:
            AuthorizationFailure\n--------------------------------------------------------------------------------\n\uFEFF<?xml
            version=\"1.0\" encoding=\"utf-8\"?><Error><Code>AuthorizationFailure</Code><Message>This
            request is not authorized to perform this operation.\nRequestId:ababfe86-301e-0005-73bd-10d7af000000\nTime:2024-09-27T09:10:46.1231255Z</Message></Error>\n--------------------------------------------------------------------------------\n"
          reason: AzureError
          status: Unknown
          type: StorageExists
        - lastTransitionTime: "2024-09-27T09:02:26Z"
          message: The registry is removed
          reason: Removed
          status: "True"
          type: Available 
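
When triaging a status like the one above, it can help to pull the Azure error code out of the condition message. The following is a small sketch, assuming the conditions have already been parsed into a list of dictionaries (for example from the YAML output of `oc get`); the function name `azure_error_codes` is hypothetical, and the regular expression relies only on the `ERROR CODE:` line visible in the message above.

```python
import re

# Matches the "ERROR CODE: <code>" fragment of the condition message.
ERROR_CODE_RE = re.compile(r"ERROR CODE:\s*(\w+)")


def azure_error_codes(conditions):
    """Map condition type to Azure error code for conditions with reason AzureError."""
    codes = {}
    for cond in conditions:
        if cond.get("reason") != "AzureError":
            continue
        match = ERROR_CODE_RE.search(cond.get("message", ""))
        if match:
            codes[cond["type"]] = match.group(1)
    return codes
```

Applied to the two conditions shown above, only `StorageExists` has the reason `AzureError`, so it is the only entry the function would report.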

      Version-Release number of selected component (if applicable):

          4.18, 4.17, 4.16 (needs confirmation), 4.15 (needs confirmation)

      How reproducible:

          Always

      Steps to Reproduce:

          1. Get an Azure cluster
          2. In the operator config, set networkAccess to Internal
          3. Wait until the operator reconciles the change (watch networkAccess in the status with `oc get configs.imageregistry/cluster -o yaml | yq '.status.storage'`)
          4. In the operator config, set managementState to Removed: `oc patch configs.imageregistry/cluster -p '{"spec":{"managementState":"Removed"}}' --type=merge`
          5. Watch the cluster operator conditions for the error

      Actual results:

          

      Expected results:

          

      Additional info:

          

              fmissi Flavian Missi
              openshift-crt-jira-prow OpenShift Prow Bot
              XiuJuan Wang XiuJuan Wang