  1. OpenShift Bugs
  2. OCPBUGS-29637

image-registry co is degraded on Azure MAG, Azure Stack Hub cloud or with azure workload identity

Details

    • Critical
    • Yes
    • Approved
    • False
    • Release Note Not Required
    • In Progress

    Description

      Description of problem:

      When installing an IPI cluster from a 4.15 nightly build on Azure MAG (Microsoft Azure Government), on Azure Stack Hub, or with Azure workload identity, the image-registry cluster operator (co) ends up degraded, with a different error in each environment.
      
      On MAG:
      $ oc get co image-registry
      NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      image-registry   4.15.0-0.nightly-2024-02-16-235514   True        False         True       5h44m   AzurePathFixControllerDegraded: Migration failed: panic: Get "https://imageregistryjima41xvvww.blob.core.windows.net/jima415a-hfxfh-image-registry-vbibdmawmsvqckhvmmiwisebryohfbtm?comp=list&prefix=docker&restype=container": dial tcp: lookup imageregistryjima41xvvww.blob.core.windows.net on 172.30.0.10:53: no such host...
      
      $ oc get pod -n openshift-image-registry
      NAME                                               READY   STATUS    RESTARTS        AGE
      azure-path-fix-ssn5w                               0/1     Error     0               5h47m
      cluster-image-registry-operator-86cdf775c7-7brn6   1/1     Running   1 (5h50m ago)   5h58m
      image-registry-5c6796b86d-46lvx                    1/1     Running   0               5h47m
      image-registry-5c6796b86d-9st5d                    1/1     Running   0               5h47m
      node-ca-48lsh                                      1/1     Running   0               5h44m
      node-ca-5rrsl                                      1/1     Running   0               5h47m
      node-ca-8sc92                                      1/1     Running   0               5h47m
      node-ca-h6trz                                      1/1     Running   0               5h47m
      node-ca-hm7s2                                      1/1     Running   0               5h47m
      node-ca-z7tv8                                      1/1     Running   0               5h44m
      
      $ oc logs azure-path-fix-ssn5w -n openshift-image-registry
      panic: Get "https://imageregistryjima41xvvww.blob.core.windows.net/jima415a-hfxfh-image-registry-vbibdmawmsvqckhvmmiwisebryohfbtm?comp=list&prefix=docker&restype=container": dial tcp: lookup imageregistryjima41xvvww.blob.core.windows.net on 172.30.0.10:53: no such host
      goroutine 1 [running]:
      main.main()
          /go/src/github.com/openshift/cluster-image-registry-operator/cmd/move-blobs/main.go:49 +0x125
      
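      The blob storage endpoint appears to be incorrect: the job uses the public-cloud suffix blob.core.windows.net, whereas the storage account's endpoints on MAG use core.usgovcloudapi.net: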
      $ az storage account show -n imageregistryjima41xvvww -g jima415a-hfxfh-rg --query primaryEndpoints
      {
        "blob": "https://imageregistryjima41xvvww.blob.core.usgovcloudapi.net/",
        "dfs": "https://imageregistryjima41xvvww.dfs.core.usgovcloudapi.net/",
        "file": "https://imageregistryjima41xvvww.file.core.usgovcloudapi.net/",
        "internetEndpoints": null,
        "microsoftEndpoints": null,
        "queue": "https://imageregistryjima41xvvww.queue.core.usgovcloudapi.net/",
        "table": "https://imageregistryjima41xvvww.table.core.usgovcloudapi.net/",
        "web": "https://imageregistryjima41xvvww.z2.web.core.usgovcloudapi.net/"
      }
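The mismatch above suggests the endpoint suffix has to be chosen per cloud rather than hard-coded. The following is a minimal sketch of that idea, not the operator's actual code: `blobEndpoint` and `storageSuffixes` are hypothetical names, and the cloud names are assumed to follow the usual Azure environment naming. Azure Stack Hub is deliberately left out of the table because its suffix is specific to each deployment.

```go
package main

import (
	"fmt"
	"strings"
)

// storageSuffixes maps Azure cloud names to their storage endpoint suffixes.
// Azure Stack Hub is absent on purpose: its suffix is deployment-specific and
// would have to be supplied by the cluster configuration.
var storageSuffixes = map[string]string{
	"AzurePublicCloud":       "core.windows.net",
	"AzureUSGovernmentCloud": "core.usgovcloudapi.net",
	"AzureChinaCloud":        "core.chinacloudapi.cn",
}

// blobEndpoint (hypothetical helper) builds the blob service URL for a storage
// account in the given cloud. An empty cloud name falls back to the public cloud.
func blobEndpoint(account, cloudName string) (string, error) {
	if cloudName == "" {
		cloudName = "AzurePublicCloud"
	}
	for name, suffix := range storageSuffixes {
		if strings.EqualFold(name, cloudName) {
			return fmt.Sprintf("https://%s.blob.%s/", account, suffix), nil
		}
	}
	return "", fmt.Errorf("unknown cloud %q: storage suffix must be supplied explicitly", cloudName)
}

func main() {
	url, err := blobEndpoint("imageregistryjima41xvvww", "AzureUSGovernmentCloud")
	if err != nil {
		panic(err)
	}
	// Prints the same blob endpoint that `az storage account show` reports above.
	fmt.Println(url)
}
```

Returning an error for unknown clouds, instead of silently defaulting to the public suffix, would surface a misconfiguration before any DNS lookup is attempted.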
      
      On Azure Stack Hub:
      $ oc get co image-registry
      NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      image-registry   4.15.0-0.nightly-2024-02-16-235514   True        False         True       3h32m   AzurePathFixControllerDegraded: Migration failed: panic: open : no such file or directory...
      
      $ oc get pod -n openshift-image-registry
      NAME                                               READY   STATUS    RESTARTS        AGE
      azure-path-fix-8jdg7                               0/1     Error     0               3h35m
      cluster-image-registry-operator-86cdf775c7-jwnd4   1/1     Running   1 (3h38m ago)   3h54m
      image-registry-658669fbb4-llv8z                    1/1     Running   0               3h35m
      image-registry-658669fbb4-lmfr6                    1/1     Running   0               3h35m
      node-ca-2jkjx                                      1/1     Running   0               3h35m
      node-ca-dcg2v                                      1/1     Running   0               3h35m
      node-ca-q6xmn                                      1/1     Running   0               3h35m
      node-ca-r46r2                                      1/1     Running   0               3h35m
      node-ca-s8jkb                                      1/1     Running   0               3h35m
      node-ca-ww6ql                                      1/1     Running   0               3h35m
      
      $ oc logs azure-path-fix-8jdg7 -n openshift-image-registry
      panic: open : no such file or directory
      goroutine 1 [running]:
      main.main()
          /go/src/github.com/openshift/cluster-image-registry-operator/cmd/move-blobs/main.go:36 +0x145
      
      On a cluster with Azure workload identity:
      Some operators report PROGRESSING as True, for example:
      image-registry                             4.15.0-0.nightly-2024-02-16-235514   True        True          False      43m     Progressing: The deployment has not completed...
      
      The azure-path-fix pod is in CreateContainerConfigError status and reports the following error:
      
      "state": {
          "waiting": {
              "message": "couldn't find key REGISTRY_STORAGE_AZURE_ACCOUNTKEY in Secret openshift-image-registry/image-registry-private-configuration",
              "reason": "CreateContainerConfigError"
          }
      }                
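The error indicates that the pod requires the key REGISTRY_STORAGE_AZURE_ACCOUNTKEY, which the secret does not contain on this workload identity cluster. A minimal sketch of checking for the key before depending on it follows. `authMode` is a hypothetical helper and the mode names are illustrative; this is not the operator's actual logic.

```go
package main

import "fmt"

// accountKeyName is the key named in the error message above.
const accountKeyName = "REGISTRY_STORAGE_AZURE_ACCOUNTKEY"

// authMode (hypothetical helper) inspects the secret data and reports which
// authentication mode the job should use. A missing or empty account key is
// treated as "no key available" rather than as a fatal error.
func authMode(secretData map[string][]byte) string {
	if key, ok := secretData[accountKeyName]; ok && len(key) > 0 {
		return "account-key"
	}
	return "workload-identity"
}

func main() {
	withKey := map[string][]byte{accountKeyName: []byte("example")}
	withoutKey := map[string][]byte{}

	fmt.Println(authMode(withKey))    // account-key
	fmt.Println(authMode(withoutKey)) // workload-identity
}
```

With a check like this, the pod spec would reference the account key only when it exists, so the container could start on clusters whose secret has no such key.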

      Version-Release number of selected component (if applicable):

      4.15.0-0.nightly-2024-02-16-235514    

      How reproducible:

          Always

      Steps to Reproduce:

          1. Install an IPI cluster on MAG or Azure Stack Hub, or with Azure workload identity configured.

      Actual results:

          Installation fails and the image-registry operator is degraded.

      Expected results:

          Installation is successful.

      Additional info:

          The issue seems to be related to https://github.com/openshift/image-registry/pull/393

            People

              fmissi Flavian Missi
              jinyunma Jinyun Ma
              Wen Wang Wen Wang