Uploaded image for project: 'OpenShift Bugs'
  1. OpenShift Bugs
  2. OCPBUGS-38894

Image registry unable to run due to permissions error

XMLWordPrintable

    • Critical
    • Yes
    • Approved
    • False
    • Hide

      None

      Show
      None
    • Release Note Not Required
    • In Progress

      This is a clone of issue OCPBUGS-38842. The following is the description of the original issue:

      Component Readiness has found a potential regression in the following test:

      [sig-cluster-lifecycle] pathological event should not see excessive Back-off restarting failed containers for ns/openshift-image-registry

      Probability of significant regression: 98.02%

      Sample (being evaluated) Release: 4.17
      Start Time: 2024-08-15T00:00:00Z
      End Time: 2024-08-22T23:59:59Z
      Success Rate: 94.74%
      Successes: 180
      Failures: 10
      Flakes: 0

      Base (historical) Release: 4.16
      Start Time: 2024-05-31T00:00:00Z
      End Time: 2024-06-27T23:59:59Z
      Success Rate: 100.00%
      Successes: 89
      Failures: 0
      Flakes: 0

      View the test details report at https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?Architecture=amd64&Architecture=amd64&FeatureSet=default&FeatureSet=default&Installer=ipi&Installer=ipi&Network=ovn&Network=ovn&NetworkAccess=default&Platform=aws&Platform=aws&Scheduler=default&SecurityMode=default&Suite=unknown&Suite=unknown&Topology=ha&Topology=ha&Upgrade=micro&Upgrade=micro&baseEndTime=2024-06-27%2023%3A59%3A59&baseRelease=4.16&baseStartTime=2024-05-31%2000%3A00%3A00&capability=Other&columnGroupBy=Platform%2CArchitecture%2CNetwork&component=Image%20Registry&confidence=95&dbGroupBy=Platform%2CArchitecture%2CNetwork%2CTopology%2CFeatureSet%2CUpgrade%2CSuite%2CInstaller&environment=amd64%20default%20ipi%20ovn%20aws%20unknown%20ha%20micro&ignoreDisruption=true&ignoreMissing=false&includeVariant=Architecture%3Aamd64&includeVariant=FeatureSet%3Adefault&includeVariant=Installer%3Aipi&includeVariant=Installer%3Aupi&includeVariant=Owner%3Aeng&includeVariant=Platform%3Aaws&includeVariant=Platform%3Aazure&includeVariant=Platform%3Agcp&includeVariant=Platform%3Ametal&includeVariant=Platform%3Avsphere&includeVariant=Topology%3Aha&minFail=3&pity=5&sampleEndTime=2024-08-22%2023%3A59%3A59&sampleRelease=4.17&sampleStartTime=2024-08-15%2000%3A00%3A00&testId=openshift-tests-upgrade%3A10a9e2be27aa9ae799fde61bf8c992f6&testName=%5Bsig-cluster-lifecycle%5D%20pathological%20event%20should%20not%20see%20excessive%20Back-off%20restarting%20failed%20containers%20for%20ns%2Fopenshift-image-registry

      Also hitting 4.17, I've aligned this bug to 4.18 so the backport process is cleaner.

      The problem appears to be a permissions error preventing the pods from starting:

      2024-08-22T06:14:14.743856620Z ln: failed to create symbolic link '/etc/pki/ca-trust/extracted/pem/directory-hash/ca-certificates.crt': Permission denied
      

      Originating from this code: https://github.com/openshift/cluster-image-registry-operator/blob/master/pkg/resource/podtemplatespec.go#L489

      Both 4.17 and 4.18 nightlies bumped rhcos and in there is an upgrade like this:

      container-selinux-3-2.231.0-1.rhaos4.16.el9-noarch container-selinux-3-2.231.0-2.rhaos4.17.el9-noarch

      With slightly different versions in each stream, but both were on 3-2.231.

      Hits other tests too:

      operator conditions image-registry
      Operator upgrade image-registry
      [sig-cluster-lifecycle] Cluster completes upgrade
      [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial]
      [sig-arch][Feature:ClusterUpgrade] Cluster should be upgradeable after finishing upgrade [Late][Suite:upgrade]
      

              fmissi Flavian Missi
              openshift-crt-jira-prow OpenShift Prow Bot
              XiuJuan Wang XiuJuan Wang
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Created:
                Updated:
                Resolved: