OCPBUGS-1670

Precache container image in error state in managed clusters


      Description of problem:

      When TALM is configured to run precaching before an upgrade, the job created on the spoke clusters runs a container that always ends in the Error state, with this output:
      
      time="2022-09-19T14:44:07Z" level=warning msg="Failed to decode the keys [\"storage.options.override_kernel_check\"] from \"/etc/containers/storage.conf\"."
      time="2022-09-19T14:44:12Z" level=error msg="While applying layer: ApplyLayer exit status 127 stdout:  stderr: storage-applyLayer: error while loading shared libraries: libsubid.so.3: cannot open shared object file: No such file or directory\n"
      Error: writing blob: adding layer with blob "sha256:f70d60810c69edad990aaf0977a87c6d2bcc9cd52904fa6825f08507a9b6e7bc": ApplyLayer exit status 127 stdout:  stderr: storage-applyLayer: error while loading shared libraries: libsubid.so.3: cannot open shared object file: No such file or directory
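
The output above already names the missing library. As a triage aid, a minimal sketch of pulling that name out of the pod log is shown below; the helper name `missing_libs` is hypothetical and is not part of TALM or `oc`. It only assumes the loader message format seen in this report.

```shell
#!/bin/sh
# Print each distinct shared library that the dynamic loader reported as
# missing, reading log text from standard input.
# missing_libs is a hypothetical helper name, not an existing tool.
missing_libs() {
    # grep exits non-zero when nothing matches; treat that as "no output".
    { grep -oE 'error while loading shared libraries: [^:]+' || true; } \
        | sed 's/.*: //' \
        | sort -u
}

# Demo on the error line quoted in this report.
missing_libs <<'EOF'
time="2022-09-19T14:44:12Z" level=error msg="While applying layer: ApplyLayer exit status 127 stdout:  stderr: storage-applyLayer: error while loading shared libraries: libsubid.so.3: cannot open shared object file: No such file or directory\n"
EOF
# prints: libsubid.so.3
```

On a spoke cluster the same function could be fed from the failing pod, for example `oc logs pre-cache-4n2vf | missing_libs`.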

       

      Version-Release number of selected component (if applicable):

      4.11.z, 4.12.z

      How reproducible:

      Always

      Steps to Reproduce:

      1. Build the TALM operator from upstream bits and the recovery and precache images
      2. Create a policy to perform an upgrade on some of the managed clusters
      3. Create a CGU (ClusterGroupUpgrade) that enforces the policy with precaching enabled
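
For reference, step 3 can be sketched as a ClusterGroupUpgrade manifest along the following lines. This is an illustration under assumptions, not the manifest used in this report: the resource name, namespace, cluster name, policy name, and remediation values are all placeholders. Only `preCaching: true` reflects what the report describes.

```yaml
apiVersion: ran.openshift.io/v1alpha1
kind: ClusterGroupUpgrade
metadata:
  name: cgu-upgrade-example      # hypothetical name
  namespace: default             # hypothetical namespace
spec:
  clusters:
    - spoke1                     # hypothetical managed cluster
  managedPolicies:
    - ocp-upgrade-policy         # hypothetical policy created in step 2
  preCaching: true               # requests the pre-cache job on each spoke
  enable: false                  # assumed: hold remediation while precaching runs
  remediationStrategy:
    maxConcurrency: 1            # placeholder value
    timeout: 240                 # placeholder value
```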

      Actual results:

      The job creates the pod, but the pod always ends in the Error state:
      $ oc get pods
      NAME              READY   STATUS   RESTARTS   AGE
      pre-cache-4n2vf   0/1     Error    0          66s
      
      Here you can see the output logs:
      
      $ oc logs -f pre-cache-4n2vf
      time="2022-09-19T14:44:07Z" level=warning msg="Failed to decode the keys [\"storage.options.override_kernel_check\"] from \"/etc/containers/storage.conf\"."
      time="2022-09-19T14:44:12Z" level=error msg="While applying layer: ApplyLayer exit status 127 stdout: stderr: storage-applyLayer: error while loading shared libraries: libsubid.so.3: cannot open shared object file: No such file or directory\n"
      Error: writing blob: adding layer with blob "sha256:f70d60810c69edad990aaf0977a87c6d2bcc9cd52904fa6825f08507a9b6e7bc": ApplyLayer exit status 127 stdout: stderr: storage-applyLayer: error while loading shared libraries: libsubid.so.3: cannot open shared object file: No such file or directory
      
      The container image I am using is the official 4.11 build, but it also happens with 4.12:
      
      $ oc get pods -oyaml | grep image
      image: quay.io/openshift-kni/cluster-group-upgrades-operator-precache:4.11
      imagePullPolicy: IfNotPresent
      imagePullSecrets:
      image: quay.io/openshift-kni/cluster-group-upgrades-operator-precache:4.11
      imageID: quay.io/openshift-kni/cluster-group-upgrades-operator-precache@sha256:d7ae4b087f6ba2398dde10212577ac7d6e8e6b2dab9f5a34ddec0c3dbe024ab2

      Expected results:

      The precache container image runs successfully and downloads all the container images required for the upgrade task.

      Additional info:

      It looks like something changed in the UBI base image used to build the container:
      
      https://bugzilla.redhat.com/show_bug.cgi?id=2084179
      

              vgrinber@redhat.com Vitaly Grinberg
              alosadag@redhat.com Alberto Losada
              Yang Liu