OCPBUGS-34488

An index image built with the latest opm causes the catalog pod to enter CrashLoopBackOff


    • Moderate
    • No
    • 1
    • OSDOCS Sprint 254, OSDOCS Sprint 255, OSDOCS Sprint 256
    • 3
    • Rejected
    • False

      Description of problem:

      If the index image is built from quay.io/operator-framework/opm:latest and the CatalogSource is configured with extractContent, the catalog pod goes into CrashLoopBackOff.
      
      A warning message needs to be added for this situation.
      
      https://access.redhat.com/documentation/en-us/openshift_container_platform/4.15/html-single/cli_tools/index#cli-opm-ref
      
      
      
      
      xzha@xzha1-mac openshift-tests-private % oc logs test-index-3-5ldcd
      Defaulted container "registry-server" out of: registry-server, extract-utilities (init), extract-content (init)
      time="2024-05-27T07:21:29Z" level=info msg="starting pprof endpoint" address="localhost:6060"
      time="2024-05-27T07:21:29Z" level=fatal msg="cache directory has unexpected contents"    
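The fatal line in the log above is what identifies this failure. As a minimal sketch, assuming the log text has already been captured (for example, from `oc logs`), the following Python collects every `level=fatal` message. The names `LOG_LINE`, `fatal_messages`, and `SAMPLE_LOG` are illustrative only and are not part of opm or oc.

```python
import re

# Matches the two fields of interest in the log lines shown above:
# level=<word> followed by msg="<text>".
LOG_LINE = re.compile(r'level=(?P<level>\w+)\s+msg="(?P<msg>[^"]*)"')


def fatal_messages(log_text):
    """Return the msg field of every level=fatal line in log_text."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match and match.group("level") == "fatal":
            found.append(match.group("msg"))
    return found


# Sample input copied from the registry-server log in this report.
SAMPLE_LOG = '''\
time="2024-05-27T07:21:29Z" level=info msg="starting pprof endpoint" address="localhost:6060"
time="2024-05-27T07:21:29Z" level=fatal msg="cache directory has unexpected contents"
'''

if __name__ == "__main__":
    for msg in fatal_messages(SAMPLE_LOG):
        print("fatal:", msg)
```

A check like this could be run against the logs of any catalog pod that is restarting, to tell this cache failure apart from other causes of CrashLoopBackOff.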

      Version-Release number of selected component (if applicable):

      xzha@xzha1-mac openshift-tests-private % oc get clusterversion
      NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.15.15   True        False         4h48m   Cluster version is 4.15.15  

      How reproducible:

          Always

      Steps to Reproduce:

          1. Create the index image.
      
      xzha@xzha1-mac 27672 % opm alpha list bundles catalog
      PACKAGE         CHANNEL     BUNDLE                 REPLACES  SKIPS  SKIP RANGE  IMAGE
      nginx-operator  alpha       nginx-operator.v0.0.1                               quay.io/olmqe/nginxolm-operator-bundle:v0.0.1-multi
      nginx-operator  channel-v0  nginx-operator.v0.0.1                               quay.io/olmqe/nginxolm-operator-bundle:v0.0.1-multi
      xzha@xzha1-mac 27672 % opm validate catalog
      xzha@xzha1-mac 27672 % 
      
      
      xzha@xzha1-mac 27672 % cat catalog.Dockerfile 
      # The base image is expected to contain
      # /bin/opm (with a serve subcommand) and /bin/grpc_health_probe
      FROM quay.io/operator-framework/opm:latest
      
      
      # Configure the entrypoint and command
      ENTRYPOINT ["/bin/opm"]
      CMD ["serve", "/configs", "--cache-dir=/tmp/cache"]
      
      
      # Copy declarative config root into image at /configs and pre-populate serve cache
      ADD catalog /configs
      RUN ["/bin/opm", "serve", "/configs", "--cache-dir=/tmp/cache", "--cache-only"]
      
      
      # Set DC-specific label for the location of the DC root directory
      # in the image
      LABEL operators.operatorframework.io.index.configs.v1=/configs
      
      podman manifest create quay.io/openshifttest/nginxolm-operator-index:27672-test-3
      
      podman build --platform linux/amd64,linux/arm64,linux/ppc64le,linux/s390x  --manifest quay.io/openshifttest/nginxolm-operator-index:27672-test-3 . -f catalog.Dockerfile
      ...
      [linux/amd64] STEP 5/6: RUN ["/bin/opm", "serve", "/configs", "--cache-dir=/tmp/cache", "--cache-only"]
      time="2024-05-27T10:58:01Z" level=warning msg="unable to set termination log path" error="open /dev/termination-log: permission denied"
      time="2024-05-27T10:58:01Z" level=info msg="starting pprof endpoint" address="localhost:6060"
      time="2024-05-27T10:58:01Z" level=info msg="cache directory is empty, using preferred backend" backend=pogreb.v1 cache=/tmp/cache configs=/configs
      time="2024-05-27T10:58:01Z" level=info msg="building cache" cache=/tmp/cache configs=/configs
      ...
      
       podman manifest push quay.io/openshifttest/nginxolm-operator-index:27672-test-3 
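The build log above records which cache backend opm chose when it pre-populated /tmp/cache (`backend=pogreb.v1`). One possible explanation for the crash, which this report does not confirm, is that the opm binary that later serves the cache does not recognize what this backend wrote. As a rough sketch for recording that value at build time, the following Python splits a key=value log line into fields. `parse_logfmt` and `BUILD_LINE` are hypothetical names used for illustration.

```python
import shlex


def parse_logfmt(line):
    """Split a key=value log line into a dict.

    shlex.split handles quoted values, so msg="a b c" stays one field.
    Tokens without an '=' are ignored.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Line copied from the podman build output in this report.
BUILD_LINE = (
    'time="2024-05-27T10:58:01Z" level=info '
    'msg="cache directory is empty, using preferred backend" '
    'backend=pogreb.v1 cache=/tmp/cache configs=/configs'
)

if __name__ == "__main__":
    fields = parse_logfmt(BUILD_LINE)
    print("cache backend:", fields["backend"])
    print("cache dir:", fields["cache"])
```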
        
      
      2. Create the CatalogSource (catsrc).
      xzha@xzha1-mac openshift-tests-private % cat catsrc.yaml 
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: test-index-3
        namespace: test-2
      spec:
        grpcPodConfig:
          extractContent:
            cacheDir: /tmp/cache
            catalogDir: /configs
          memoryTarget: 30Mi
        displayName: Test
        publisher: OLM-QE
        sourceType: grpc
        image: quay.io/openshifttest/nginxolm-operator-index:27672-test-3 
        updateStrategy:
          registryPoll:
            interval: 10m
      
      xzha@xzha1-mac openshift-tests-private % oc get pod
      NAME                 READY   STATUS             RESTARTS     AGE
      test-index-3-5ldcd   0/1     CrashLoopBackOff   1 (3s ago)   11s
      
      xzha@xzha1-mac openshift-tests-private % oc logs test-index-3-5ldcd
      Defaulted container "registry-server" out of: registry-server, extract-utilities (init), extract-content (init)
      time="2024-05-27T07:21:29Z" level=info msg="starting pprof endpoint" address="localhost:6060"
      time="2024-05-27T07:21:29Z" level=fatal msg="cache directory has unexpected contents"
      
      xzha@xzha1-mac openshift-tests-private % oc describe pod  test-index-3-5ldcd
      Name:             test-index-3-5ldcd
      Namespace:        test-2
      Priority:         0
      Service Account:  test-index-3
      Node:             worker-1/192.168.111.24
      Start Time:       Mon, 27 May 2024 15:20:32 +0800
      Labels:           olm.catalogSource=test-index-3
                        olm.managed=true
                        olm.pod-spec-hash=9RIq3MfVcMI03CfEtdpZrRnfryNdSglkAYEZG2
      Annotations:      cluster-autoscaler.kubernetes.io/safe-to-evict: true
                        k8s.ovn.org/pod-networks:
                          {"default":{"ip_addresses":["10.129.2.112/23"],"mac_address":"0a:58:0a:81:02:70","gateway_ips":["10.129.2.1"],"routes":[{"dest":"10.128.0....
                        k8s.v1.cni.cncf.io/network-status:
                          [{
                              "name": "ovn-kubernetes",
                              "interface": "eth0",
                              "ips": [
                                  "10.129.2.112"
                              ],
                              "mac": "0a:58:0a:81:02:70",
                              "default": true,
                              "dns": {}
                          }]
                        openshift.io/scc: anyuid
      Status:           Running
      IP:               10.129.2.112
      IPs:
        IP:           10.129.2.112
      Controlled By:  CatalogSource/test-index-3
      Init Containers:
        extract-utilities:
          Container ID:  cri-o://50f8a40004b337b7f6e2c66a0daa4d9a9023d84cc1edbf75ad295f1974acb2bd
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2c07d42deae3fd962c7ceeb52c7d17b80d312d504a05800eefc1454d5bfd6936
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2c07d42deae3fd962c7ceeb52c7d17b80d312d504a05800eefc1454d5bfd6936
          Port:          <none>
          Host Port:     <none>
          Command:
            cp
          Args:
            /bin/copy-content
            /utilities/copy-content
          State:          Terminated
            Reason:       Completed
            Exit Code:    0
            Started:      Mon, 27 May 2024 15:20:32 +0800
            Finished:     Mon, 27 May 2024 15:20:32 +0800
          Ready:          True
          Restart Count:  0
          Environment:    <none>
          Mounts:
            /utilities from utilities (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fs77t (ro)
        extract-content:
          Container ID:  cri-o://fe5d241dd09bf26c706562877b47d95513f3f8b635d0b2965aaf9fc0c29748a0
          Image:         quay.io/openshifttest/nginxolm-operator-index:27672-test-3
          Image ID:      quay.io/openshifttest/nginxolm-operator-index@sha256:2a6869b8feaff916570e2d10641a75a41162be4fccdbb3867fa0713939a014b7
          Port:          <none>
          Host Port:     <none>
          Command:
            /utilities/copy-content
          Args:
            --catalog.from=/configs
            --catalog.to=/extracted-catalog/catalog
            --cache.from=/tmp/cache
            --cache.to=/extracted-catalog/cache
          State:          Terminated
            Reason:       Completed
            Exit Code:    0
            Started:      Mon, 27 May 2024 15:20:36 +0800
            Finished:     Mon, 27 May 2024 15:20:36 +0800
          Ready:          True
          Restart Count:  0
          Environment:    <none>
          Mounts:
            /extracted-catalog from catalog-content (rw)
            /utilities from utilities (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fs77t (ro)
      Containers:
        registry-server:
          Container ID:  cri-o://65d66c967793dd5514d07fcdbe9ce5747b56b90b4ce3b48e79757dedda82eb54
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f
          Port:          50051/TCP
          Host Port:     0/TCP
          Command:
            /bin/opm
          Args:
            serve
            /extracted-catalog/catalog
            --cache-dir=/extracted-catalog/cache
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   time="2024-05-27T07:23:57Z" level=info msg="starting pprof endpoint" address="localhost:6060"
      time="2024-05-27T07:23:57Z" level=fatal msg="cache directory has unexpected contents"
      
      
            Exit Code:    1
            Started:      Mon, 27 May 2024 15:23:57 +0800
            Finished:     Mon, 27 May 2024 15:23:57 +0800
          Ready:          False
          Restart Count:  5
          Requests:
            cpu:      10m
            memory:   30Mi
          Liveness:   exec [grpc_health_probe -addr=:50051] delay=10s timeout=5s period=10s #success=1 #failure=3
          Readiness:  exec [grpc_health_probe -addr=:50051] delay=5s timeout=5s period=10s #success=1 #failure=3
          Startup:    exec [grpc_health_probe -addr=:50051] delay=0s timeout=5s period=10s #success=1 #failure=10
          Environment:
            GOMEMLIMIT:  30MiB
          Mounts:
            /extracted-catalog from catalog-content (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fs77t (ro)
      Conditions:
        Type              Status
        Initialized       True 
        Ready             False 
        ContainersReady   False 
        PodScheduled      True 
      Volumes:
        utilities:
          Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
          Medium:     
          SizeLimit:  <unset>
        catalog-content:
          Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
          Medium:     
          SizeLimit:  <unset>
        kube-api-access-fs77t:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   Burstable
      Node-Selectors:              kubernetes.io/os=linux
      Tolerations:                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                                   node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                   node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
      Events:
        Type     Reason          Age                    From               Message
        ----     ------          ----                   ----               -------
        Normal   Scheduled       4m35s                  default-scheduler  Successfully assigned test-2/test-index-3-5ldcd to worker-1
        Normal   AddedInterface  4m35s                  multus             Add eth0 [10.129.2.112/23] from ovn-kubernetes
        Normal   Pulled          4m35s                  kubelet            Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2c07d42deae3fd962c7ceeb52c7d17b80d312d504a05800eefc1454d5bfd6936" already present on machine
        Normal   Created         4m35s                  kubelet            Created container extract-utilities
        Normal   Started         4m35s                  kubelet            Started container extract-utilities
        Normal   Pulling         4m34s                  kubelet            Pulling image "quay.io/openshifttest/nginxolm-operator-index:27672-test-3"
        Normal   Started         4m31s                  kubelet            Started container extract-content
        Normal   Created         4m31s                  kubelet            Created container extract-content
        Normal   Pulled          4m31s                  kubelet            Successfully pulled image "quay.io/openshifttest/nginxolm-operator-index:27672-test-3" in 2.97s (2.97s including waiting)
        Normal   Pulled          4m29s                  kubelet            Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f" in 1.215s (1.215s including waiting)
        Normal   Pulled          4m27s                  kubelet            Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f" in 1.228s (1.228s including waiting)
        Normal   Pulling         4m11s (x3 over 4m30s)  kubelet            Pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f"
        Normal   Created         4m10s (x3 over 4m29s)  kubelet            Created container registry-server
        Normal   Started         4m10s (x3 over 4m29s)  kubelet            Started container registry-server
        Normal   Pulled          4m10s                  kubelet            Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2ff3ec6daa6f6ae5aa9d7fe4c8251eecbd82e87597a9092754d2532524c1bb5f" in 1.233s (1.233s including waiting)
        Warning  BackOff         3m52s (x6 over 4m26s)  kubelet            Back-off restarting failed container registry-server in pod test-index-3-5ldcd_test-2(efa5606d-f6fd-46b2-91e0-8d0ad81a6939)
          

      Actual results:

          The catalog pod is in CrashLoopBackOff.

      Expected results:

          The catalog pod is running.

      Additional info:

          If catsrc.yaml does not configure extractContent, the pod for the same index image starts and runs.
      
      xzha@xzha1-mac openshift-tests-private % cat catsrc.yaml.2 
      apiVersion: operators.coreos.com/v1alpha1
      kind: CatalogSource
      metadata:
        name: test-index-4
        namespace: test-2
      spec:
        displayName: Test
        publisher: OLM-QE
        sourceType: grpc
        image: quay.io/openshifttest/nginxolm-operator-index:27672-test-3 
        updateStrategy:
          registryPoll:
            interval: 10m
      
      xzha@xzha1-mac openshift-tests-private % oc get pod
      NAME                 READY   STATUS             RESTARTS      AGE
      test-index-3-5ldcd   0/1     CrashLoopBackOff   6 (18s ago)   6m26s
      test-index-4-ktbls   0/1     Running            0             5s
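The two manifests in this report differ only in `spec.grpcPodConfig`. With `extractContent` set, the `oc describe` output shows the registry-server container running `/bin/opm` from the cluster's release image against the cache copied out of the index image. The sketch below, which assumes the manifest has already been loaded into a Python dict (for example, by a YAML parser), flags that configuration. The function name and the two sample dicts are illustrative and trimmed to the relevant fields.

```python
def uses_extract_content(catalog_source):
    """Return True when spec.grpcPodConfig.extractContent is set."""
    spec = catalog_source.get("spec") or {}
    pod_config = spec.get("grpcPodConfig") or {}
    return bool(pod_config.get("extractContent"))


# Trimmed copies of catsrc.yaml and catsrc.yaml.2 from this report.
WITH_EXTRACT = {
    "kind": "CatalogSource",
    "metadata": {"name": "test-index-3", "namespace": "test-2"},
    "spec": {
        "grpcPodConfig": {
            "extractContent": {"cacheDir": "/tmp/cache", "catalogDir": "/configs"},
            "memoryTarget": "30Mi",
        },
        "sourceType": "grpc",
    },
}
WITHOUT_EXTRACT = {
    "kind": "CatalogSource",
    "metadata": {"name": "test-index-4", "namespace": "test-2"},
    "spec": {"sourceType": "grpc"},
}

if __name__ == "__main__":
    for manifest in (WITH_EXTRACT, WITHOUT_EXTRACT):
        name = manifest["metadata"]["name"]
        if uses_extract_content(manifest):
            print(f"{name}: extractContent is set; the cache in the image is "
                  "served by the cluster's opm")
        else:
            print(f"{name}: extractContent is not set")
```

In this report, the first manifest corresponds to the pod in CrashLoopBackOff and the second to the pod that runs.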

              rhn-support-mipeter Michael Peter
              rhn-support-xzha Xia Zhao
              Xia Zhao Xia Zhao