-
Bug
-
Resolution: Not a Bug
-
Normal
-
None
-
4.15.z, 4.16.0
-
None
-
Important
-
No
-
Proposed
-
False
-
Description of problem:
Since 4.15.0, the default catalog sources serve their content with the `opm` binary from the release payload instead of the one in the index image, but custom catalog sources do not. This creates a high test risk for operator upgrades, because QE cannot control when the default catalog sources are updated.
The default catalog source and its pod (which uses `opm`) are as follows:
    jiazha-mac:~ jiazha$ oc get catalogsource redhat-operators -o yaml
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      annotations:
        operatorframework.io/managed-by: marketplace-operator
        target.workload.openshift.io/management: '{"effect": "PreferredDuringScheduling"}'
      creationTimestamp: "2024-03-26T23:18:35Z"
      generation: 1
      name: redhat-operators
      namespace: openshift-marketplace
      resourceVersion: "108928"
      uid: 826cf63d-3133-4aee-ae45-6d6433687353
    spec:
      displayName: Red Hat Operators
      grpcPodConfig:
        extractContent:
          cacheDir: /tmp/cache
          catalogDir: /configs
        memoryTarget: 30Mi
        nodeSelector:
          kubernetes.io/os: linux
          node-role.kubernetes.io/master: ""
        priorityClassName: system-cluster-critical
        securityContextConfig: restricted
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 120
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 120
      icon:
        base64data: ""
        mediatype: ""
      image: registry.redhat.io/redhat/redhat-operator-index:v4.16
      priority: -100
      publisher: Red Hat
      sourceType: grpc
      updateStrategy:
        registryPoll:
          interval: 10m

    jiazha-mac:~ jiazha$ oc get pods redhat-operators-cl5p7 -o yaml
    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
        k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.130.0.51/23"],"mac_address":"0a:58:0a:82:00:33","gateway_ips":["10.130.0.1"],"routes":[{"dest":"10.128.0.0/14","nextHop":"10.130.0.1"},{"dest":"172.30.0.0/16","nextHop":"10.130.0.1"},{"dest":"100.64.0.0/16","nextHop":"10.130.0.1"}],"ip_address":"10.130.0.51/23","gateway_ip":"10.130.0.1"}}'
        k8s.v1.cni.cncf.io/network-status: |-
          [{
              "name": "ovn-kubernetes",
              "interface": "eth0",
              "ips": [
                  "10.130.0.51"
              ],
              "mac": "0a:58:0a:82:00:33",
              "default": true,
              "dns": {}
          }]
        openshift.io/scc: restricted-v2
        operatorframework.io/managed-by: marketplace-operator
        seccomp.security.alpha.kubernetes.io/pod: runtime/default
      creationTimestamp: "2024-03-26T23:25:25Z"
      generateName: redhat-operators-
      labels:
        olm.catalogSource: redhat-operators
        olm.managed: "true"
        olm.pod-spec-hash: 56Q9FbY1uxGItNqkZMqUYVBotO34cT1AwPg0nO
      name: redhat-operators-cl5p7
      namespace: openshift-marketplace
      ownerReferences:
      - apiVersion: operators.coreos.com/v1alpha1
        blockOwnerDeletion: false
        controller: true
        kind: CatalogSource
        name: redhat-operators
        uid: 826cf63d-3133-4aee-ae45-6d6433687353
      resourceVersion: "23152"
      uid: 6d3ce35b-6242-43b3-b229-c4e2d518c5df
    spec:
      containers:
      - args:
        - serve
        - /extracted-catalog/catalog
        - --cache-dir=/extracted-catalog/cache
        command:
        - /bin/opm
        env:
        - name: GOMEMLIMIT
          value: 30MiB
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b7746cf8d8e9b07b266a540cc52c25ca925c7549c86a55fa61a2cfbdc5db3881
        imagePullPolicy: Always
        livenessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: registry-server
        ports:
        - containerPort: 50051
          name: grpc
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 10m
            memory: 30Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: false
          runAsNonRoot: true
          runAsUser: 1000200000
        startupProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /extracted-catalog
          name: catalog-content
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-wpzzz
          readOnly: true
      dnsPolicy: ClusterFirst
      enableServiceLinks: true
      imagePullSecrets:
      - name: redhat-operators-dockercfg-gktm2
      initContainers:
      - args:
        - /bin/copy-content
        - /utilities/copy-content
        command:
        - cp
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d8130ca61df1208511321f77285d8b37d9909ecd5179d9191046db978e53b305
        imagePullPolicy: IfNotPresent
        name: extract-utilities
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000200000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /utilities
          name: utilities
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-wpzzz
          readOnly: true
      - args:
        - --catalog.from=/configs
        - --catalog.to=/extracted-catalog/catalog
        - --cache.from=/tmp/cache
        - --cache.to=/extracted-catalog/cache
        command:
        - /utilities/copy-content
        image: registry.redhat.io/redhat/redhat-operator-index:v4.16
        imagePullPolicy: Always
        name: extract-content
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          runAsNonRoot: true
          runAsUser: 1000200000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /utilities
          name: utilities
        - mountPath: /extracted-catalog
          name: catalog-content
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-wpzzz
          readOnly: true
      nodeName: ip-10-0-6-204.ap-south-1.compute.internal
      nodeSelector:
        kubernetes.io/os: linux
        node-role.kubernetes.io/master: ""
      preemptionPolicy: PreemptLowerPriority
      priority: 2000000000
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000200000
        seLinuxOptions:
          level: s0:c14,c9
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: redhat-operators
      serviceAccountName: redhat-operators
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 120
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 120
      - effect: NoSchedule
        key: node.kubernetes.io/memory-pressure
        operator: Exists
      volumes:
      - emptyDir: {}
        name: utilities
      - emptyDir: {}
        name: catalog-content
      - name: kube-api-access-wpzzz
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
              - key: ca.crt
                path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
          - configMap:
              items:
              - key: service-ca.crt
                path: service-ca.crt
              name: openshift-service-ca.crt
The custom catalog source and its pod (which does not use `opm`) are as follows:
    jiazha-mac:~ jiazha$ oc get catalogsource qe-app-registry -o yaml
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      creationTimestamp: "2024-03-26T23:42:57Z"
      generation: 1
      name: qe-app-registry
      namespace: openshift-marketplace
      resourceVersion: "107780"
      uid: 89cada74-0ebb-498f-8bee-6a4a38161fbf
    spec:
      displayName: Production Operators
      image: quay.io/openshift-qe-optional-operators/aosqe-index:v4.16
      publisher: OpenShift QE
      sourceType: grpc
      updateStrategy:
        registryPoll:
          interval: 15m

    jiazha-mac:~ jiazha$ oc get pods qe-app-registry-jpvzg -o yaml
    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
        k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.128.2.17/23"],"mac_address":"0a:58:0a:80:02:11","gateway_ips":["10.128.2.1"],"routes":[{"dest":"10.128.0.0/14","nextHop":"10.128.2.1"},{"dest":"172.30.0.0/16","nextHop":"10.128.2.1"},{"dest":"100.64.0.0/16","nextHop":"10.128.2.1"}],"ip_address":"10.128.2.17/23","gateway_ip":"10.128.2.1"}}'
        k8s.v1.cni.cncf.io/network-status: |-
          [{
              "name": "ovn-kubernetes",
              "interface": "eth0",
              "ips": [
                  "10.128.2.17"
              ],
              "mac": "0a:58:0a:80:02:11",
              "default": true,
              "dns": {}
          }]
        openshift.io/scc: anyuid
      creationTimestamp: "2024-03-26T23:42:57Z"
      generateName: qe-app-registry-
      labels:
        olm.catalogSource: qe-app-registry
        olm.managed: "true"
        olm.pod-spec-hash: cwmPv14os6LlJNsjYZfFAZLLnNXTLg8mhbOKAq
      name: qe-app-registry-jpvzg
      namespace: openshift-marketplace
      ownerReferences:
      - apiVersion: operators.coreos.com/v1alpha1
        blockOwnerDeletion: false
        controller: true
        kind: CatalogSource
        name: qe-app-registry
        uid: 89cada74-0ebb-498f-8bee-6a4a38161fbf
      resourceVersion: "33744"
      uid: 6c02b66b-5cb8-4c05-be27-d0e97fbb75ea
    spec:
      containers:
      - image: quay.io/openshift-qe-optional-operators/aosqe-index:v4.16
        imagePullPolicy: Always
        livenessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: registry-server
        ports:
        - containerPort: 50051
          name: grpc
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 10m
            memory: 50Mi
        securityContext:
          capabilities:
            drop:
            - MKNOD
          readOnlyRootFilesystem: false
        startupProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-f52xw
          readOnly: true
      dnsPolicy: ClusterFirst
      enableServiceLinks: true
      imagePullSecrets:
      - name: qe-app-registry-dockercfg-k92ct
      nodeName: ip-10-0-21-197.ap-south-1.compute.internal
      nodeSelector:
        kubernetes.io/os: linux
      preemptionPolicy: PreemptLowerPriority
      priority: 0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        seLinuxOptions:
          level: s0:c14,c9
      serviceAccount: qe-app-registry
      serviceAccountName: qe-app-registry
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 300
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 300
      - effect: NoSchedule
        key: node.kubernetes.io/memory-pressure
        operator: Exists
      volumes:
      - name: kube-api-access-f52xw
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
              - key: ca.crt
                path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
          - configMap:
              items:
              - key: service-ca.crt
                path: service-ca.crt
              name: openshift-service-ca.crt
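The visible difference between the two pods is that the default one runs `/bin/opm` in `registry-server` after an `extract-content` init container, while the custom one runs the index image directly. A minimal sketch of that check follows, assuming the pod has been fetched as JSON (for example with `oc get pod <name> -o json`). The function name `uses_opm_extraction` and the trimmed sample dictionaries are hypothetical and only mirror the fields shown in the dumps.

```python
def uses_opm_extraction(pod: dict) -> bool:
    """Hypothetical helper: True if registry-server runs /bin/opm and the
    pod has an extract-content init container (the layout of the default pod)."""
    spec = pod.get("spec", {})
    init_names = {c.get("name") for c in spec.get("initContainers", [])}
    for container in spec.get("containers", []):
        if container.get("name") == "registry-server":
            command = container.get("command", [])
            return command[:1] == ["/bin/opm"] and "extract-content" in init_names
    return False


# Trimmed stand-ins for the two pods (only the fields the check reads).
default_pod = {
    "spec": {
        "initContainers": [{"name": "extract-utilities"}, {"name": "extract-content"}],
        "containers": [{"name": "registry-server", "command": ["/bin/opm"]}],
    }
}
custom_pod = {"spec": {"containers": [{"name": "registry-server"}]}}

print(uses_opm_extraction(default_pod))  # True
print(uses_opm_extraction(custom_pod))   # False
```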
We also tried to simulate the default catalog source with a custom one, but that failed.
Version-Release number of selected component (if applicable):
4.15.z, Cluster version is 4.16.0-0.nightly-2024-03-25-100907
How reproducible:
always
Steps to Reproduce:
1. Install OCP 4.16.
2. Create a custom catalog source by specifying the `grpcPodConfig`, like:
    jiazha-mac:~ jiazha$ oc get catalogsource jian-registry -o yaml
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      creationTimestamp: "2024-03-27T03:29:58Z"
      generation: 1
      name: jian-registry
      namespace: openshift-marketplace
      resourceVersion: "112800"
      uid: 1348322c-d54d-4cc6-aeb2-b5d7b5ec3a09
    spec:
      displayName: Jian Operators
      grpcPodConfig:
        extractContent:
          cacheDir: /tmp/cache
          catalogDir: /configs
        memoryTarget: 30Mi
        securityContextConfig: legacy
      image: quay.io/openshift-qe-optional-operators/aosqe-index:v4.16
      publisher: OpenShift QE
      sourceType: grpc
      updateStrategy:
        registryPoll:
          interval: 15m
3. Check its pod.
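For repeatable testing, the manifest from step 2 can be generated rather than written by hand. The sketch below only rebuilds the fields visible in the reported CatalogSource; the function name `catalog_source_manifest` and its parameters are hypothetical, and the values are copied from this report rather than recommended defaults.

```python
import json


def catalog_source_manifest(name, image, display_name, publisher, poll_interval="15m"):
    """Hypothetical helper: build a CatalogSource that opts in to extractContent,
    using the same field values as the jian-registry example in this report."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "CatalogSource",
        "metadata": {"name": name, "namespace": "openshift-marketplace"},
        "spec": {
            "displayName": display_name,
            "grpcPodConfig": {
                # Directories the extract-content init container copies from the index image.
                "extractContent": {"cacheDir": "/tmp/cache", "catalogDir": "/configs"},
                "memoryTarget": "30Mi",
                "securityContextConfig": "legacy",
            },
            "image": image,
            "publisher": publisher,
            "sourceType": "grpc",
            "updateStrategy": {"registryPoll": {"interval": poll_interval}},
        },
    }


manifest = catalog_source_manifest(
    name="jian-registry",
    image="quay.io/openshift-qe-optional-operators/aosqe-index:v4.16",
    display_name="Jian Operators",
    publisher="OpenShift QE",
)
print(json.dumps(manifest, indent=2))
```

Because `oc apply` accepts JSON as well as YAML, the printed output can be piped to `oc apply -f -`.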
Actual results:
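The pod used the `opm` mechanism, but its `extract-content` init container crashed (`Init:CrashLoopBackOff`), so `registry-server` never started.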
    jiazha-mac:~ jiazha$ oc get pods jian-registry-jsjl9 -o yaml
    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
        k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["10.128.2.54/23"],"mac_address":"0a:58:0a:80:02:36","gateway_ips":["10.128.2.1"],"routes":[{"dest":"10.128.0.0/14","nextHop":"10.128.2.1"},{"dest":"172.30.0.0/16","nextHop":"10.128.2.1"},{"dest":"100.64.0.0/16","nextHop":"10.128.2.1"}],"ip_address":"10.128.2.54/23","gateway_ip":"10.128.2.1"}}'
        k8s.v1.cni.cncf.io/network-status: |-
          [{
              "name": "ovn-kubernetes",
              "interface": "eth0",
              "ips": [
                  "10.128.2.54"
              ],
              "mac": "0a:58:0a:80:02:36",
              "default": true,
              "dns": {}
          }]
        openshift.io/scc: anyuid
      creationTimestamp: "2024-03-27T03:29:59Z"
      generateName: jian-registry-
      labels:
        olm.catalogSource: jian-registry
        olm.managed: "true"
        olm.pod-spec-hash: 2Ip5p8sLTUMPUmaKWOicyZKpQ11vnA523DUWcQ
      name: jian-registry-jsjl9
      namespace: openshift-marketplace
      ownerReferences:
      - apiVersion: operators.coreos.com/v1alpha1
        blockOwnerDeletion: false
        controller: true
        kind: CatalogSource
        name: jian-registry
        uid: 1348322c-d54d-4cc6-aeb2-b5d7b5ec3a09
      resourceVersion: "112862"
      uid: cc2eac65-9f79-4cf7-b763-98852a7bda9a
    spec:
      containers:
      - args:
        - serve
        - /extracted-catalog/catalog
        - --cache-dir=/extracted-catalog/cache
        command:
        - /bin/opm
        env:
        - name: GOMEMLIMIT
          value: 30MiB
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b7746cf8d8e9b07b266a540cc52c25ca925c7549c86a55fa61a2cfbdc5db3881
        imagePullPolicy: Always
        livenessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: registry-server
        ports:
        - containerPort: 50051
          name: grpc
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 3
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 10m
            memory: 30Mi
        securityContext:
          capabilities:
            drop:
            - MKNOD
          readOnlyRootFilesystem: false
        startupProbe:
          exec:
            command:
            - grpc_health_probe
            - -addr=:50051
          failureThreshold: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /extracted-catalog
          name: catalog-content
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-7fb48
          readOnly: true
      dnsPolicy: ClusterFirst
      enableServiceLinks: true
      imagePullSecrets:
      - name: jian-registry-dockercfg-dmqqd
      initContainers:
      - args:
        - /bin/copy-content
        - /utilities/copy-content
        command:
        - cp
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d8130ca61df1208511321f77285d8b37d9909ecd5179d9191046db978e53b305
        imagePullPolicy: IfNotPresent
        name: extract-utilities
        resources: {}
        securityContext:
          capabilities:
            drop:
            - MKNOD
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /utilities
          name: utilities
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-7fb48
          readOnly: true
      - args:
        - --catalog.from=/configs
        - --catalog.to=/extracted-catalog/catalog
        - --cache.from=/tmp/cache
        - --cache.to=/extracted-catalog/cache
        command:
        - /utilities/copy-content
        image: quay.io/openshift-qe-optional-operators/aosqe-index:v4.16
        imagePullPolicy: Always
        name: extract-content
        resources: {}
        securityContext:
          capabilities:
            drop:
            - MKNOD
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /utilities
          name: utilities
        - mountPath: /extracted-catalog
          name: catalog-content
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-7fb48
          readOnly: true
      nodeName: ip-10-0-21-197.ap-south-1.compute.internal
      nodeSelector:
        kubernetes.io/os: linux
      preemptionPolicy: PreemptLowerPriority
      priority: 0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        seLinuxOptions:
          level: s0:c14,c9
      serviceAccount: jian-registry
      serviceAccountName: jian-registry
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 300
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 300
      - effect: NoSchedule
        key: node.kubernetes.io/memory-pressure
        operator: Exists
      volumes:
      - emptyDir: {}
        name: utilities
      - emptyDir: {}
        name: catalog-content
      - name: kube-api-access-7fb48
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              expirationSeconds: 3607
              path: token
          - configMap:
              items:
              - key: ca.crt
                path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
          - configMap:
              items:
              - key: service-ca.crt
                path: service-ca.crt
              name: openshift-service-ca.crt
    status:
      conditions:
      - lastProbeTime: null
        lastTransitionTime: "2024-03-27T03:30:00Z"
        status: "True"
        type: PodReadyToStartContainers
      - lastProbeTime: null
        lastTransitionTime: "2024-03-27T03:29:59Z"
        message: 'containers with incomplete status: [extract-content]'
        reason: ContainersNotInitialized
        status: "False"
        type: Initialized
      - lastProbeTime: null
        lastTransitionTime: "2024-03-27T03:29:59Z"
        message: 'containers with unready status: [registry-server]'
        reason: ContainersNotReady
        status: "False"
        type: Ready
      - lastProbeTime: null
        lastTransitionTime: "2024-03-27T03:29:59Z"
        message: 'containers with unready status: [registry-server]'
        reason: ContainersNotReady
        status: "False"
        type: ContainersReady
      - lastProbeTime: null
        lastTransitionTime: "2024-03-27T03:29:59Z"
        status: "True"
        type: PodScheduled
      containerStatuses:
      - image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b7746cf8d8e9b07b266a540cc52c25ca925c7549c86a55fa61a2cfbdc5db3881
        imageID: ""
        lastState: {}
        name: registry-server
        ready: false
        restartCount: 0
        started: false
        state:
          waiting:
            reason: PodInitializing
      hostIP: 10.0.21.197
      hostIPs:
      - ip: 10.0.21.197
      initContainerStatuses:
      - containerID: cri-o://e34ff15442b27a4b8beb7f3761e5249a85f3f9931178472fba48aaacfb81ba01
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d8130ca61df1208511321f77285d8b37d9909ecd5179d9191046db978e53b305
        imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d8130ca61df1208511321f77285d8b37d9909ecd5179d9191046db978e53b305
        lastState: {}
        name: extract-utilities
        ready: true
        restartCount: 0
        started: false
        state:
          terminated:
            containerID: cri-o://e34ff15442b27a4b8beb7f3761e5249a85f3f9931178472fba48aaacfb81ba01
            exitCode: 0
            finishedAt: "2024-03-27T03:29:59Z"
            reason: Completed
            startedAt: "2024-03-27T03:29:59Z"
      - containerID: cri-o://0d565f275c2e892fe11d7b10a71754ab883bb973b902d8435ac30fb2f743c148
        image: quay.io/openshift-qe-optional-operators/aosqe-index:v4.16
        imageID: quay.io/openshift-qe-optional-operators/aosqe-index@sha256:4aca119dfd622faf749ecb5802c1573c56bbbd0f9243264314b4163d6b4ce730
        lastState:
          terminated:
            containerID: cri-o://0d565f275c2e892fe11d7b10a71754ab883bb973b902d8435ac30fb2f743c148
            exitCode: 1
            finishedAt: "2024-03-27T03:46:26Z"
            reason: Error
            startedAt: "2024-03-27T03:46:26Z"
        name: extract-content
        ready: false
        restartCount: 8
        started: false
        state:
          waiting:
            message: back-off 5m0s restarting failed container=extract-content pod=jian-registry-jsjl9_openshift-marketplace(cc2eac65-9f79-4cf7-b763-98852a7bda9a)
            reason: CrashLoopBackOff
      phase: Pending
      podIP: 10.128.2.54
      podIPs:
      - ip: 10.128.2.54
      qosClass: Burstable
      startTime: "2024-03-27T03:29:59Z"

    jiazha-mac:~ jiazha$ oc get pods
    NAME                                                              READY   STATUS                  RESTARTS        AGE
    485ce4713edf99130b704ed68841227e6ee426e840a75e919b1d4eb062nv4jp   0/1     Completed               0               4h6m
    644a61b2f875ca12ddd8db333293c21fcaa746b674f84b0f6c89cf486b8hk2w   0/1     Completed               0               4h6m
    9e154ba7939b06c9a53b63282cc8b93dc087872cf969e32ffb8f66fb0f58kc5   0/1     Completed               0               4h6m
    certified-operators-lqxf6                                         1/1     Running                 0               4h25m
    community-operators-qzrnr                                         1/1     Running                 0               4h25m
    jian-registry-84nbl                                               0/1     Init:CrashLoopBackOff   5 (56s ago)     3m57s
    jian-registry-jsjl9                                               0/1     Init:CrashLoopBackOff   8 (3m58s ago)   20m
    marketplace-operator-75fb497966-w4gh5                             1/1     Running                 1 (4h25m ago)   4h36m
    qe-app-registry-jpvzg                                             1/1     Running                 0               4h7m
    redhat-marketplace-q9zkd                                          1/1     Running                 0               4h24m
    redhat-operators-cl5p7                                            1/1     Running                 0               4h24m

    jiazha-mac:~ jiazha$ oc logs jian-registry-jsjl9
    Defaulted container "registry-server" out of: registry-server, extract-utilities (init), extract-content (init)
    Error from server (BadRequest): container "registry-server" in pod "jian-registry-jsjl9" is waiting to start: PodInitializing
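The `oc logs` call above defaulted to `registry-server`, which had not started, so it returned no logs from the container that actually failed. A minimal sketch for finding the crash-looping init container in the pod status and building a targeted log command follows. The helper names are hypothetical and the sample dictionary is trimmed from the status shown above; `-c` and `--previous` are standard `oc logs` flags.

```python
def failing_init_containers(pod: dict) -> list:
    """Hypothetical helper: names of init containers waiting in CrashLoopBackOff."""
    failing = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        waiting = status.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            failing.append(status["name"])
    return failing


def logs_command(pod: dict, container: str) -> list:
    """Build an `oc logs` argument list for one container of the pod.
    --previous asks for the last terminated instance of that container."""
    meta = pod["metadata"]
    return ["oc", "logs", meta["name"], "-n", meta["namespace"], "-c", container, "--previous"]


# Trimmed stand-in for the jian-registry-jsjl9 pod status reported above.
pod = {
    "metadata": {"name": "jian-registry-jsjl9", "namespace": "openshift-marketplace"},
    "status": {
        "initContainerStatuses": [
            {"name": "extract-utilities", "state": {"terminated": {"exitCode": 0}}},
            {"name": "extract-content", "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
        ]
    },
}

for name in failing_init_containers(pod):
    print(" ".join(logs_command(pod, name)))
```

For the pod in this report, that selects `extract-content`, whose output should show why copying from `/configs` and `/tmp/cache` failed for this index image.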
Expected results:
A custom catalog source should use the `opm` mechanism by default, the same as the default catalog sources.
Additional info:
- is caused by: OPRUN-3130 Allow Consuming OPM From The Payload, Not The Index Image (To Do)