OpenShift Bugs · OCPBUGS-4128

catalog-operator fatal error: concurrent map writes
Details

    • Type: Bug
    • Resolution: Done
    • Priority: Undefined
    • Affects Version: 4.10.z
    • Fix Version: 4.10.z
    • Component: OLM

    Description

      This bug is a backport clone of [Bugzilla Bug 2117324](https://bugzilla.redhat.com/show_bug.cgi?id=2117324). The following is the description of the original bug:

      +++ This bug was initially created as a clone of Bug #2101357 +++

      Description of problem:

      message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
      +0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780,
      0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
      +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?,
      0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
      +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
      +0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480,
      0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
      +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
      +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
      +0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180,
      0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
      +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
      +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
      +0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51
      +0x85\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50
      +0x231\n"
      reason: Error
      startedAt: "2022-06-27T00:00:59Z"

      Version-Release number of selected component (if applicable):
      mac:~ jianzhang$ oc exec catalog-operator-66cb8fd8c5-j7vkx -- olm --version
      OLM version: 0.19.0
      git commit: 8c2bd46147a90d58e98de73d34fd79477769f11f
      mac:~ jianzhang$ oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.0-0.nightly-2022-06-25-081133   True        False         10h     Cluster version is 4.11.0-0.nightly-2022-06-25-081133

      How reproducible:
      always

      Steps to Reproduce:
      1. Install OCP 4.11
      2. Check OLM pods

      Actual results:

      mac:~ jianzhang$ oc get pods
      NAME                                      READY   STATUS      RESTARTS     AGE
      catalog-operator-66cb8fd8c5-j7vkx         1/1     Running     2 (8h ago)   10h
      collect-profiles-27605340-wgsvf           0/1     Completed   0            42m
      collect-profiles-27605355-ffgxd           0/1     Completed   0            27m
      collect-profiles-27605370-w7ds7           0/1     Completed   0            12m
      olm-operator-6cfd444b8f-r5q4t             1/1     Running     0            10h
      package-server-manager-66589d4bf8-csr7j   1/1     Running     0            10h
      packageserver-59977db6cf-nkn5w            1/1     Running     0            10h
      packageserver-59977db6cf-nxbnx            1/1     Running     0            10h

      mac:~ jianzhang$ oc get pods catalog-operator-66cb8fd8c5-j7vkx -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/network-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.26"
                ],
                "default": true,
                "dns": {}
            }]
          k8s.v1.cni.cncf.io/networks-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.26"
                ],
                "default": true,
                "dns": {}
            }]
          openshift.io/scc: nonroot-v2
          seccomp.security.alpha.kubernetes.io/pod: runtime/default
        creationTimestamp: "2022-06-26T23:12:45Z"
        generateName: catalog-operator-66cb8fd8c5-
        labels:
          app: catalog-operator
          pod-template-hash: 66cb8fd8c5
        name: catalog-operator-66cb8fd8c5-j7vkx
        namespace: openshift-operator-lifecycle-manager
        ownerReferences:
        - apiVersion: apps/v1
          blockOwnerDeletion: true
          controller: true
          kind: ReplicaSet
          name: catalog-operator-66cb8fd8c5
          uid: bcf173be-97bc-4152-8cec-d45f820e167c
        resourceVersion: "67395"
        uid: bb34c37b-b22d-4412-bbc0-1e57a1b2bd3a
      spec:
        containers:
        - args:
          - --namespace
          - openshift-marketplace
          - --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4
          - --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4
          - --util-image
          - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
          - --writeStatusName
          - operator-lifecycle-manager-catalog
          - --tls-cert
          - /srv-cert/tls.crt
          - --tls-key
          - /srv-cert/tls.key
          - --client-ca
          - /profile-collector-cert/tls.crt
          command:
          - /bin/catalog
          env:
          - name: RELEASE_VERSION
            value: 4.11.0-0.nightly-2022-06-25-081133
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: catalog-operator
          ports:
          - containerPort: 8443
            name: metrics
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 10m
              memory: 80Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
          - mountPath: /srv-cert
            name: srv-cert
            readOnly: true
          - mountPath: /profile-collector-cert
            name: profile-collector-cert
            readOnly: true
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-7txv8
            readOnly: true
        dnsPolicy: ClusterFirst
        enableServiceLinks: true
        nodeName: ip-10-0-173-128.ap-southeast-1.compute.internal
        nodeSelector:
          kubernetes.io/os: linux
          node-role.kubernetes.io/master: ""
        preemptionPolicy: PreemptLowerPriority
        priority: 2000000000
        priorityClassName: system-cluster-critical
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
          seLinuxOptions:
            level: s0:c20,c0
          seccompProfile:
            type: RuntimeDefault
        serviceAccount: olm-operator-serviceaccount
        serviceAccountName: olm-operator-serviceaccount
        terminationGracePeriodSeconds: 30
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 120
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 120
        - effect: NoSchedule
          key: node.kubernetes.io/memory-pressure
          operator: Exists
        volumes:
        - name: srv-cert
          secret:
            defaultMode: 420
            secretName: catalog-operator-serving-cert
        - name: profile-collector-cert
          secret:
            defaultMode: 420
            secretName: pprof-cert
        - name: kube-api-access-7txv8
          projected:
            defaultMode: 420
            sources:
            - serviceAccountToken:
                expirationSeconds: 3607
                path: token
            - configMap:
                items:
                - key: ca.crt
                  path: ca.crt
                name: kube-root-ca.crt
            - downwardAPI:
                items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
            - configMap:
                items:
                - key: service-ca.crt
                  path: service-ca.crt
                name: openshift-service-ca.crt
      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2022-06-26T23:16:59Z"
          status: "True"
          type: Initialized
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T01:19:09Z"
          status: "True"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T01:19:09Z"
          status: "True"
          type: ContainersReady
        - lastProbeTime: null
          lastTransitionTime: "2022-06-26T23:16:59Z"
          status: "True"
          type: PodScheduled
        containerStatuses:
        - containerID: cri-o://f3fce12480556b5ab279236c290acdf15e1bc850426078730dcfd333ecda6795
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
          imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
          lastState:
            terminated:
              containerID: cri-o://5e3ce04f1cf538f679af008dcb62f9477131aae1772f527d6207fdc7bb247a55
              exitCode: 2
              finishedAt: "2022-06-27T01:19:08Z"
              message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
                +0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780,
                0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
                +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?,
                0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
                +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
                +0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480,
                0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
                +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
                +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
                +0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180,
                0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
                +0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
                +0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
                +0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51
                +0x85\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50
                +0x231\n"
              reason: Error
              startedAt: "2022-06-27T00:00:59Z"
          name: catalog-operator
          ready: true
          restartCount: 2
          started: true
          state:
            running:
              startedAt: "2022-06-27T01:19:08Z"
        hostIP: 10.0.173.128
        phase: Running
        podIP: 10.130.0.26
        podIPs:
        - ip: 10.130.0.26
        qosClass: Burstable
        startTime: "2022-06-26T23:16:59Z"

      Expected results:
      catalog-operator works well.

      Additional info:
      Operators can be subscribed successfully.
      mac:~ jianzhang$ oc get sub -A
      NAMESPACE                    NAME                     PACKAGE                  SOURCE            CHANNEL
      jian                         learn                    learn                    qe-app-registry   beta
      openshift-logging            cluster-logging          cluster-logging          qe-app-registry   stable
      openshift-operators-redhat   elasticsearch-operator   elasticsearch-operator   qe-app-registry   stable
      mac:~ jianzhang$
      mac:~ jianzhang$ oc get pods -n jian
      NAME                                                              READY   STATUS      RESTARTS   AGE
      552b4660850a7fe1e1f142091eb5e4305f18af151727c56f70aa5dffc1dg8cg   0/1     Completed   0          54m
      learn-operator-666b687bfb-7qppm                                   1/1     Running     0          54m
      qe-app-registry-hbzxg                                             1/1     Running     0          58m
      mac:~ jianzhang$ oc get csv -n jian
      NAME                            DISPLAY                            VERSION   REPLACES                PHASE
      elasticsearch-operator.v5.5.0   OpenShift Elasticsearch Operator   5.5.0                             Succeeded
      learn-operator.v0.0.3           Learn Operator                     0.0.3     learn-operator.v0.0.2   Succeeded

      — Additional comment from jiazha@redhat.com on 2022-06-27 09:58:18 UTC —

      Created attachment 1892927
      olm must-gather

      — Additional comment from jiazha@redhat.com on 2022-06-27 09:59:01 UTC —

      Created attachment 1892928
      marketplace project must-gather

      — Additional comment from jiazha@redhat.com on 2022-06-28 02:05:39 UTC —

      mac:~ jianzhang$ oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.0-0.nightly-2022-06-25-132614   True        False         145m    Cluster version is 4.11.0-0.nightly-2022-06-25-132614

      mac:~ jianzhang$ oc get pods
      NAME                                      READY   STATUS      RESTARTS        AGE
      catalog-operator-869fb4bd4d-lbhgj         1/1     Running     3 (9m25s ago)   170m
      collect-profiles-27606330-4wg5r           0/1     Completed   0               33m
      collect-profiles-27606345-lmk4q           0/1     Completed   0               18m
      collect-profiles-27606360-mksv6           0/1     Completed   0               3m17s
      olm-operator-5f485d9d5f-wczjc             1/1     Running     0               170m
      package-server-manager-6cf996b4cc-79lrw   1/1     Running     2 (156m ago)    170m
      packageserver-5f668f98d7-2vjdn            1/1     Running     0               165m
      packageserver-5f668f98d7-mb2wc            1/1     Running     0               165m
      mac:~ jianzhang$ oc get pods catalog-operator-869fb4bd4d-lbhgj -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/network-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.34"
                ],
                "default": true,
                "dns": {}
            }]
          k8s.v1.cni.cncf.io/networks-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.34"
                ],
                "default": true,
                "dns": {}
            }]
          openshift.io/scc: nonroot-v2
          seccomp.security.alpha.kubernetes.io/pod: runtime/default
        creationTimestamp: "2022-06-27T23:13:12Z"
        generateName: catalog-operator-869fb4bd4d-
        labels:
          app: catalog-operator
          pod-template-hash: 869fb4bd4d
        name: catalog-operator-869fb4bd4d-lbhgj
        namespace: openshift-operator-lifecycle-manager
        ownerReferences:
        - apiVersion: apps/v1
          blockOwnerDeletion: true
          controller: true
          kind: ReplicaSet
          name: catalog-operator-869fb4bd4d
          uid: 3a1a8cd3-2151-4650-a96b-c1951d461c67
        resourceVersion: "75671"
        uid: 2f06d663-0697-4e88-9e9a-f802c6641efe
      spec:
        containers:
        - args:
          - --namespace
          - openshift-marketplace
          - --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
          - --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
          - --util-image
          - quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          - --writeStatusName
          - operator-lifecycle-manager-catalog
          - --tls-cert
          - /srv-cert/tls.crt
          - --tls-key
          - /srv-cert/tls.key
          - --client-ca
          - /profile-collector-cert/tls.crt
          command:
          - /bin/catalog
          env:
          - name: RELEASE_VERSION
            value: 4.11.0-0.nightly-2022-06-25-132614
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: catalog-operator
          ports:
          - containerPort: 8443
            name: metrics
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8443
              scheme: HTTPS
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 10m
              memory: 80Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
          - mountPath: /srv-cert
            name: srv-cert
            readOnly: true
          - mountPath: /profile-collector-cert
            name: profile-collector-cert
            readOnly: true
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-sdjns
            readOnly: true
        dnsPolicy: ClusterFirst
        enableServiceLinks: true
        nodeName: ip-10-0-190-130.ap-south-1.compute.internal
        nodeSelector:
          kubernetes.io/os: linux
          node-role.kubernetes.io/master: ""
        preemptionPolicy: PreemptLowerPriority
        priority: 2000000000
        priorityClassName: system-cluster-critical
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
          seLinuxOptions:
            level: s0:c20,c0
          seccompProfile:
            type: RuntimeDefault
        serviceAccount: olm-operator-serviceaccount
        serviceAccountName: olm-operator-serviceaccount
        terminationGracePeriodSeconds: 30
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 120
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 120
        - effect: NoSchedule
          key: node.kubernetes.io/memory-pressure
          operator: Exists
        volumes:
        - name: srv-cert
          secret:
            defaultMode: 420
            secretName: catalog-operator-serving-cert
        - name: profile-collector-cert
          secret:
            defaultMode: 420
            secretName: pprof-cert
        - name: kube-api-access-sdjns
          projected:
            defaultMode: 420
            sources:
            - serviceAccountToken:
                expirationSeconds: 3607
                path: token
            - configMap:
                items:
                - key: ca.crt
                  path: ca.crt
                name: kube-root-ca.crt
            - downwardAPI:
                items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
            - configMap:
                items:
                - key: service-ca.crt
                  path: service-ca.crt
                name: openshift-service-ca.crt
      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:16:47Z"
          status: "True"
          type: Initialized
        - lastProbeTime: null
          lastTransitionTime: "2022-06-28T01:53:53Z"
          status: "True"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2022-06-28T01:53:53Z"
          status: "True"
          type: ContainersReady
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:16:47Z"
          status: "True"
          type: PodScheduled
        containerStatuses:
        - containerID: cri-o://2a83d3ba503aa760d27aeef28bf03af84b9ff134ea76c6545251f3253507c22b
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          lastState:
            terminated:
              containerID: cri-o://2766bb3bad38c02f531915aac0321ffb18c7a993a7d84b27e8c7199935f7c858
              exitCode: 2
              finishedAt: "2022-06-28T01:53:52Z"
              message: "izer/streaming/streaming.go:77 +0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc002b12500)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49
                +0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0036178c0)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105
                +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
                +0x130\n\ngoroutine 3479 [sync.Cond.Wait]:\nsync.runtime_notifyListWait(0xc003a987c8,
                0x8)\n\t/usr/lib/golang/src/runtime/sema.go:513 +0x13d\nsync.(*Cond).Wait(0xc001f9bf10?)\n\t/usr/lib/golang/src/sync/cond.go:56
                +0x8c\ngolang.org/x/net/http2.(*pipe).Read(0xc003a987b0, {0xc0036e4001, 0xdff, 0xdff})\n\t/build/vendor/golang.org/x/net/http2/pipe.go:76
                +0xeb\ngolang.org/x/net/http2.transportResponseBody.Read({0x0?}, {0xc0036e4001?, 0x2?, 0x203511d?})\n\t/build/vendor/golang.org/x/net/http2/transport.go:2407
                +0x85\nencoding/json.(*Decoder).refill(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:165
                +0x17f\nencoding/json.(*Decoder).readValue(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:140
                +0xbb\nencoding/json.(*Decoder).Decode(0xc002fc0640, {0x1d377c0, 0xc003523098})\n\t/usr/lib/golang/src/encoding/json/stream.go:63
                +0x78\nk8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0xc003127770, {0xc0036dd500, 0x1000, 0x1500})\n\t/build/vendor/k8s.io/apimachinery/pkg/util/framer/framer.go:152
                +0x19c\nk8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0xc003502aa0, 0xc001f9bf10?, {0x2366870, 0xc0044dca80})\n\t/build/vendor/k8s.io/apimachinery/pkg/runtime/serializer/streaming/streaming.go:77
                +0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc00059f700)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49
                +0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0044dcd40)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105
                +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
                +0x130\n"
              reason: Error
              startedAt: "2022-06-28T01:06:59Z"
          name: catalog-operator
          ready: true
          restartCount: 3
          started: true
          state:
            running:
              startedAt: "2022-06-28T01:53:53Z"
        hostIP: 10.0.190.130
        phase: Running
        podIP: 10.130.0.34
        podIPs:
        - ip: 10.130.0.34
        qosClass: Burstable
        startTime: "2022-06-27T23:16:47Z"

      — Additional comment from jiazha@redhat.com on 2022-06-28 02:09:23 UTC —

      mac:~ jianzhang$ oc get pods package-server-manager-6cf996b4cc-79lrw -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/network-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.13"
                ],
                "default": true,
                "dns": {}
            }]
          k8s.v1.cni.cncf.io/networks-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.130.0.13"
                ],
                "default": true,
                "dns": {}
            }]
          openshift.io/scc: nonroot-v2
          seccomp.security.alpha.kubernetes.io/pod: runtime/default
        creationTimestamp: "2022-06-27T23:13:10Z"
        generateName: package-server-manager-6cf996b4cc-
        labels:
          app: package-server-manager
          pod-template-hash: 6cf996b4cc
        name: package-server-manager-6cf996b4cc-79lrw
        namespace: openshift-operator-lifecycle-manager
        ownerReferences:
        - apiVersion: apps/v1
          blockOwnerDeletion: true
          controller: true
          kind: ReplicaSet
          name: package-server-manager-6cf996b4cc
          uid: 191cb627-a1f3-452b-aed2-336de7871004
        resourceVersion: "19056"
        uid: 63cb7200-32d6-4b0d-bfb2-ba725cda7fbd
      spec:
        containers:
        - args:
          - --name
          - $(PACKAGESERVER_NAME)
          - --namespace
          - $(PACKAGESERVER_NAMESPACE)
          command:
          - /bin/psm
          - start
          env:
          - name: PACKAGESERVER_NAME
            value: packageserver
          - name: PACKAGESERVER_IMAGE
            value: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          - name: PACKAGESERVER_NAMESPACE
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
          - name: RELEASE_VERSION
            value: 4.11.0-0.nightly-2022-06-25-132614
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: package-server-manager
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 10m
              memory: 50Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: FallbackToLogsOnError
          volumeMounts:
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-k9hh2
            readOnly: true
        dnsPolicy: ClusterFirst
        enableServiceLinks: true
        nodeName: ip-10-0-190-130.ap-south-1.compute.internal
        nodeSelector:
          kubernetes.io/os: linux
          node-role.kubern

      — Additional comment from jiazha@redhat.com on 2022-06-28 02:10:02 UTC —

        preemptionPolicy: PreemptLowerPriority
        priority: 2000000000
        priorityClassName: system-cluster-critical
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          runAsNonRoot: true
          runAsUser: 65534
          seLinuxOptions:
            level: s0:c20,c0
          seccompProfile:
            type: RuntimeDefault
        serviceAccount: olm-operator-serviceaccount
        serviceAccountName: olm-operator-serviceaccount
        terminationGracePeriodSeconds: 30
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 120
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 120
        - effect: NoSchedule
          key: node.kubernetes.io/memory-pressure
          operator: Exists
        volumes:
        - name: kube-api-access-k9hh2
          projected:
            defaultMode: 420
            sources:
            - serviceAccountToken:
                expirationSeconds: 3607
                path: token
            - configMap:
                items:
                - key: ca.crt
                  path: ca.crt
                name: kube-root-ca.crt
            - downwardAPI:
                items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
            - configMap:
                items:
                - key: service-ca.crt
                  path: service-ca.crt
                name: openshift-service-ca.crt
      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:16:47Z"
          status: "True"
          type: Initialized
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:27:28Z"
          status: "True"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:27:28Z"
          status: "True"
          type: ContainersReady
        - lastProbeTime: null
          lastTransitionTime: "2022-06-27T23:16:47Z"
          status: "True"
          type: PodScheduled
        containerStatuses:
        - containerID: cri-o://2c56089fabbaa6d7067feb231750ad20422e299dc33dabc0cd19ceca5951ed3a
          image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
          lastState:
            terminated:
              containerID: cri-o://76cc5448de346ec636070a71868432c8f3aa213e16fe211c8fe18a3fee112d23
              exitCode: 1
              finishedAt: "2022-06-27T23:26:36Z"
              message: "d/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963629913e+09\tERROR\tFailed
                to get API Group-Resources\t{\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection refused\"}
                \nsigs.k8s.io/controller-runtime/pkg/cluster.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160\nsigs.k8s.io/controller-runtime/pkg/manager.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:322\nmain.run\n\t/build/cmd/package-server-manager/main.go:67\ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963631017e+09\tERROR\tsetup\tfailed
                to setup manager instance\t{\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection refused\"}
                \ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\nError:
                Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443:
                connect: connection refused\nencountered an error while executing the binary:
                Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443:
                connect: connection refused\n"
              reason: Error
              startedAt: "2022-06-27T23:26:11Z"
          name: package-server-manager
          ready: true
          restartCount: 2
          started: true
          state:
            running:
              startedAt: "2022-06-27T23:26:54Z"
        hostIP: 10.0.190.130
        phase: Running
        podIP: 10.130.0.13
        podIPs:
        - ip: 10.130.0.13
        qosClass: Burstable
        startTime: "2022-06-27T23:16:47Z"

      — Additional comment from jiazha@redhat.com on 2022-06-29 08:43:51 UTC —

      Observed the error restarts:

      mac:~ jianzhang$ oc get clusterversion
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.0-0.nightly-2022-06-28-160049   True        False         5h57m   Cluster version is 4.11.0-0.nightly-2022-06-28-160049

      mac:~ jianzhang$ oc get pods
      NAME                                      READY   STATUS      RESTARTS      AGE
      catalog-operator-7b88dddfbc-rsfhz         1/1     Running     6 (26m ago)   5h51m
      collect-profiles-27608160-6m7r6           0/1     Completed   0             37m
      collect-profiles-27608175-94n56           0/1     Completed   0             22m
      collect-profiles-27608190-nbzcf           0/1     Completed   0             7m55s
      olm-operator-5977ffb855-lgfn8             1/1     Running     0             9h
      package-server-manager-75db6dcfc-hql4v    1/1     Running     0             9h
      packageserver-5955fb79cd-9n56n            1/1     Running     0             9h
      packageserver-5955fb79cd-xf6f6            1/1     Running     0             9h

      mac:~ jianzhang$ oc get pods catalog-operator-7b88dddfbc-rsfhz -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
      annotations:
      k8s.v1.cni.cncf.io/network-status: |-
      [{
      "name": "openshift-sdn",
      "interface": "eth0",
      "ips": [
      "10.130.0.121"
      ],
      "default": true,
      "dns": {}
      }]
      k8s.v1.cni.cncf.io/networks-status: |-
      [{
      "name": "openshift-sdn",
      "interface": "eth0",
      "ips": [
      "10.130.0.121"
      ],
      "default": true,
      "dns": {}
      }]
      openshift.io/scc: nonroot-v2
      seccomp.security.alpha.kubernetes.io/pod: runtime/default
      creationTimestamp: "2022-06-29T02:46:23Z"
      generateName: catalog-operator-7b88dddfbc-
      labels:
      app: catalog-operator
      pod-template-hash: 7b88dddfbc
      name: catalog-operator-7b88dddfbc-rsfhz
      namespace: openshift-operator-lifecycle-manager
      ownerReferences:

      • apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: ReplicaSet
        name: catalog-operator-7b88dddfbc
        uid: 4c796f4d-b2f2-4e0d-b2de-6e8ab5f45ce2
        resourceVersion: "278732"
        uid: 10e8c6e9-8f1c-4484-b934-c5208016c426
        spec:
        containers:
      • args:
      • --debug
      • --namespace
      • openshift-marketplace
      • --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
      • --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
      • --util-image
      • quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
      • --writeStatusName
      • operator-lifecycle-manager-catalog
      • --tls-cert
      • /srv-cert/tls.crt
      • --tls-key
      • /srv-cert/tls.key
      • --client-ca
      • /profile-collector-cert/tls.crt
        command:
      • /bin/catalog
        env:
      • name: RELEASE_VERSION
        value: 4.11.0-0.nightly-2022-06-28-160049
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
        imagePullPolicy: IfNotPresent
        livenessProbe:
        failureThreshold: 3
        httpGet:
        path: /healthz
        port: 8443
        scheme: HTTPS
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
        name: catalog-operator
        ports:
      • containerPort: 8443
        name: metrics
        protocol: TCP
        readinessProbe:
        failureThreshold: 3
        httpGet:
        path: /healthz
        port: 8443
        scheme: HTTPS
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
        resources:
        requests:
        cpu: 10m
        memory: 80Mi
        securityContext:
        allowPrivilegeEscalation: false
        capabilities:
        drop:
      • ALL
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
      • mountPath: /srv-cert
        name: srv-cert
        readOnly: true
      • mountPath: /profile-collector-cert
        name: profile-collector-cert
        readOnly: true
      • mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-cv642
        readOnly: true
        dnsPolicy: ClusterFirst
        enableServiceLinks: true
        imagePullSecrets:
      • name: olm-operator-serviceaccount-dockercfg-f26bf
        nodeName: ip-10-0-130-83.ap-south-1.compute.internal
        nodeSelector:
        kubernetes.io/os: linux
        node-role.kubernetes.io/master: ""
        preemptionPolicy: PreemptLowerPriority
        priority: 2000000000
        priorityClassName: system-cluster-critical
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        seLinuxOptions:
        level: s0:c20,c0
        seccompProfile:
        type: RuntimeDefault
        serviceAccount: olm-operator-serviceaccount
        serviceAccountName: olm-operator-serviceaccount
        terminationGracePeriodSeconds: 30
        tolerations:
      • effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      • effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 120
      • effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 120
      • effect: NoSchedule
        key: node.kubernetes.io/memory-pressure
        operator: Exists
        volumes:
      • name: srv-cert
        secret:
        defaultMode: 420
        secretName: catalog-operator-serving-cert
      • name: profile-collector-cert
        secret:
        defaultMode: 420
        secretName: pprof-cert
      • name: kube-api-access-cv642
        projected:
        defaultMode: 420
        sources:
      • serviceAccountToken:
        expirationSeconds: 3607
        path: token
      • configMap:
        items:
      • key: ca.crt
        path: ca.crt
        name: kube-root-ca.crt
      • downwardAPI:
        items:
      • fieldRef:
        apiVersion: v1
        fieldPath: metadata.namespace
        path: namespace
      • configMap:
        items:
      • key: service-ca.crt
        path: service-ca.crt
        name: openshift-service-ca.crt
        status:
        conditions:
      • lastProbeTime: null
        lastTransitionTime: "2022-06-29T02:46:23Z"
        status: "True"
        type: Initialized
      • lastProbeTime: null
        lastTransitionTime: "2022-06-29T08:11:56Z"
        status: "True"
        type: Ready
      • lastProbeTime: null
        lastTransitionTime: "2022-06-29T08:11:56Z"
        status: "True"
        type: ContainersReady
      • lastProbeTime: null
        lastTransitionTime: "2022-06-29T02:46:23Z"
        status: "True"
        type: PodScheduled
        containerStatuses:
      • containerID: cri-o://64a99805cca11e9af66452d739a91259566a35a9e580bfac81dd7205f9272e18
        image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
        imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
        lastState:
        terminated:
        containerID: cri-o://938e2eb90465caf8ac99ff4405cfbb9a0f03b16598474f944cca44c3e70af9df
        exitCode: 2
        finishedAt: "2022-06-29T08:11:54Z"
        message: "ub.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128
        +0xc3 fp=0xc003c11180 sp=0xc003c11118 pc=0x1a36603\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc000a2f420,
        {0x237a238, 0xc0004ce4c0}, {0x236a448, 0xc004993ce0})\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77
        +0x535 fp=0xc003c11648 sp=0xc003c11180 pc=0x1a4f535\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc0001e3ad0,
        {0x237a238, 0xc0004ce4c0}

        , 0xc000cee360)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287
        +0x57c fp=0xc003c11f70 sp=0xc003c11648 pc=0x1a3ca7c\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x10000c0008fd6e0?,

        {0x237a238, 0xc0004ce4c0}

        , 0xc0004837b8?)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231
        +0x45 fp=0xc003c11fb0 sp=0xc003c11f70 pc=0x1a3c4a5\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221
        +0x32 fp=0xc003c11fe0 sp=0xc003c11fb0 pc=0x1a3c152\nruntime.goexit()\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1571
        +0x1 fp=0xc003c11fe8 sp=0xc003c11fe0 pc=0x4719c1\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221
        +0x557\n"
        reason: Error
        startedAt: "2022-06-29T07:56:16Z"
        name: catalog-operator
        ready: true
        restartCount: 6
        started: true
        state:
        running:
        startedAt: "2022-06-29T08:11:55Z"
        hostIP: 10.0.130.83
        phase: Running
        podIP: 10.130.0.121
        podIPs:

      • ip: 10.130.0.121
        qosClass: Burstable
        startTime: "2022-06-29T02:46:23Z"

      — Additional comment from jiazha@redhat.com on 2022-07-04 03:50:38 UTC —

      Please ignore comments 4 and 5; they are unrelated to this issue.

      — Additional comment from jiazha@redhat.com on 2022-07-04 06:57:24 UTC —

      Check the `previous` log.

      mac:~ jianzhang$ oc logs catalog-operator-f8ddcb57b-j5rf2 --previous
      time="2022-07-03T23:49:00Z" level=info msg="log level info"
      ...
      ...
      time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
      time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
      fatal error: concurrent map writes
      fatal error: concurrent map writes

      goroutine 559 [running]:
      runtime.throw(

      {0x2048546?, 0x0?}

      )
      /usr/lib/golang/src/runtime/panic.go:992 +0x71 fp=0xc001f9c508 sp=0xc001f9c4d8 pc=0x43e9f1
      runtime.mapassign_faststr(0x1d09880, 0xc0031847b0,

      {0x2079799, 0x2e}

      )
      /usr/lib/golang/src/runtime/map_faststr.go:295 +0x38b fp=0xc001f9c570 sp=0xc001f9c508 pc=0x419b4b
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.Pod(0xc001f4a900,

      {0x203dd0f, 0xf}

      ,

      {0xc00132ccc0, 0x38}

      ,

      {0xc003582d50, 0x13}, 0xc00452c1e0, 0xc0031847b0, 0x5, ...)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/reconciler.go:227 +0xbdf fp=0xc001f9cbb0 sp=0xc001f9c570 pc=0x1a475ff
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*grpcCatalogSourceDecorator).Pod(0xc001f9ce50, {0xc003582d50, 0x13}

      )
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:125 +0xf9 fp=0xc001f9cc30 sp=0xc001f9cbb0 pc=0x1a42c99
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).currentPodsWithCorrectImageAndSpec(0xc001f9ce68?,

      {0xc001f4a900}

      ,

      {0xc003582d50, 0x13}

      )
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:190 +0x198 fp=0xc001f9ce48 sp=0xc001f9cc30 pc=0x1a437b8
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).CheckRegistryServer(0xc000bcbf80?, 0x493b77?)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:453 +0x4c fp=0xc001f9ce88 sp=0xc001f9ce48 pc=0x1a45fcc
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).healthy(0x38ca8453?, 0xc001f4a900)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:196 +0x7e fp=0xc001f9ced0 sp=0xc001f9ce88 pc=0x1a4ae1e
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).health(0x1bc37c0?, 0xc003e7e7e0, 0x8?)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:159 +0x2a fp=0xc001f9cf10 sp=0xc001f9ced0 pc=0x1a4ac8a
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).catalogHealth(0xc000a59a90,

      {0xc003356a50, 0x11}

      )
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:137 +0x387 fp=0xc001f9d040 sp=0xc001f9cf10 pc=0x1a4a827
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).Reconcile(0xc000a59a90,

      {0x237a238, 0xc0003fe5c0}, {0x7f9f6e5b3900?, 0xc0050f64f0?})
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:82 +0x21e fp=0xc001f9d118 sp=0xc001f9d040 pc=0x1a4a1be
      github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate.ReconcilerChain.Reconcile({0xc0008e43c0?, 0x3, 0xc001f9d258?}, {0x237a238, 0xc0003fe5c0}

      ,

      {0x7f9f6e5b3328?, 0xc0050f6490?}

      )
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128 +0xc3 fp=0xc001f9d180 sp=0xc001f9d118 pc=0x1a36603
      github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc0004dfd50,

      {0x237a238, 0xc0003fe5c0}, {0x236a448, 0xc003cbb760})
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77 +0x535 fp=0xc001f9d648 sp=0xc001f9d180 pc=0x1a4f535
      github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35
      github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc00057a580, {0x237a238, 0xc0003fe5c0}

      , 0xc000954720)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x57c fp=0xc001f9df70 sp=0xc001f9d648 pc=0x1a3ca7c
      github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x0?,

      {0x237a238, 0xc0003fe5c0}

      , 0x0?)
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x45 fp=0xc001f9dfb0 sp=0xc001f9df70 pc=0x1a3c4a5
      github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3()
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x32 fp=0xc001f9dfe0 sp=0xc001f9dfb0 pc=0x1a3c152
      runtime.goexit()
      /usr/lib/golang/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc001f9dfe8 sp=0xc001f9dfe0 pc=0x4719c1
      created by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start
      /build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x557

      It seems to have failed at: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227
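For context on the failure mode: the Go runtime detects unsynchronized writes to a plain map and aborts the entire process with exactly this `fatal error: concurrent map writes`. It is a fatal throw, not a recoverable panic, which is why the container exits and restarts rather than logging and continuing. A minimal sketch of the standard remedy, serializing writers behind a mutex (all names here are illustrative, not from the OLM code), looks like:

```go
package main

import (
	"fmt"
	"sync"
)

// safeSet guards a map with a mutex. Two goroutines writing the same
// plain map without this guard is precisely what makes the Go runtime
// abort with "fatal error: concurrent map writes".
type safeSet struct {
	mu sync.Mutex
	m  map[string]string
}

func (s *safeSet) set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = v
}

func main() {
	s := &safeSet{m: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				s.set("key", fmt.Sprintf("writer-%d", n))
			}
		}(i)
	}
	wg.Wait()
	fmt.Println(len(s.m)) // 1: every writer targeted the same key
}
```

In OLM's case, though, the better fix is not a lock but avoiding shared mutable state altogether, i.e. not writing into a map owned by the informer cache (see the next comment).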

      — Additional comment from agreene@redhat.com on 2022-07-05 16:01:22 UTC —

      As Jian pointed out, the catalog operator is failing due to a concurrent write at https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227.

      This is happening because line 227 of reconciler.go directly mutates the catalogSource's annotations. The grpcCatalogSourceDecorator's Annotations function should return a copy of the annotations, or the decorator should be constructed with a deep copy of the catalogSource, so that an object in the lister cache is never mutated.

      This doesn't seem to be a blocker, but we should land a fix swiftly.
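The racy pattern and the proposed fix can be sketched as follows. The types, the annotation key, and the function names are simplified, hypothetical stand-ins for the real OLM code at reconciler.go:227; the point is only copy-before-mutate versus writing into the cached object:

```go
package main

import "fmt"

// CatalogSource is a hypothetical, stripped-down stand-in for the OLM type.
type CatalogSource struct {
	Annotations map[string]string
}

// buggyPod mirrors the problematic pattern: it writes directly into the
// annotations map owned by the shared lister cache, so two sync workers
// reconciling the same CatalogSource race on the same map.
func buggyPod(cs *CatalogSource) map[string]string {
	cs.Annotations["example.io/managed-by"] = "catalog-operator" // concurrent map write
	return cs.Annotations
}

// fixedPod copies the annotations first, leaving the cached object untouched.
func fixedPod(cs *CatalogSource) map[string]string {
	annotations := make(map[string]string, len(cs.Annotations)+1)
	for k, v := range cs.Annotations {
		annotations[k] = v
	}
	annotations["example.io/managed-by"] = "catalog-operator"
	return annotations
}

func main() {
	cached := &CatalogSource{Annotations: map[string]string{"owner": "olm"}}
	podAnnotations := fixedPod(cached)
	fmt.Println(len(cached.Annotations)) // 1: cached object unchanged
	fmt.Println(len(podAnnotations))     // 2: the pod's copy has the extra key
}
```

Because every sync worker that reconciles the same CatalogSource receives a pointer to the same cached object, the direct write races across goroutines; copying the map (or deep-copying the whole object, as suggested above) gives each worker its own map to mutate.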

      — Additional comment from jiazha@redhat.com on 2022-07-13 05:02:04 UTC —

      1, Create a cluster with the fixed PR via the Cluster-bot.
      mac:~ jianzhang$ oc get clusterversion
      NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest   True        False         126m    Cluster version is 4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest

      2, Subscribe some operators.
      mac:~ jianzhang$ oc get sub -A
      NAMESPACE                    NAME                     PACKAGE                  SOURCE                CHANNEL
      default                      etcd                     etcd                     community-operators   singlenamespace-alpha
      openshift-logging            cluster-logging          cluster-logging          redhat-operators      stable
      openshift-operators-redhat   elasticsearch-operator   elasticsearch-operator   redhat-operators      stable

      mac:~ jianzhang$ oc get sub -A
      NAMESPACE                    NAME                     PACKAGE                  SOURCE                CHANNEL
      default                      etcd                     etcd                     community-operators   singlenamespace-alpha
      openshift-logging            cluster-logging          cluster-logging          redhat-operators      stable
      openshift-operators-redhat   elasticsearch-operator   elasticsearch-operator   redhat-operators      stable
      mac:~ jianzhang$
      mac:~ jianzhang$
      mac:~ jianzhang$ oc get csv -n openshift-operators-redhat
      NAME                           DISPLAY                            VERSION   REPLACES   PHASE
      elasticsearch-operator.5.4.2   OpenShift Elasticsearch Operator   5.4.2                Succeeded
      mac:~ jianzhang$ oc get csv -n openshift-logging
      NAME                           DISPLAY                            VERSION   REPLACES   PHASE
      cluster-logging.5.4.2          Red Hat OpenShift Logging          5.4.2                Succeeded
      elasticsearch-operator.5.4.2   OpenShift Elasticsearch Operator   5.4.2                Succeeded
      mac:~ jianzhang$ oc get csv -n default
      NAME                           DISPLAY                            VERSION   REPLACES              PHASE
      elasticsearch-operator.5.4.2   OpenShift Elasticsearch Operator   5.4.2                           Succeeded
      etcdoperator.v0.9.4            etcd                               0.9.4     etcdoperator.v0.9.2   Succeeded

      3, Check OLM catalog-operator pods status.
      mac:~ jianzhang$ oc get pods
      NAME                                      READY   STATUS      RESTARTS   AGE
      catalog-operator-546db7cdf5-7pldg         1/1     Running     0          145m
      collect-profiles-27628110-lr2nv           0/1     Completed   0          30m
      collect-profiles-27628125-br8b8           0/1     Completed   0          15m
      collect-profiles-27628140-m64gp           0/1     Completed   0          38s
      olm-operator-754d7f6f56-26qhw             1/1     Running     0          145m
      package-server-manager-77d5cbf696-v9w4p   1/1     Running     0          145m
      packageserver-6884994d98-2smtw            1/1     Running     0          143m
      packageserver-6884994d98-5d7jg            1/1     Running     0          143m

      mac:~ jianzhang$ oc logs catalog-operator-546db7cdf5-7pldg --previous
      Error from server (BadRequest): previous terminated container "catalog-operator" in pod "catalog-operator-546db7cdf5-7pldg" not found

      No terminated container found; catalog-operator works well. Marking it verified.

      — Additional comment from aos-team-art-private@redhat.com on 2022-07-13 22:50:04 UTC —

      Elliott changed bug status from MODIFIED to ON_QA.
      This bug is expected to ship in the next 4.12 release.

      — Additional comment from jiazha@redhat.com on 2022-07-18 07:23:08 UTC —

      Changed the status to VERIFIED based on comment 10.

          People

            pegoncal@redhat.com Per Goncalves da Silva
            openshift-crt-jira-prow OpenShift Prow Bot
            Jian Zhang Jian Zhang
