Issue Type: Bug
Resolution: Done
Priority: Undefined
Fix Version: 4.10.z
This bug is a backport clone of [Bugzilla Bug 2117324](https://bugzilla.redhat.com/show_bug.cgi?id=2117324). The following is the description of the original bug:
—
+++ This bug was initially created as a clone of Bug #2101357 +++
Description of problem:
message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
+0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780,
0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?,
0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480,
0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180,
0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51
+0x85\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50
+0x231\n"
reason: Error
startedAt: "2022-06-27T00:00:59Z"
Version-Release number of selected component (if applicable):
mac:~ jianzhang$ oc exec catalog-operator-66cb8fd8c5-j7vkx -- olm --version
OLM version: 0.19.0
git commit: 8c2bd46147a90d58e98de73d34fd79477769f11f
mac:~ jianzhang$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-06-25-081133 True False 10h Cluster version is 4.11.0-0.nightly-2022-06-25-081133
How reproducible:
always
Steps to Reproduce:
1. Install OCP 4.11
2. Check OLM pods
Actual results:
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
catalog-operator-66cb8fd8c5-j7vkx 1/1 Running 2 (8h ago) 10h
collect-profiles-27605340-wgsvf 0/1 Completed 0 42m
collect-profiles-27605355-ffgxd 0/1 Completed 0 27m
collect-profiles-27605370-w7ds7 0/1 Completed 0 12m
olm-operator-6cfd444b8f-r5q4t 1/1 Running 0 10h
package-server-manager-66589d4bf8-csr7j 1/1 Running 0 10h
packageserver-59977db6cf-nkn5w 1/1 Running 0 10h
packageserver-59977db6cf-nxbnx 1/1 Running 0 10h
mac:~ jianzhang$ oc get pods catalog-operator-66cb8fd8c5-j7vkx -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.26"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.26"
],
"default": true,
"dns": {}
}]
openshift.io/scc: nonroot-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
creationTimestamp: "2022-06-26T23:12:45Z"
generateName: catalog-operator-66cb8fd8c5-
labels:
app: catalog-operator
pod-template-hash: 66cb8fd8c5
name: catalog-operator-66cb8fd8c5-j7vkx
namespace: openshift-operator-lifecycle-manager
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: catalog-operator-66cb8fd8c5
uid: bcf173be-97bc-4152-8cec-d45f820e167c
resourceVersion: "67395"
uid: bb34c37b-b22d-4412-bbc0-1e57a1b2bd3a
spec:
containers:
- args:
- --namespace
- openshift-marketplace
- --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4
- --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:033bd57b54bb2e827b0ae95a03cb6f7ceeb65422e85de635185cfed88c17b2b4
- --util-image
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
- --writeStatusName
- operator-lifecycle-manager-catalog
- --tls-cert
- /srv-cert/tls.crt
- --tls-key
- /srv-cert/tls.key
- --client-ca
- /profile-collector-cert/tls.crt
command:
- /bin/catalog
env:
- name: RELEASE_VERSION
value: 4.11.0-0.nightly-2022-06-25-081133
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: catalog-operator
ports:
- containerPort: 8443
name: metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 10m
memory: 80Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /srv-cert
name: srv-cert
readOnly: true
- mountPath: /profile-collector-cert
name: profile-collector-cert
readOnly: true
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-7txv8
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-10-0-173-128.ap-southeast-1.compute.internal
nodeSelector:
kubernetes.io/os: linux
node-role.kubernetes.io/master: ""
preemptionPolicy: PreemptLowerPriority
priority: 2000000000
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
runAsNonRoot: true
runAsUser: 65534
seLinuxOptions:
level: s0:c20,c0
seccompProfile:
type: RuntimeDefault
serviceAccount: olm-operator-serviceaccount
serviceAccountName: olm-operator-serviceaccount
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
volumes:
- name: srv-cert
secret:
defaultMode: 420
secretName: catalog-operator-serving-cert
- name: profile-collector-cert
secret:
defaultMode: 420
secretName: pprof-cert
- name: kube-api-access-7txv8
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- configMap:
items:
- key: service-ca.crt
path: service-ca.crt
name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-06-26T23:16:59Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-06-27T01:19:09Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-06-27T01:19:09Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-06-26T23:16:59Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: cri-o://f3fce12480556b5ab279236c290acdf15e1bc850426078730dcfd333ecda6795
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8d42dfe153a5bfa3a66e5e9d4664b4939aed376b4afbd9e098291c259daca2f8
lastState:
terminated:
containerID: cri-o://5e3ce04f1cf538f679af008dcb62f9477131aae1772f527d6207fdc7bb247a55
exitCode: 2
finishedAt: "2022-06-27T01:19:08Z"
message: "her.go:105 +0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
+0x130\n\ngoroutine 5545 [select, 7 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc00240a780,
0xc003321a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0xc002efea80?,
0xc0009cc7a0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5678 [select, 3 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc000b70480,
0xc0035d4500)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc002999e90?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5836 [select, 1 minutes]:\ngolang.org/x/net/http2.(*clientStream).writeRequest(0xc003b00180,
0xc003ff8a00)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1345
+0x9c9\ngolang.org/x/net/http2.(*clientStream).doRequest(0x6e5326?, 0xc003a1c8d0?)\n\t/build/vendor/golang.org/x/net/http2/transport.go:1207
+0x1e\ncreated by golang.org/x/net/http2.(*ClientConn).RoundTrip\n\t/build/vendor/golang.org/x/net/http2/transport.go:1136
+0x30a\n\ngoroutine 5905 [chan receive, 1 minutes]:\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel.func1()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:51
+0x85\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver.(*sourceInvalidator).GetValidChannel\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/resolver/source_registry.go:50
+0x231\n"
reason: Error
startedAt: "2022-06-27T00:00:59Z"
name: catalog-operator
ready: true
restartCount: 2
started: true
state:
running:
startedAt: "2022-06-27T01:19:08Z"
hostIP: 10.0.173.128
phase: Running
podIP: 10.130.0.26
podIPs:
- ip: 10.130.0.26
qosClass: Burstable
startTime: "2022-06-26T23:16:59Z"
Expected results:
catalog-operator works well, with no unexpected restarts.
Additional info:
Operators can be subscribed successfully.
mac:~ jianzhang$ oc get sub -A
NAMESPACE NAME PACKAGE SOURCE CHANNEL
jian learn learn qe-app-registry beta
openshift-logging cluster-logging cluster-logging qe-app-registry stable
openshift-operators-redhat elasticsearch-operator elasticsearch-operator qe-app-registry stable
mac:~ jianzhang$
mac:~ jianzhang$ oc get pods -n jian
NAME READY STATUS RESTARTS AGE
552b4660850a7fe1e1f142091eb5e4305f18af151727c56f70aa5dffc1dg8cg 0/1 Completed 0 54m
learn-operator-666b687bfb-7qppm 1/1 Running 0 54m
qe-app-registry-hbzxg 1/1 Running 0 58m
mac:~ jianzhang$ oc get csv -n jian
NAME DISPLAY VERSION REPLACES PHASE
elasticsearch-operator.v5.5.0 OpenShift Elasticsearch Operator 5.5.0 Succeeded
learn-operator.v0.0.3 Learn Operator 0.0.3 learn-operator.v0.0.2 Succeeded
— Additional comment from jiazha@redhat.com on 2022-06-27 09:58:18 UTC —
Created attachment 1892927
olm must-gather
— Additional comment from jiazha@redhat.com on 2022-06-27 09:59:01 UTC —
Created attachment 1892928
marketplace project must-gather
— Additional comment from jiazha@redhat.com on 2022-06-28 02:05:39 UTC —
mac:~ jianzhang$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-06-25-132614 True False 145m Cluster version is 4.11.0-0.nightly-2022-06-25-132614
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
catalog-operator-869fb4bd4d-lbhgj 1/1 Running 3 (9m25s ago) 170m
collect-profiles-27606330-4wg5r 0/1 Completed 0 33m
collect-profiles-27606345-lmk4q 0/1 Completed 0 18m
collect-profiles-27606360-mksv6 0/1 Completed 0 3m17s
olm-operator-5f485d9d5f-wczjc 1/1 Running 0 170m
package-server-manager-6cf996b4cc-79lrw 1/1 Running 2 (156m ago) 170m
packageserver-5f668f98d7-2vjdn 1/1 Running 0 165m
packageserver-5f668f98d7-mb2wc 1/1 Running 0 165m
mac:~ jianzhang$ oc get pods catalog-operator-869fb4bd4d-lbhgj -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.34"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.34"
],
"default": true,
"dns": {}
}]
openshift.io/scc: nonroot-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
creationTimestamp: "2022-06-27T23:13:12Z"
generateName: catalog-operator-869fb4bd4d-
labels:
app: catalog-operator
pod-template-hash: 869fb4bd4d
name: catalog-operator-869fb4bd4d-lbhgj
namespace: openshift-operator-lifecycle-manager
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: catalog-operator-869fb4bd4d
uid: 3a1a8cd3-2151-4650-a96b-c1951d461c67
resourceVersion: "75671"
uid: 2f06d663-0697-4e88-9e9a-f802c6641efe
spec:
containers:
- args:
- --namespace
- openshift-marketplace
- --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
- --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
- --util-image
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
- --writeStatusName
- operator-lifecycle-manager-catalog
- --tls-cert
- /srv-cert/tls.crt
- --tls-key
- /srv-cert/tls.key
- --client-ca
- /profile-collector-cert/tls.crt
command:
- /bin/catalog
env:
- name: RELEASE_VERSION
value: 4.11.0-0.nightly-2022-06-25-132614
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: catalog-operator
ports:
- containerPort: 8443
name: metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 10m
memory: 80Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /srv-cert
name: srv-cert
readOnly: true
- mountPath: /profile-collector-cert
name: profile-collector-cert
readOnly: true
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-sdjns
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-10-0-190-130.ap-south-1.compute.internal
nodeSelector:
kubernetes.io/os: linux
node-role.kubernetes.io/master: ""
preemptionPolicy: PreemptLowerPriority
priority: 2000000000
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
runAsNonRoot: true
runAsUser: 65534
seLinuxOptions:
level: s0:c20,c0
seccompProfile:
type: RuntimeDefault
serviceAccount: olm-operator-serviceaccount
serviceAccountName: olm-operator-serviceaccount
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
volumes:
- name: srv-cert
secret:
defaultMode: 420
secretName: catalog-operator-serving-cert
- name: profile-collector-cert
secret:
defaultMode: 420
secretName: pprof-cert
- name: kube-api-access-sdjns
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- configMap:
items:
- key: service-ca.crt
path: service-ca.crt
name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:16:47Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-06-28T01:53:53Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-06-28T01:53:53Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:16:47Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: cri-o://2a83d3ba503aa760d27aeef28bf03af84b9ff134ea76c6545251f3253507c22b
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
lastState:
terminated:
containerID: cri-o://2766bb3bad38c02f531915aac0321ffb18c7a993a7d84b27e8c7199935f7c858
exitCode: 2
finishedAt: "2022-06-28T01:53:52Z"
message: "izer/streaming/streaming.go:77 +0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc002b12500)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49
+0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0036178c0)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105
+0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
+0x130\n\ngoroutine 3479 [sync.Cond.Wait]:\nsync.runtime_notifyListWait(0xc003a987c8,
0x8)\n\t/usr/lib/golang/src/runtime/sema.go:513 +0x13d\nsync.(*Cond).Wait(0xc001f9bf10?)\n\t/usr/lib/golang/src/sync/cond.go:56
+0x8c\ngolang.org/x/net/http2.(*pipe).Read(0xc003a987b0, {0xc0036e4001, 0xdff, 0xdff})\n\t/build/vendor/golang.org/x/net/http2/pipe.go:76 +0xeb\ngolang.org/x/net/http2.transportResponseBody.Read({0x0?}, {0xc0036e4001?, 0x2?, 0x203511d?})\n\t/build/vendor/golang.org/x/net/http2/transport.go:2407
+0x85\nencoding/json.(*Decoder).refill(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:165
+0x17f\nencoding/json.(*Decoder).readValue(0xc002fc0640)\n\t/usr/lib/golang/src/encoding/json/stream.go:140
+0xbb\nencoding/json.(*Decoder).Decode(0xc002fc0640, {0x1d377c0, 0xc003523098})\n\t/usr/lib/golang/src/encoding/json/stream.go:63
+0x78\nk8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0xc003127770, {0xc0036dd500, 0x1000, 0x1500})\n\t/build/vendor/k8s.io/apimachinery/pkg/util/framer/framer.go:152
+0x19c\nk8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0xc003502aa0, 0xc001f9bf10?, {0x2366870, 0xc0044dca80})\n\t/build/vendor/k8s.io/apimachinery/pkg/runtime/serializer/streaming/streaming.go:77
+0xa7\nk8s.io/client-go/rest/watch.(*Decoder).Decode(0xc00059f700)\n\t/build/vendor/k8s.io/client-go/rest/watch/decoder.go:49
+0x4f\nk8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0xc0044dcd40)\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:105
+0xe5\ncreated by k8s.io/apimachinery/pkg/watch.NewStreamWatcher\n\t/build/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:76
+0x130\n"
reason: Error
startedAt: "2022-06-28T01:06:59Z"
name: catalog-operator
ready: true
restartCount: 3
started: true
state:
running:
startedAt: "2022-06-28T01:53:53Z"
hostIP: 10.0.190.130
phase: Running
podIP: 10.130.0.34
podIPs:
- ip: 10.130.0.34
qosClass: Burstable
startTime: "2022-06-27T23:16:47Z"
— Additional comment from jiazha@redhat.com on 2022-06-28 02:09:23 UTC —
mac:~ jianzhang$ oc get pods package-server-manager-6cf996b4cc-79lrw -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.13"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.13"
],
"default": true,
"dns": {}
}]
openshift.io/scc: nonroot-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
creationTimestamp: "2022-06-27T23:13:10Z"
generateName: package-server-manager-6cf996b4cc-
labels:
app: package-server-manager
pod-template-hash: 6cf996b4cc
name: package-server-manager-6cf996b4cc-79lrw
namespace: openshift-operator-lifecycle-manager
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: package-server-manager-6cf996b4cc
uid: 191cb627-a1f3-452b-aed2-336de7871004
resourceVersion: "19056"
uid: 63cb7200-32d6-4b0d-bfb2-ba725cda7fbd
spec:
containers:
- args:
- --name
- $(PACKAGESERVER_NAME)
- --namespace
- $(PACKAGESERVER_NAMESPACE)
command:
- /bin/psm
- start
env:
- name: PACKAGESERVER_NAME
value: packageserver
- name: PACKAGESERVER_IMAGE
value: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
- name: PACKAGESERVER_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: RELEASE_VERSION
value: 4.11.0-0.nightly-2022-06-25-132614
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: package-server-manager
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 10m
memory: 50Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-k9hh2
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-10-0-190-130.ap-south-1.compute.internal
nodeSelector:
kubernetes.io/os: linux
node-role.kubern
— Additional comment from jiazha@redhat.com on 2022-06-28 02:10:02 UTC —
preemptionPolicy: PreemptLowerPriority
priority: 2000000000
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
runAsNonRoot: true
runAsUser: 65534
seLinuxOptions:
level: s0:c20,c0
seccompProfile:
type: RuntimeDefault
serviceAccount: olm-operator-serviceaccount
serviceAccountName: olm-operator-serviceaccount
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
volumes:
- name: kube-api-access-k9hh2
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
- configMap:
items:
- key: service-ca.crt
path: service-ca.crt
name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:16:47Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:27:28Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:27:28Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-06-27T23:16:47Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: cri-o://2c56089fabbaa6d7067feb231750ad20422e299dc33dabc0cd19ceca5951ed3a
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
lastState:
terminated:
containerID: cri-o://76cc5448de346ec636070a71868432c8f3aa213e16fe211c8fe18a3fee112d23
exitCode: 1
finishedAt: "2022-06-27T23:26:36Z"
message: "d/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963629913e+09\tERROR\tFailed
to get API Group-Resources\t {\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection refused\"}\nsigs.k8s.io/controller-runtime/pkg/cluster.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/cluster/cluster.go:160\nsigs.k8s.io/controller-runtime/pkg/manager.New\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go:322\nmain.run\n\t/build/cmd/package-server-manager/main.go:67\ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\n1.6563723963631017e+09\tERROR\tsetup\tfailed
to setup manager instance\t{\"error\": \"Get \\\"https://172.30.0.1:443/api?timeout=32s\\\": dial tcp 172.30.0.1:443: connect: connection refused\"}\ngithub.com/spf13/cobra.(*Command).execute\n\t/build/vendor/github.com/spf13/cobra/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/build/vendor/github.com/spf13/cobra/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/build/vendor/github.com/spf13/cobra/command.go:902\nmain.main\n\t/build/cmd/package-server-manager/main.go:36\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:250\nError:
Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443:
connect: connection refused\nencountered an error while executing the binary:
Get \"https://172.30.0.1:443/api?timeout=32s\": dial tcp 172.30.0.1:443:
connect: connection refused\n"
reason: Error
startedAt: "2022-06-27T23:26:11Z"
name: package-server-manager
ready: true
restartCount: 2
started: true
state:
running:
startedAt: "2022-06-27T23:26:54Z"
hostIP: 10.0.190.130
phase: Running
podIP: 10.130.0.13
podIPs:
- ip: 10.130.0.13
qosClass: Burstable
startTime: "2022-06-27T23:16:47Z"
— Additional comment from jiazha@redhat.com on 2022-06-29 08:43:51 UTC —
Observed the erroneous restarts again:
mac:~ jianzhang$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-06-28-160049 True False 5h57m Cluster version is 4.11.0-0.nightly-2022-06-28-160049
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
catalog-operator-7b88dddfbc-rsfhz 1/1 Running 6 (26m ago) 5h51m
collect-profiles-27608160-6m7r6 0/1 Completed 0 37m
collect-profiles-27608175-94n56 0/1 Completed 0 22m
collect-profiles-27608190-nbzcf 0/1 Completed 0 7m55s
olm-operator-5977ffb855-lgfn8 1/1 Running 0 9h
package-server-manager-75db6dcfc-hql4v 1/1 Running 0 9h
packageserver-5955fb79cd-9n56n 1/1 Running 0 9h
packageserver-5955fb79cd-xf6f6 1/1 Running 0 9h
mac:~ jianzhang$ oc get pods catalog-operator-7b88dddfbc-rsfhz -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
k8s.v1.cni.cncf.io/network-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.121"
],
"default": true,
"dns": {}
}]
k8s.v1.cni.cncf.io/networks-status: |-
[{
"name": "openshift-sdn",
"interface": "eth0",
"ips": [
"10.130.0.121"
],
"default": true,
"dns": {}
}]
openshift.io/scc: nonroot-v2
seccomp.security.alpha.kubernetes.io/pod: runtime/default
creationTimestamp: "2022-06-29T02:46:23Z"
generateName: catalog-operator-7b88dddfbc-
labels:
app: catalog-operator
pod-template-hash: 7b88dddfbc
name: catalog-operator-7b88dddfbc-rsfhz
namespace: openshift-operator-lifecycle-manager
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: catalog-operator-7b88dddfbc
uid: 4c796f4d-b2f2-4e0d-b2de-6e8ab5f45ce2
resourceVersion: "278732"
uid: 10e8c6e9-8f1c-4484-b934-c5208016c426
spec:
containers:
- args:
- --debug
- --namespace
- openshift-marketplace
- --configmapServerImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
- --opmImage=quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:40552deca32531d2fe754f703613f24b514e27ffaa660b57d760f9cd984d9000
- --util-image
- quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
- --writeStatusName
- operator-lifecycle-manager-catalog
- --tls-cert
- /srv-cert/tls.crt
- --tls-key
- /srv-cert/tls.key
- --client-ca
- /profile-collector-cert/tls.crt
command:
- /bin/catalog
env:
- name: RELEASE_VERSION
value: 4.11.0-0.nightly-2022-06-28-160049
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: catalog-operator
ports:
- containerPort: 8443
name: metrics
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
requests:
cpu: 10m
memory: 80Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /srv-cert
name: srv-cert
readOnly: true
- mountPath: /profile-collector-cert
name: profile-collector-cert
readOnly: true
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-cv642
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
imagePullSecrets:
- name: olm-operator-serviceaccount-dockercfg-f26bf
nodeName: ip-10-0-130-83.ap-south-1.compute.internal
nodeSelector:
kubernetes.io/os: linux
node-role.kubernetes.io/master: ""
preemptionPolicy: PreemptLowerPriority
priority: 2000000000
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
runAsNonRoot: true
runAsUser: 65534
seLinuxOptions:
level: s0:c20,c0
seccompProfile:
type: RuntimeDefault
serviceAccount: olm-operator-serviceaccount
serviceAccountName: olm-operator-serviceaccount
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoSchedule
key: node.kubernetes.io/memory-pressure
operator: Exists
volumes:
- name: srv-cert
  secret:
    defaultMode: 420
    secretName: catalog-operator-serving-cert
- name: profile-collector-cert
  secret:
    defaultMode: 420
    secretName: pprof-cert
- name: kube-api-access-cv642
  projected:
    defaultMode: 420
    sources:
    - serviceAccountToken:
        expirationSeconds: 3607
        path: token
    - configMap:
        items:
        - key: ca.crt
          path: ca.crt
        name: kube-root-ca.crt
    - downwardAPI:
        items:
        - fieldRef:
            apiVersion: v1
            fieldPath: metadata.namespace
          path: namespace
    - configMap:
        items:
        - key: service-ca.crt
          path: service-ca.crt
        name: openshift-service-ca.crt
status:
conditions:
- lastProbeTime: null
  lastTransitionTime: "2022-06-29T02:46:23Z"
  status: "True"
  type: Initialized
- lastProbeTime: null
  lastTransitionTime: "2022-06-29T08:11:56Z"
  status: "True"
  type: Ready
- lastProbeTime: null
  lastTransitionTime: "2022-06-29T08:11:56Z"
  status: "True"
  type: ContainersReady
- lastProbeTime: null
  lastTransitionTime: "2022-06-29T02:46:23Z"
  status: "True"
  type: PodScheduled
containerStatuses:
- containerID: cri-o://64a99805cca11e9af66452d739a91259566a35a9e580bfac81dd7205f9272e18
image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ac69cd6ac451019bc6060627a1cceb475d2b6b521803addb31b47e501d5e5935
lastState:
terminated:
containerID: cri-o://938e2eb90465caf8ac99ff4405cfbb9a0f03b16598474f944cca44c3e70af9df
exitCode: 2
finishedAt: "2022-06-29T08:11:54Z"
message: "ub.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128
+0xc3 fp=0xc003c11180 sp=0xc003c11118 pc=0x1a36603\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc000a2f420,
{0x237a238, 0xc0004ce4c0}, {0x236a448, 0xc004993ce0})\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77
+0x535 fp=0xc003c11648 sp=0xc003c11180 pc=0x1a4f535\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc0001e3ad0,
{0x237a238, 0xc0004ce4c0}, 0xc000cee360)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287
{0x237a238, 0xc0004ce4c0}
+0x57c fp=0xc003c11f70 sp=0xc003c11648 pc=0x1a3ca7c\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x10000c0008fd6e0?,
, 0xc0004837b8?)\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231
+0x45 fp=0xc003c11fb0 sp=0xc003c11f70 pc=0x1a3c4a5\ngithub.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3()\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221
+0x32 fp=0xc003c11fe0 sp=0xc003c11fb0 pc=0x1a3c152\nruntime.goexit()\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1571
+0x1 fp=0xc003c11fe8 sp=0xc003c11fe0 pc=0x4719c1\ncreated by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start\n\t/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221
+0x557\n"
reason: Error
startedAt: "2022-06-29T07:56:16Z"
name: catalog-operator
ready: true
restartCount: 6
started: true
state:
running:
startedAt: "2022-06-29T08:11:55Z"
hostIP: 10.0.130.83
phase: Running
podIP: 10.130.0.121
podIPs:
- ip: 10.130.0.121
qosClass: Burstable
startTime: "2022-06-29T02:46:23Z"
— Additional comment from jiazha@redhat.com on 2022-07-04 03:50:38 UTC —
Please ignore comments 4 and 5; they have nothing to do with this issue.
— Additional comment from jiazha@redhat.com on 2022-07-04 06:57:24 UTC —
Check the `previous` log.
mac:~ jianzhang$ oc logs catalog-operator-f8ddcb57b-j5rf2 --previous
time="2022-07-03T23:49:00Z" level=info msg="log level info"
...
...
time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
time="2022-07-04T03:43:25Z" level=info msg=syncing event=update reconciling="*v1alpha1.Subscription" selflink=
fatal error: concurrent map writes
fatal error: concurrent map writes
goroutine 559 [running]:
runtime.throw(
)
/usr/lib/golang/src/runtime/panic.go:992 +0x71 fp=0xc001f9c508 sp=0xc001f9c4d8 pc=0x43e9f1
runtime.mapassign_faststr(0x1d09880, 0xc0031847b0,
)
/usr/lib/golang/src/runtime/map_faststr.go:295 +0x38b fp=0xc001f9c570 sp=0xc001f9c508 pc=0x419b4b
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.Pod(0xc001f4a900,
,
{0xc00132ccc0, 0x38},
{0xc003582d50, 0x13}, 0xc00452c1e0, 0xc0031847b0, 0x5, ...)/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/reconciler.go:227 +0xbdf fp=0xc001f9cbb0 sp=0xc001f9c570 pc=0x1a475ff
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*grpcCatalogSourceDecorator).Pod(0xc001f9ce50, {0xc003582d50, 0x13}
)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:125 +0xf9 fp=0xc001f9cc30 sp=0xc001f9cbb0 pc=0x1a42c99
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).currentPodsWithCorrectImageAndSpec(0xc001f9ce68?,
,
{0xc003582d50, 0x13})
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:190 +0x198 fp=0xc001f9ce48 sp=0xc001f9cc30 pc=0x1a437b8
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler.(*GrpcRegistryReconciler).CheckRegistryServer(0xc000bcbf80?, 0x493b77?)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/registry/reconciler/grpc.go:453 +0x4c fp=0xc001f9ce88 sp=0xc001f9ce48 pc=0x1a45fcc
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).healthy(0x38ca8453?, 0xc001f4a900)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:196 +0x7e fp=0xc001f9ced0 sp=0xc001f9ce88 pc=0x1a4ae1e
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).health(0x1bc37c0?, 0xc003e7e7e0, 0x8?)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:159 +0x2a fp=0xc001f9cf10 sp=0xc001f9ced0 pc=0x1a4ac8a
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).catalogHealth(0xc000a59a90,
)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:137 +0x387 fp=0xc001f9d040 sp=0xc001f9cf10 pc=0x1a4a827
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*catalogHealthReconciler).Reconcile(0xc000a59a90,
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/reconciler.go:82 +0x21e fp=0xc001f9d118 sp=0xc001f9d040 pc=0x1a4a1be
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate.ReconcilerChain.Reconcile({0xc0008e43c0?, 0x3, 0xc001f9d258?}, {0x237a238, 0xc0003fe5c0}
,
{0x7f9f6e5b3328?, 0xc0050f6490?})
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/kubestate/kubestate.go:128 +0xc3 fp=0xc001f9d180 sp=0xc001f9d118 pc=0x1a36603
github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription.(*subscriptionSyncer).Sync(0xc0004dfd50,
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/operators/catalog/subscription/syncer.go:77 +0x535 fp=0xc001f9d648 sp=0xc001f9d180 pc=0x1a4f535
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*QueueInformer).Sync(...)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer.go:35
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).processNextWorkItem(0xc00057a580, {0x237a238, 0xc0003fe5c0}
, 0xc000954720)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:287 +0x57c fp=0xc001f9df70 sp=0xc001f9d648 pc=0x1a3ca7c
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).worker(0x0?,
, 0x0?)
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:231 +0x45 fp=0xc001f9dfb0 sp=0xc001f9df70 pc=0x1a3c4a5
github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start.func3()
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x32 fp=0xc001f9dfe0 sp=0xc001f9dfb0 pc=0x1a3c152
runtime.goexit()
/usr/lib/golang/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc001f9dfe8 sp=0xc001f9dfe0 pc=0x4719c1
created by github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer.(*operator).start
/build/vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/lib/queueinformer/queueinformer_operator.go:221 +0x557
It appears to have failed at: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227
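For context, the Go runtime raises this exact fatal error whenever two goroutines write to the same map without synchronization. A minimal, self-contained sketch (not OLM code) of the same write pattern made safe with a mutex; removing the `mu.Lock()`/`mu.Unlock()` pair makes the runtime abort with `fatal error: concurrent map writes`, as seen in the log above:

```go
package main

import (
	"fmt"
	"sync"
)

// fill has two goroutines insert n keys each into a shared map. The
// sync.Mutex is what makes this safe; without it the Go runtime would
// detect the race and abort the whole process (it is not a recoverable
// panic), which is why the catalog-operator container exited.
func fill(n int) map[string]string {
	m := map[string]string{}
	var mu sync.Mutex
	var wg sync.WaitGroup
	for id := 0; id < 2; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < n; j++ {
				mu.Lock()
				m[fmt.Sprintf("k%d-%d", id, j)] = "v"
				mu.Unlock()
			}
		}(id)
	}
	wg.Wait()
	return m
}

func main() {
	fmt.Println(len(fill(1000))) // 2000 entries, no runtime fault
}
```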
— Additional comment from agreene@redhat.com on 2022-07-05 16:01:22 UTC —
As Jian pointed out, the catalog operator is failing due to a concurrent write at https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227.
This is happening because:
- The grpcCatalogSourceDecorator includes a pointer to a catalogSource: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L32-L35
- The grpcCatalogSourceDecorator's Annotations function returns the catalogSource's annotations: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L66
- The grpcCatalogSourceDecorator's Pod function uses the annotations returned from its Annotations function: https://github.com/operator-framework/operator-lifecycle-manager/blob/0b7970c3d609e1f483a0ba68f599a5ffd2e139b5/pkg/controller/registry/reconciler/grpc.go#L125
- The catalogSource's annotations are then mutated within the Pod function here: https://github.com/operator-framework/operator-lifecycle-manager/blob/master/pkg/controller/registry/reconciler/reconciler.go#L227
Line 227 of reconciler.go directly mutates the catalogSource's annotations. The grpcCatalogSourceDecorator's Annotations function should return a copy of the annotations, or the decorator should be constructed with a deep copy of the catalogSource, to avoid mutating an object in the lister cache.
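A sketch of the copy-on-return fix, using a hypothetical stand-in struct rather than the real CatalogSource API type (the actual decorator lives in pkg/controller/registry/reconciler/grpc.go): returning a fresh copy lets callers add annotations without mutating the object shared through the lister cache.

```go
package main

import "fmt"

// catalogSource is a hypothetical stand-in for the cached API object.
type catalogSource struct {
	Annotations map[string]string
}

// annotationsShared mirrors the buggy behavior: it hands back the cached
// object's own map, so any write to it mutates the lister cache and can
// race with other goroutines reading the same object.
func annotationsShared(cs *catalogSource) map[string]string {
	return cs.Annotations
}

// annotationsCopy mirrors the proposed fix: return a fresh copy that the
// caller can mutate freely without touching the cached object.
func annotationsCopy(cs *catalogSource) map[string]string {
	out := make(map[string]string, len(cs.Annotations))
	for k, v := range cs.Annotations {
		out[k] = v
	}
	return out
}

func main() {
	cached := &catalogSource{Annotations: map[string]string{"a": "1"}}
	annotationsShared(cached)["extra"] = "x" // write leaks into the cache
	fmt.Println(cached.Annotations["extra"])

	cached = &catalogSource{Annotations: map[string]string{"a": "1"}}
	annotationsCopy(cached)["extra"] = "x" // cached object untouched
	fmt.Println(cached.Annotations["extra"])
}
```

The alternative mentioned above, calling `DeepCopy()` on the catalogSource when constructing the decorator, achieves the same isolation one level earlier.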
This doesn't seem to be a blocker, but we should get a fix in swiftly.
— Additional comment from jiazha@redhat.com on 2022-07-13 05:02:04 UTC —
1. Create a cluster with the fix PR via Cluster Bot.
mac:~ jianzhang$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest True False 126m Cluster version is 4.11.0-0.ci.test-2022-07-13-022646-ci-ln-41fvni2-latest
2. Subscribe to some operators.
mac:~ jianzhang$ oc get sub -A
NAMESPACE NAME PACKAGE SOURCE CHANNEL
default etcd etcd community-operators singlenamespace-alpha
openshift-logging cluster-logging cluster-logging redhat-operators stable
openshift-operators-redhat elasticsearch-operator elasticsearch-operator redhat-operators stable
mac:~ jianzhang$ oc get csv -n openshift-operators-redhat
NAME DISPLAY VERSION REPLACES PHASE
elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded
mac:~ jianzhang$ oc get csv -n openshift-logging
NAME DISPLAY VERSION REPLACES PHASE
cluster-logging.5.4.2 Red Hat OpenShift Logging 5.4.2 Succeeded
elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded
mac:~ jianzhang$ oc get csv -n default
NAME DISPLAY VERSION REPLACES PHASE
elasticsearch-operator.5.4.2 OpenShift Elasticsearch Operator 5.4.2 Succeeded
etcdoperator.v0.9.4 etcd 0.9.4 etcdoperator.v0.9.2 Succeeded
3. Check the OLM catalog-operator pod status.
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
catalog-operator-546db7cdf5-7pldg 1/1 Running 0 145m
collect-profiles-27628110-lr2nv 0/1 Completed 0 30m
collect-profiles-27628125-br8b8 0/1 Completed 0 15m
collect-profiles-27628140-m64gp 0/1 Completed 0 38s
olm-operator-754d7f6f56-26qhw 1/1 Running 0 145m
package-server-manager-77d5cbf696-v9w4p 1/1 Running 0 145m
packageserver-6884994d98-2smtw 1/1 Running 0 143m
packageserver-6884994d98-5d7jg 1/1 Running 0 143m
mac:~ jianzhang$ oc logs catalog-operator-546db7cdf5-7pldg --previous
Error from server (BadRequest): previous terminated container "catalog-operator" in pod "catalog-operator-546db7cdf5-7pldg" not found
No terminated container; the catalog-operator works well. Verifying this bug.
— Additional comment from aos-team-art-private@redhat.com on 2022-07-13 22:50:04 UTC —
Elliott changed bug status from MODIFIED to ON_QA.
This bug is expected to ship in the next 4.12 release.
— Additional comment from jiazha@redhat.com on 2022-07-18 07:23:08 UTC —
Changed the status to VERIFIED based on comment 10.