- Bug
- Resolution: Done
- Undefined
- ACM 2.8.0
- False
- None
- False
- ACM Sprint 26, ACM Sprint 27
- Moderate
- No
Description of problem:
While deploying 3000+ SNOs with ACM and ZTP, the multiclusterhub-operator was crash-looping by the end of the test.
Test:
- Attempted to install 3591 SNOs
- Successfully installed 3114
- Managed 2419 (Separate issue with the managedcluster-import-controller OOMing)
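For scale, the three counts above can be turned into rates. This is only a small arithmetic sketch; the variable names are illustrative and the numbers are the ones reported in the test summary.

```python
# Counts copied from the test summary above.
attempted = 3591   # SNO installs attempted
installed = 3114   # SNO installs that succeeded
managed = 2419     # installed clusters that ended up managed

# Share of clusters surviving each stage.
install_rate = installed / attempted
managed_of_installed = managed / installed
managed_of_attempted = managed / attempted

print(f"installed / attempted: {install_rate:.1%}")
print(f"managed / installed:   {managed_of_installed:.1%}")
print(f"managed / attempted:   {managed_of_attempted:.1%}")
```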
Version-Release number of selected component (if applicable):
2.7.0-DOWNSTREAM-2022-11-25-10-53-02
OCP 4.11.13 (Hub and managedclusters)
How reproducible:
Steps to Reproduce:
- ...
Actual results:
Expected results:
Additional info:
# oc get po -n open-cluster-management multiclusterhub-operator-7c6cd849db-tg7hd
NAME                                        READY   STATUS    RESTARTS          AGE
multiclusterhub-operator-7c6cd849db-tg7hd   1/1     Running   133 (6m13s ago)   28h

# oc describe po -n open-cluster-management multiclusterhub-operator-7c6cd849db-tg7hd
Name:         multiclusterhub-operator-7c6cd849db-tg7hd
Namespace:    open-cluster-management
Priority:     0
Node:         e27-h03-000-r650/fc00:1002::6
Start Time:   Thu, 01 Dec 2022 14:20:42 +0000
Labels:       name=multiclusterhub-operator
              pod-template-hash=7c6cd849db
Annotations:  alm-examples: [{"apiVersion": "operator.open-cluster-management.io/v1", "kind": "MultiClusterHub", "metadata": {"name": "multiclusterhub", "namespace": ...
              capabilities: Seamless Upgrades
              categories: Integration & Delivery
              certified: true
              createdAt: 2022-11-25T19:49:44Z
              description: Advanced provisioning and management of OpenShift and Kubernetes clusters
              k8s.ovn.org/pod-networks: {"default":{"ip_addresses":["fd01:0:0:1::6b/64"],"mac_address":"0a:58:cc:87:fa:92","gateway_ips":["fd01:0:0:1::1"],"ip_address":"fd01:0:0:...
              k8s.v1.cni.cncf.io/network-status: [{ "name": "ovn-kubernetes", "interface": "eth0", "ips": [ "fd01:0:0:1::6b" ], "mac": "0a:58:cc:87:fa:92", "default": true, "dns": {} }]
              k8s.v1.cni.cncf.io/networks-status: [{ "name": "ovn-kubernetes", "interface": "eth0", "ips": [ "fd01:0:0:1::6b" ], "mac": "0a:58:cc:87:fa:92", "default": true, "dns": {} }]
              olm.operatorGroup: default
              olm.operatorNamespace: open-cluster-management
              olm.skipRange: >=2.6.0 <2.7.0
              olm.targetNamespaces: open-cluster-management
              openshift.io/scc: restricted-v2
              operatorframework.io/initialization-resource: {"apiVersion":"operator.open-cluster-management.io/v1", "kind":"MultiClusterHub","metadata":{"name":"multiclusterhub","namespace":"open-cl...
              operatorframework.io/properties: {"properties":[{"type":"olm.gvk","value":{"group":"apps.open-cluster-management.io","kind":"PlacementRule","version":"v1"}},{"type":"olm.g...
              operatorframework.io/suggested-namespace: open-cluster-management
              operators.openshift.io/infrastructure-features: ["disconnected", "proxy-aware", "fips"]
              operators.openshift.io/valid-subscription: ["OpenShift Platform Plus", "Red Hat Advanced Cluster Management for Kubernetes"]
              operators.operatorframework.io/internal-objects: ["observabilityaddons.observability.open-cluster-management.io", "observatoria.core.observatorium.io"]
              seccomp.security.alpha.kubernetes.io/pod: runtime/default
              support: Red Hat
Status:       Running
IP:           fd01:0:0:1::6b
IPs:
  IP:           fd01:0:0:1::6b
Controlled By:  ReplicaSet/multiclusterhub-operator-7c6cd849db
Containers:
  multiclusterhub-operator:
    Container ID:  cri-o://e6a51a1626b4bde5c018c05b49fc854aa60da9ecd8658db531688d291267fdb4
    Image:         registry.redhat.io/rhacm2/multiclusterhub-rhel8@sha256:bc70cef3730f6cd0b5f0e39f211f717f746b777bfbf85c758641ad7c2ab22a7d
    Image ID:      registry.redhat.io/rhacm2/multiclusterhub-rhel8@sha256:231974e2c97a2047053de31aae355c516bb32bf29688d743e7b8966b58f2ca9b
    Port:          <none>
    Host Port:     <none>
    Command:
      multiclusterhub-operator
    Args:
      --leader-elect
    State:          Running
      Started:      Fri, 02 Dec 2022 19:02:13 +0000
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 02 Dec 2022 18:54:40 +0000
      Finished:     Fri, 02 Dec 2022 18:57:08 +0000
    Ready:          True
    Restart Count:  133
    Limits:
      cpu:     100m
      memory:  4Gi
    Requests:
      cpu:      100m
      memory:   256Mi
    Liveness:   http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:  http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      CRDS_PATH: /usr/local/templates/crds
      TEMPLATES_PATH: /usr/local/templates
      POD_NAMESPACE: open-cluster-management (v1:metadata.namespace)
      OPERAND_IMAGE_GOVERNANCE_POLICY_ADDON_CONTROLLER: registry.redhat.io/rhacm2/acm-governance-policy-addon-controller-rhel8@sha256:da7b404cb64dcbe022f211eab01ee5a22440a4d52024aad3237059053ab23508
      OPERAND_IMAGE_GOVERNANCE_POLICY_FRAMEWORK_ADDON: registry.redhat.io/rhacm2/acm-governance-policy-framework-addon-rhel8@sha256:ea028200d784c9c77b29c558708b2435743b6228aa96e6b4ca138267bcfccfb2
      OPERAND_IMAGE_ACM_MUST_GATHER: registry.redhat.io/rhacm2/acm-must-gather-rhel8@sha256:37fe5af5ae50c835f51863f52b2fabb5f4af7319d8e0d474a7ae5e585bf237b4
      OPERAND_IMAGE_PROMETHEUS_CONFIG_RELOADER: registry.redhat.io/rhacm2/acm-prometheus-config-reloader-rhel8@sha256:7341066cbf30ce79a4400810b0350c7b3bd34948df7fe25eca822cafb2ebb878
      OPERAND_IMAGE_PROMETHEUS_OPERATOR: registry.redhat.io/rhacm2/acm-prometheus-rhel8@sha256:79e945328bdf18d2c2487d91388fc0c1ea9548e68d2b150adead6f4930ca625e
      OPERAND_IMAGE_SEARCH_INDEXER: registry.redhat.io/rhacm2/acm-search-indexer-rhel8@sha256:4414569b219323d47fc4485c1cd4325745bd2e7706dad9728eb85a948ffc6afb
      OPERAND_IMAGE_SEARCH_V2_API: registry.redhat.io/rhacm2/acm-search-v2-api-rhel8@sha256:f87f7e273cabe09e56b521ab8c319a9d53c972593a86bc357d11f80bbdb4b573
      OPERAND_IMAGE_SEARCH_V2_OPERATOR: registry.redhat.io/rhacm2/acm-search-v2-rhel8@sha256:11c159f9e2465cce454ee4935868f87c897c8e36d642c47b98d738f80c2b415c
      OPERAND_IMAGE_VOLSYNC_ADDON_CONTROLLER: registry.redhat.io/rhacm2/acm-volsync-addon-controller-rhel8@sha256:39b76baacf5536fefb008436c6d17e91b848210e3f82fb27009d717932b32797
      OPERAND_IMAGE_CERT_POLICY_CONTROLLER: registry.redhat.io/rhacm2/cert-policy-controller-rhel8@sha256:0ec668c671ce8a7564fd195edd3504488182514bb6560d069a0485162cb2aa2e
      OPERAND_IMAGE_CLUSTER_BACKUP_CONTROLLER: registry.redhat.io/rhacm2/cluster-backup-rhel8-operator@sha256:583a0b67164a41098288becc926cf5cbfa1fb7b38227159d371faad15cc20688
      OPERAND_IMAGE_CONFIG_POLICY_CONTROLLER: registry.redhat.io/rhacm2/config-policy-controller-rhel8@sha256:3b56ae7f6d0e5758420df574f5004e472a7ae800dc11d6df477e8841275a4e83
      OPERAND_IMAGE_CONSOLE: registry.redhat.io/rhacm2/console-rhel8@sha256:1589435ff1725bea8382ee7506a79ce2eeb1eb02df0a4dde123769beae0128dd
      OPERAND_IMAGE_ENDPOINT_MONITORING_OPERATOR: registry.redhat.io/rhacm2/endpoint-monitoring-rhel8-operator@sha256:a9f355ba541c4741c924b676d0dfc213555f6ce235172fae43b9693f167c30f6
      OPERAND_IMAGE_GOVERNANCE_POLICY_PROPAGATOR: registry.redhat.io/rhacm2/governance-policy-propagator-rhel8@sha256:c5a03354d9f3e80114780ab48b2c082c7bf86928ce6c6f2763da364c0fac6df7
      OPERAND_IMAGE_GRAFANA: registry.redhat.io/rhacm2/acm-grafana-rhel8@sha256:65b93165da8a40e6a445185dd3514bfc70e0a5bae1889fd11738ae3651cbea49
      OPERAND_IMAGE_GRAFANA_DASHBOARD_LOADER: registry.redhat.io/rhacm2/grafana-dashboard-loader-rhel8@sha256:d0eb36b75a7f6e760c9fc27453e26705775877881f7d042ffd2c86648c1cf1b0
      OPERAND_IMAGE_IAM_POLICY_CONTROLLER: registry.redhat.io/rhacm2/iam-policy-controller-rhel8@sha256:ae0b61da1390a1700d17d74e16120267d92a1b9638dd8447ebef708ec4956b58
      OPERAND_IMAGE_INSIGHTS_CLIENT: registry.redhat.io/rhacm2/insights-client-rhel8@sha256:5ef458d05ab4f1e420913eb4b52275f0c40f821f36538343e859814337429d11
      OPERAND_IMAGE_INSIGHTS_METRICS: registry.redhat.io/rhacm2/insights-metrics-rhel8@sha256:047937a09c0020ec86d4f17208602104924b0c11486b87a1ff63096c819f1690
      OPERAND_IMAGE_KLUSTERLET_ADDON_CONTROLLER: registry.redhat.io/rhacm2/klusterlet-addon-controller-rhel8@sha256:8e80689ddd45412b0995c160057dd10ed67a8bf98ac026f50895808c2c68c878
      OPERAND_IMAGE_KUBE_RBAC_PROXY: registry.redhat.io/rhacm2/kube-rbac-proxy-rhel8@sha256:f445bacd703a2cbfb21e8654be0645ab7cbad6d6f1c792e8dd090663c85160c9
      OPERAND_IMAGE_KUBE_STATE_METRICS: registry.redhat.io/rhacm2/kube-state-metrics-rhel8@sha256:0eceb51d96096562ae210cadb261941c875a8426392f2b86fd2455989cf392b1
      OPERAND_IMAGE_MANAGEMENT_INGRESS: registry.redhat.io/rhacm2/management-ingress-rhel8@sha256:0be7c37a624bd176cb6aac7d4821f15afc55038241134280b2e09085699cc92d
      OPERAND_IMAGE_MEMCACHED: registry.redhat.io/rhacm2/memcached-rhel8@sha256:f6bde6552e823788224ed359703ddb4b72c9e79f3011b79519c2595fee2ef38a
      OPERAND_IMAGE_MEMCACHED_EXPORTER: registry.redhat.io/rhacm2/memcached-exporter-rhel8@sha256:1eee3f3a7b7d4ff4bca63d1e8b73f59a086049c71f33db8d2e6e58b5b7cbc922
      OPERAND_IMAGE_METRICS_COLLECTOR: registry.redhat.io/rhacm2/metrics-collector-rhel8@sha256:81437a39c467fe7d92afb429d46cb87a99d2aa8e39c1749a429df0177025647c
      OPERAND_IMAGE_MULTICLOUD_INTEGRATIONS: registry.redhat.io/rhacm2/multicloud-integrations-rhel8@sha256:91096e12f730173940e8e5ad0ba029ebdbdd040f87e33a5978859e9c88ce9130
      OPERAND_IMAGE_MULTICLUSTER_OBSERVABILITY_OPERATOR: registry.redhat.io/rhacm2/multicluster-observability-rhel8-operator@sha256:afbd1686dd91235e05f12790b0a7dd9aaed19602662b040bc9672be94be61ac0
      OPERAND_IMAGE_MULTICLUSTER_OPERATORS_APPLICATION: registry.redhat.io/rhacm2/multicluster-operators-application-rhel8@sha256:c35e9c2d739e22f4bbf0b1185523556681f248d9549a90a4e39f86c237e31f1e
      OPERAND_IMAGE_MULTICLUSTER_OPERATORS_CHANNEL: registry.redhat.io/rhacm2/multicluster-operators-channel-rhel8@sha256:e10dd7a7b0ba8852bf949115fc1f21d249901c5770ee6128703568a737963c14
      OPERAND_IMAGE_MULTICLUSTER_OPERATORS_SUBSCRIPTION: registry.redhat.io/rhacm2/multicluster-operators-subscription-rhel8@sha256:cbe14410b02784110fb66a6226fc9e9de26909500d38a8ce3489f8dc3e24c1ea
      OPERAND_IMAGE_MULTICLUSTERHUB_REPO: registry.redhat.io/rhacm2/multiclusterhub-repo-rhel8@sha256:64419500c522715cc135740d8b903fde1bfbe82e4a2bb200b0a93f246cb955f8
      OPERAND_IMAGE_NODE_EXPORTER: registry.redhat.io/rhacm2/node-exporter-rhel8@sha256:f9c27b68a55edf80a64f3f6d5b26ef780595392d001f2c72f3b92fbaa6c813df
      OPERAND_IMAGE_OBSERVATORIUM: registry.redhat.io/rhacm2/observatorium-rhel8@sha256:548555ca1b6ad5a8cc6b11428de8a3c444e59bf5cf76c6420ba5b62d03216128
      OPERAND_IMAGE_OBSERVATORIUM_OPERATOR: registry.redhat.io/rhacm2/observatorium-rhel8-operator@sha256:4fca25a41a48644b8d062438f15ee7fb8b390d85be8d852c746605c57c398667
      OPERAND_IMAGE_PROMETHEUS: registry.redhat.io/rhacm2/prometheus-rhel8@sha256:d1892a980c18f17a796310d09d3e788b9e183252cebc3f21f7c3e777714645e2
      OPERAND_IMAGE_PROMETHEUS_ALERTMANAGER: registry.redhat.io/rhacm2/prometheus-alertmanager-rhel8@sha256:20c4e4f7732697bf533cde513ad4808111887af5afe83b8199ac38f308fb25cf
      OPERAND_IMAGE_RBAC_QUERY_PROXY: registry.redhat.io/rhacm2/rbac-query-proxy-rhel8@sha256:0636ef27a87daec0cf096c7323f86c0cf3e918a463142cffde0647e6847033ca
      OPERAND_IMAGE_SEARCH_COLLECTOR: registry.redhat.io/rhacm2/search-collector-rhel8@sha256:58bec5d98531921e5ab3744375dbf07899f85dd5ee5a6d3c167cbac2557e26a8
      OPERAND_IMAGE_SUBMARINER_ADDON: registry.redhat.io/rhacm2/submariner-addon-rhel8@sha256:7fc05a4cc7ce472bd03c56946d36234cde09bbef5adc1e08dc09308b781cdf44
      OPERAND_IMAGE_THANOS: registry.redhat.io/rhacm2/thanos-rhel8@sha256:f5fc7e4351a8042aa53659a2393441f6a92e5e8d2023a0a5b0d7a773d2c24889
      OPERAND_IMAGE_THANOS_RECEIVE_CONTROLLER: registry.redhat.io/rhacm2/thanos-receive-controller-rhel8@sha256:692685d4aa8e10b1845f0d7680a1710a9c082b551b438082b41fd4717b6c4f37
      OPERAND_IMAGE_MULTICLUSTERHUB_OPERATOR: registry.redhat.io/rhacm2/multiclusterhub-rhel8@sha256:bc70cef3730f6cd0b5f0e39f211f717f746b777bfbf85c758641ad7c2ab22a7d
      OPERAND_IMAGE_OAUTH_PROXY_48: registry.redhat.io/openshift4/ose-oauth-proxy@sha256:9221482cdee0e989f2f8211ae0cb8aede828cee2e5aed3776b5a773d12910d47
      OPERAND_IMAGE_OAUTH_PROXY_49_AND_UP: registry.redhat.io/openshift4/ose-oauth-proxy@sha256:e94ab35f4e71b6b91b1c00381446588ccc7d075c3e5e6ab2362b9efa5f987fe9
      OPERAND_IMAGE_PROMETHEUS-ALERTMANAGER: registry.redhat.io/openshift4/ose-prometheus-alertmanager@sha256:6a6356ed2f670802e0a0e27fe00cd59378eb62a2966baa828a8ac86fe7efc29b
      OPERAND_IMAGE_PROMETHEUS-CONFIG-RELOADER: registry.redhat.io/openshift4/ose-configmap-reloader@sha256:a8463accd3b01ff25023973f29f538d3273b1f50156b2b84c3a8894f1b89fc43
      OPERAND_IMAGE_POSTGRESQL_13: registry.redhat.io/rhel8/postgresql-13@sha256:826a1a75f4186f36217b5c1f1aa270838c4b55b1021d33a4876566bfc3b8e629
      OPERATOR_VERSION: 2.7.0
      OPERATOR_PACKAGE: advanced-cluster-management
      OPERATOR_CONDITION_NAME: advanced-cluster-management.v2.7.0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-slj8s (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-slj8s:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   Pulled   54m (x126 over 16h)     kubelet  Container image "registry.redhat.io/rhacm2/multiclusterhub-rhel8@sha256:bc70cef3730f6cd0b5f0e39f211f717f746b777bfbf85c758641ad7c2ab22a7d" already present on machine
  Warning  BackOff  4m53s (x3144 over 16h)  kubelet  Back-off restarting failed container
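How long the previous container instance survived can be read from the "Last State: Terminated" timestamps in the describe output above. A minimal Python sketch, assuming only those two timestamps (the variable names are illustrative):

```python
from datetime import datetime

# Timestamp layout used by `oc describe`, e.g. "Fri, 02 Dec 2022 18:54:40 +0000".
FMT = "%a, %d %b %Y %H:%M:%S %z"

# Values copied from the "Last State: Terminated" block above.
started = datetime.strptime("Fri, 02 Dec 2022 18:54:40 +0000", FMT)
finished = datetime.strptime("Fri, 02 Dec 2022 18:57:08 +0000", FMT)

# Lifetime of the last failed container instance, in seconds.
run_seconds = (finished - started).total_seconds()
print(f"last container run lasted {run_seconds:.0f}s before exiting with code 1")
```

The container ran for roughly two and a half minutes before exiting, so each restart fails at about the same point during startup rather than at a random time.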
Errors in the log:
1.6700054871616313e+09 INFO Starting EventSource {"controller": "multiclusterhub", "controllerGroup": "operator.open-cluster-management.io", "controllerKind": "MultiClusterHub", "source": "kind source: *v1.ClusterVersion"}
1.6700054871616352e+09 INFO Starting Controller {"controller": "multiclusterhub", "controllerGroup": "operator.open-cluster-management.io", "controllerKind": "MultiClusterHub"}
W1202 18:25:36.639728 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 521; INTERNAL_ERROR; received from peer
I1202 18:25:36.639791 1 trace.go:205] Trace[1683922013]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167 (02-Dec-2022 18:24:30.767) (total time: 65872ms):
Trace[1683922013]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 521; INTERNAL_ERROR; received from peer 65872ms (18:25:36.639)
Trace[1683922013]: [1m5.872744554s] [1m5.872744554s] END
E1202 18:25:36.639803 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 521; INTERNAL_ERROR; received from peer
I1202 18:25:47.541367 1 trace.go:205] Trace[757659167]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167 (02-Dec-2022 18:24:47.238) (total time: 60302ms):
Trace[757659167]: ---"Objects listed" error:<nil> 60100ms (18:25:47.338)
Trace[757659167]: [1m0.30278354s] [1m0.30278354s] END
W1202 18:26:42.343707 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 655; INTERNAL_ERROR; received from peer
I1202 18:26:42.343752 1 trace.go:205] Trace[1257424412]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167 (02-Dec-2022 18:25:37.837) (total time: 64506ms):
Trace[1257424412]: ---"Objects listed" error:stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 655; INTERNAL_ERROR; received from peer 64506ms (18:26:42.343)
Trace[1257424412]: [1m4.506520969s] [1m4.506520969s] END
E1202 18:26:42.343763 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.24.1/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: stream error when reading response body, may be caused by closed connection. Please retry. Original error: stream error: stream ID 655; INTERNAL_ERROR; received from peer
1.6700056072374518e+09 ERROR Could not wait for Cache to sync {"controller": "multiclusterhub", "controllerGroup": "operator.open-cluster-management.io", "controllerKind": "MultiClusterHub", "error": "failed to wait for multiclusterhub caches to sync: timed out waiting for cache to be synced"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:215
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:241
sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1
    /remote-source/deps/gomod/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/manager/runnable_group.go:219
1.6700056072376776e+09 INFO Stopping and waiting for non leader election runnables
1.6700056072376957e+09 INFO Stopping and waiting for leader election runnables
1.6700056072377238e+09 INFO Stopping and waiting for caches
1.6700056072378387e+09 INFO Stopping and waiting for webhooks
1.6700056072378528e+09 INFO Wait completed, proceeding to shutdown the manager
1.6700056072379096e+09 ERROR setup problem running manager {"error": "failed to wait for multiclusterhub caches to sync: timed out waiting for cache to be synced"}
main.main
    /remote-source/app/main.go:195
runtime.main
    /usr/lib/golang/src/runtime/proc.go:25
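The timing in the log can be checked with a short calculation. The Python sketch below uses only values copied from the log above. The gap between "Starting Controller" and the cache sync error is almost exactly 120 seconds. That appears to match the two-minute default that controller-runtime documents for `CacheSyncTimeout`; this assumes the operator does not override it, which the log does not confirm. The Secret list attempts each took more than 60 seconds, and two of the three ended in a stream error.

```python
from datetime import datetime, timezone

# Epoch-second timestamps copied from the operator log lines above.
controller_start = 1.6700054871616352e+09   # "Starting Controller"
cache_sync_error = 1.6700056072374518e+09   # "Could not wait for Cache to sync"

# "total time" values (ms) from the three Reflector ListAndWatch traces.
secret_list_ms = [65872, 60302, 64506]


def to_utc(ts: float) -> datetime:
    """Convert an epoch-second log timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Time the controller waited for its caches before giving up.
gap_seconds = cache_sync_error - controller_start

print(to_utc(controller_start).strftime("%H:%M:%S"), "controller started")
print(to_utc(cache_sync_error).strftime("%H:%M:%S"), "cache sync timed out")
print(f"waited {gap_seconds:.1f}s for caches to sync")
print(f"slowest Secret list attempt: {max(secret_list_ms) / 1000:.1f}s")
```

If the timeout is indeed the two-minute default, a single list attempt consumes more than half of the window, so the crash loop would be expected to continue for as long as listing Secrets stays this slow.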
- is cloned by: ACM-2551 upgrade from 2.6 to 2.7 multiclusterhub-operator crash loops - timed out waiting for cache to be synced (Closed)