-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
4.16.z
-
Moderate
-
None
-
False
-
Description of problem:
We are deploying 3500+ SNOs for the ACM 2.11 ZTP Perf/Scale test. When deploying 4.16.2, 7 SNOs failed to install with multiple operator failures; an example is shown below. This appears to be new in 4.16.2: with 4.16.1 we mainly observed the API Down issue, for which we opened a separate issue.

The AI install log and must-gather were collected and can be accessed here:
https://drive.google.com/drive/u/0/folders/1t1CTgipWq3yeyIi2Tl_iP-KvMQFCrEL_?ths=true
Redhatters should have viewer permission for the folder.

# oc --kubeconfig /root/hv-vm/kc/vm00962/kubeconfig get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.16.2    False       False         True       10h     OAuthServerDeploymentAvailable: no oauth-openshift.openshift-authentication pods available on any node....
config-operator                            4.16.2    True        False         False      10h
dns                                        4.16.2    True        False         False      10h
etcd                                       4.16.2    True        False         False      10h
ingress                                    4.16.2    True        False         False      10h
kube-apiserver                             4.16.2    True        False         False      10h
kube-controller-manager                    4.16.2    True        False         True       10h     GarbageCollectorDegraded: error fetching rules: Get "https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules": dial tcp: lookup thanos-querier.openshift-monitoring.svc on [fd02::a]:53: server misbehaving
kube-scheduler                             4.16.2    True        False         False      10h
kube-storage-version-migrator              4.16.2    True        False         False      10h
machine-approver                           4.16.2    True        False         False      10h
machine-config                             4.16.2    True        False         False      10h
monitoring                                           False       True          True       61s     UpdatingPrometheusOperator: reconciling Prometheus Operator Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator: context deadline exceeded: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
network                                    4.16.2    True        False         False      10h
node-tuning                                4.16.2    True        False         False      10h
openshift-apiserver                        4.16.2    True        False         False      10h
openshift-controller-manager               4.16.2    True        False         False      10h
operator-lifecycle-manager                 4.16.2    True        False         False      10h
operator-lifecycle-manager-catalog         4.16.2    True        False         False      10h
operator-lifecycle-manager-packageserver   4.16.2    True        False         False      10h
service-ca                                 4.16.2    True        False         False      10h
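
For triaging the failed SNOs in bulk, the same operator status could be checked programmatically rather than by reading each table. The following is only a minimal sketch, not part of the test tooling: it assumes the output of `oc get clusteroperators -o json` has already been loaded into a dict, and the function name `unhealthy_operators` is hypothetical.

```python
from typing import List, Tuple


def unhealthy_operators(cluster_operators: dict) -> List[Tuple[str, str, str]]:
    """Return (operator, condition, message) for each operator that is
    not Available or is Degraded.

    `cluster_operators` is assumed to be the parsed JSON list object
    produced by `oc get clusteroperators -o json`.
    """
    findings = []
    for item in cluster_operators.get("items", []):
        name = item["metadata"]["name"]
        # Index the status conditions by type (Available, Progressing, Degraded, ...).
        conditions = {
            c["type"]: c for c in item.get("status", {}).get("conditions", [])
        }
        available = conditions.get("Available", {})
        degraded = conditions.get("Degraded", {})
        # A missing Available condition is reported as Unknown.
        if available.get("status") != "True":
            status = available.get("status", "Unknown")
            findings.append((name, f"Available={status}", available.get("message", "")))
        if degraded.get("status") == "True":
            findings.append((name, "Degraded=True", degraded.get("message", "")))
    return findings
```

One possible use is to save the JSON for each failed SNO (for example from the kubeconfig path shown above), load it with `json.load`, and print the returned tuples per cluster to see whether the same operators fail on all 7 SNOs.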
Version-Release number of selected component (if applicable): 4.16.2
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info: