- Bug
- Resolution: Unresolved
- Normal
- 4.15.z
- Quality / Stability / Reliability
- False
- Hypershift Sprint 258, Hypershift Sprint 259
- 2
Description of problem:
Performing a machinepool upgrade on a 503-node ROSA HCP cluster from OCP 4.15.17 --> OCP 4.15.22, with maxUnavailable set to 50% while the cluster is pre-loaded with the cluster-density-v2 workload (https://kube-burner.github.io/kube-burner-ocp/latest/#cluster-density-v2), the upgrade stalls and does not complete.
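For context, maxUnavailable=50% on a 503-node machine pool allows roughly 251 workers (50% of 503, rounded down, assuming the percentage is evaluated against the pool's replica count) to be cordoned and drained in parallel, so about half of the cluster's workload capacity is being rescheduled at the same time during the rolling upgrade.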
Version-Release number of selected component (if applicable):
Control-Plane: 4.15.22
Machinepool: Upgrading from 4.15.17 --> 4.15.22
Steps to Reproduce:
1. kube-burner-ocp cluster-density-v2 --iterations=4509 --churn=false --gc=false
2. rosa edit machinepool --max-surge=0% --max-unavailable=50% --cluster=2cs9mdk9eeopmhqf5f69n48ojo8qofc0 <worker-0|worker-1|worker-2>
3. rosa upgrade machinepool <worker-0|worker-1|worker-2> -y -c 2cs9mdk9eeopmhqf5f69n48ojo8qofc0 --version 4.15.22
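To watch the rollout while it runs (illustrative commands, not part of the original reproduction steps), node replacement and the machine pool version can be checked with something like:
$ oc get nodes -o wide
$ rosa list machinepools -c 2cs9mdk9eeopmhqf5f69n48ojo8qofc0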
Actual results:
Upgrade does not progress even after 3 hours.
Expected results:
Upgrade completes successfully.
Additional info:
=======================================================
$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console                                    4.15.22   True        False         False      14h
csi-snapshot-controller                    4.15.22   True        False         False      13h
dns                                        4.15.22   True        True          False      14h     DNS "default" reports Progressing=True: "Have 488 available node-resolver pods, want 493."
image-registry                             4.15.22   True        True          False      14h     Progressing: The deployment has not completed...
ingress                                    4.15.22   True        True          False      4h22m   ingresscontroller "default" is progressing: IngressControllerProgressing: One or more status conditions indicate progressing: DeploymentRollingOut=True (DeploymentRollingOut: Waiting for router deployment rollout to finish: 1 of 2 updated replica(s) are available......
insights                                   4.15.22   True        False         False      14h
kube-apiserver                             4.15.22   True        False         False      14h
kube-controller-manager                    4.15.22   True        False         False      14h
kube-scheduler                             4.15.22   True        False         False      14h
kube-storage-version-migrator              4.15.22   True        False         False      6h24m
monitoring                                 4.15.22   Unknown     True          Unknown    4h2m    Rolling out the stack.
network                                    4.15.22   True        True          True       14h     DaemonSet "/openshift-multus/multus" rollout is not making progress - pod multus-2hxt5 is in CrashLoopBackOff State...
node-tuning                                4.15.22   True        True          False      3h52m   Waiting for 93/493 Profiles to be applied
openshift-apiserver                        4.15.22   True        False         False      14h
openshift-controller-manager               4.15.22   True        False         False      14h
openshift-samples                          4.15.22   True        False         False      13h
operator-lifecycle-manager                 4.15.22   True        False         False      14h
operator-lifecycle-manager-catalog         4.15.22   True        False         False      14h
operator-lifecycle-manager-packageserver   4.15.22   True        False         False      14h
service-ca                                 4.15.22   True        False         False      14h
storage                                    4.15.22   True        True          False      13h     AWSEBSCSIDriverOperatorCRProgressing: AWSEBSDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods
$
=======================================================
$ oc logs etcd-2 -c etcd
{"level":"warn","ts":"2024-08-01T12:49:27.855883Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"47.807µs","request":"header:<ID:2186938574361351123 username:\"etcd-client\" auth_revision:1 > txn:<compare:<target:MOD key:\"/kubernetes.io/pods/cluster-density-v2-3069/client-2-84b8b6777-mt8c2\" mod_revision:9443487 > success:<request_put:<key:\"/kubernetes.io/pods/cluster-density-v2-3069/client-2-84b8b6777-mt8c2\" value_size:8947 >> failure:<request_range:<key:\"/kubernetes.io/pods/cluster-density-v2-3069/client-2-84b8b6777-mt8c2\" > >>","response":"size:20","error":"etcdserver: no space"}
$
=======================================================
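The "etcdserver: no space" error above generally means the etcd backend database has reached its space quota and a NOSPACE alarm is active, which rejects all further writes (including the pod and node updates the machinepool rollout depends on) until the keyspace is compacted/defragmented and the alarm is disarmed. A minimal way to confirm this, assuming access to the hosted control plane's etcd pods (for ROSA HCP these run on the management cluster, not in the guest cluster), is:
$ oc exec etcd-2 -c etcd -- etcdctl endpoint status -w table   # reports DB size and any active alarms
$ oc exec etcd-2 -c etcd -- etcdctl alarm list                 # shows "alarm:NOSPACE" when the quota is exhausted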