Bug
Resolution: Obsolete
4.14, 4.14.z
Quality / Stability / Reliability
Important
Description of problem:
I'm testing 4.14 z-stream rollback by installing 4.14.0, upgrading to 4.14.0-0.nightly-2024-03-13-015516, and then rolling back to 4.14.0. During the rollback to 4.14.0, the cluster gets stuck with etcd and kube-apiserver degraded.
# oc get node
NAME                                        STATUS                        ROLES                  AGE   VERSION
ip-10-0-58-12.us-east-2.compute.internal    NotReady,SchedulingDisabled   control-plane,master   11h   v1.27.11+d8e449a
ip-10-0-62-170.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker                 10h   v1.27.11+d8e449a
ip-10-0-64-20.us-east-2.compute.internal    Ready                         worker                 10h   v1.27.11+d8e449a
ip-10-0-68-151.us-east-2.compute.internal   Ready                         control-plane,master   11h   v1.27.11+d8e449a
ip-10-0-79-33.us-east-2.compute.internal    Ready                         control-plane,master   11h   v1.27.11+d8e449a
ip-10-0-79-89.us-east-2.compute.internal    Ready                         worker                 10h   v1.27.11+d8e449a
# oc get co
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.14.0 True False True 2m7s APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-oauth-apiserver ()...
baremetal 4.14.0 True False False 11h
cloud-controller-manager 4.14.0 True False False 11h
cloud-credential 4.14.0 True False False 11h
cluster-autoscaler 4.14.0 True False False 11h
config-operator 4.14.0 True False False 11h
console 4.14.0 True False False 4h5m
control-plane-machine-set 4.14.0 False True False 6h39m Missing 1 available replica(s)
csi-snapshot-controller 4.14.0 True False False 11h
dns 4.14.0 True True False 11h DNS "default" reports Progressing=True: "Have 4 available node-resolver pods, want 6."
etcd 4.14.0 True False True 11h NodeControllerDegraded: The master nodes not ready: node "ip-10-0-58-12.us-east-2.compute.internal" not ready since 2024-03-19 07:39:35 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
image-registry 4.14.0 True True False 10h Progressing: The deployment has not completed...
ingress 4.14.0 True False False 10h
insights 4.14.0 True False False 10h
kube-apiserver 4.14.0 True False True 11h NodeControllerDegraded: The master nodes not ready: node "ip-10-0-58-12.us-east-2.compute.internal" not ready since 2024-03-19 07:39:35 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
kube-controller-manager 4.14.0 True False True 11h NodeControllerDegraded: The master nodes not ready: node "ip-10-0-58-12.us-east-2.compute.internal" not ready since 2024-03-19 07:39:35 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
kube-scheduler 4.14.0 True False True 11h NodeControllerDegraded: The master nodes not ready: node "ip-10-0-58-12.us-east-2.compute.internal" not ready since 2024-03-19 07:39:35 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
kube-storage-version-migrator 4.14.0 True False False 9h
machine-api 4.14.0 True False False 10h
machine-approver 4.14.0 True False False 11h
machine-config 4.14.0-0.nightly-2024-03-13-015516 False True True 6h2m Cluster not available for [{operator 4.14.0-0.nightly-2024-03-13-015516}]: failed to apply machine config daemon manifests: error during waitForDaemonsetRollout: [context deadline exceeded, daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 4, unavailable: 2)]
marketplace 4.14.0 True False False 11h
monitoring 4.14.0 False True True 6h29m reconciling node-exporter DaemonSet failed: updating DaemonSet object failed: waiting for DaemonSetRollout of openshift-monitoring/node-exporter: context deadline exceeded
network 4.14.0 True True False 11h DaemonSet "/openshift-multus/multus-additional-cni-plugins" is not available (awaiting 2 nodes)...
node-tuning 4.14.0 True True False 6h55m Working towards "4.14.0"
openshift-apiserver 4.14.0 True False True 36m APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
openshift-controller-manager 4.14.0 True False False 10h
openshift-samples 4.14.0 True False False 6h56m
operator-lifecycle-manager 4.14.0 True False False 11h
operator-lifecycle-manager-catalog 4.14.0 True False False 11h
operator-lifecycle-manager-packageserver 4.14.0 True False False 64m
service-ca 4.14.0 True False False 11h
storage 4.14.0 True True False 11h AWSEBSCSIDriverOperatorCRProgressing: AWSEBSDriverNodeServiceControllerProgressing: Waiting for DaemonSet to deploy node pods
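To pick the stuck operators out of the listing above, the following is a minimal sketch that parses pasted `oc get co` text. The function name `unhealthy_operators` and the flagging rules (version differs from the rollback target, Available is not True, or Degraded is not False) are assumptions made for illustration, not part of any OpenShift tooling.

```python
def unhealthy_operators(co_output, target_version):
    """Return (name, reasons) for operators that look stuck in `oc get co` text.

    Hypothetical helper: it assumes the default column order
    NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE.
    """
    flagged = []
    for line in co_output.strip().splitlines():
        # The first six columns never contain spaces; MESSAGE may, so cap the split.
        fields = line.split(None, 6)
        if len(fields) < 6 or fields[0] == "NAME":
            continue  # skip the header row and anything malformed
        name, version, available, _, degraded = fields[:5]
        reasons = []
        if version != target_version:
            reasons.append("version=" + version)
        if available != "True":
            reasons.append("Available=" + available)
        if degraded != "False":
            reasons.append("Degraded=" + degraded)
        if reasons:
            flagged.append((name, reasons))
    return flagged
```

Called with the table above and a target of `"4.14.0"`, this would report machine-config as the only operator still on the nightly version, alongside the operators that are unavailable or degraded.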
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2024-03-13-015516
How reproducible:
100%
Steps to Reproduce:
1. Install a 4.14.0 cluster on AWS.
2. Upgrade to 4.14.0-0.nightly-2024-03-13-015516.
3. Roll back to 4.14.0.
Actual results:
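The rollback gets stuck: machine-config remains at 4.14.0-0.nightly-2024-03-13-015516, and nodes ip-10-0-58-12 and ip-10-0-62-170 stay NotReady,SchedulingDisabled.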
Expected results:
The rollback to 4.14.0 completes successfully.
Additional info:
# oc get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-47mw2 6h25m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-48wvf 151m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-492bl 4h53m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-5fx9c 46m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-5sftr 28m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-5szpc 138m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-6n8bx 6h40m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-6t5jk 3h51m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-78fjh 3h18m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-8gnsc 58m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-8j7vb 74m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-8vffh 169m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-9hm5l 5h8m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-9l797 167m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-9tdkw 89m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-blg96 6h10m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-bpkss 6h38m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-btzxc 4h7m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-cjhqr 76m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-d68f7 43m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-dw6d6 4h35m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-fbd7r 6h37m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-fh4km 3h2m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-g5w56 5h6m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-gbz75 30m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-gjs2v 4h20m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-glggp 61m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-h64nk 15m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-jnq8l 3h20m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-jvbr6 4h50m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-k74ht 6h40m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-l5jr9 12m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-lc9cm 5h52m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-mvnhm 154m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-nsvhr 4h38m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-pgc8m 5h37m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-pxm7m 136m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-r5tbv 105m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-r5zss 120m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-r7lcz 6h22m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-rhh8n 4h22m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-rnw4c 3h36m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-rqlj8 107m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-rspvk 3h5m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-tqmb9 5h21m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-vc66v 5h39m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-vtxc7 92m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-wdk2p 5h24m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-ws4wb 123m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-wsnk5 3h49m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-x9vkx 5h55m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-62-170.us-east-2.compute.internal <none> Pending
csr-xpgxj 6h7m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-zf479 3h33m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
csr-zz4hc 4h4m kubernetes.io/kube-apiserver-client-kubelet system:node:ip-10-0-58-12.us-east-2.compute.internal <none> Pending
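The CSR listing above is long, so a per-node tally makes it easier to see which kubelets keep requesting client certificates. The following is a minimal sketch that counts Pending requests per requestor from pasted `oc get csr` text; the function name `pending_csrs_by_requestor` is hypothetical, and the parser assumes the six-column layout shown above.

```python
from collections import Counter


def pending_csrs_by_requestor(csr_output):
    """Count Pending CSRs per requestor in pasted `oc get csr` text.

    Hypothetical helper: it assumes the columns
    NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION.
    """
    counts = Counter()
    for line in csr_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] == "NAME":
            continue  # skip the header row and anything malformed
        requestor, condition = fields[3], fields[-1]
        if condition == "Pending":
            counts[requestor] += 1
    return counts
```

Applied to the listing above, every Pending request would be attributed to one of the two requestors that appear in it, `system:node:ip-10-0-58-12.us-east-2.compute.internal` and `system:node:ip-10-0-62-170.us-east-2.compute.internal`, which are the two nodes reported as NotReady.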
The must-gather is available here: https://drive.google.com/file/d/17gTjlws_S45LMOFSBkTdrFL66PTuEdnx/view?usp=sharing