- Bug
- Resolution: Done-Errata
- Minor
- 4.13
- Moderate
- No
- CLOUD Sprint 234, CLOUD Sprint 235, CLOUD Sprint 236
- 3
- False
- Bug Fix
- Done
Description of problem:
CPMS creates two replacement machines when a master machine is deleted on vSphere. I have to revisit https://issues.redhat.com/browse/OCPBUGS-4297: all the related PRs are merged, but today I hit the issue twice on the template cluster ipi-on-vsphere/versioned-installer-vmc7-ovn-winc-thin_pvc-ci and once on the template cluster ipi-on-vsphere/versioned-installer-vmc7-ovn.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-02-13-235211
How reproducible:
Three times
Steps to Reproduce:
1. On the template cluster ipi-on-vsphere/versioned-installer-vmc7-ovn-winc-thin_pvc-ci, I first hit this after updating all 3 master machines using the RollingUpdate strategy and then deleting a master machine. The redundant machine seems to have been deleted automatically, because only one replacement machine remained when I checked again.
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-djlxv-2   Running                               47m
huliu-vs15b-75tr7-master-h76sp-1   Running                               58m
huliu-vs15b-75tr7-master-wtzb7-0   Running                               70m
huliu-vs15b-75tr7-worker-gzsp9     Running                               4h43m
huliu-vs15b-75tr7-worker-vcqqh     Running                               4h43m
winworker-4cltm                    Running                               4h19m
winworker-qd4c4                    Running                               4h19m
liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-vs15b-75tr7-master-djlxv-2
machine.machine.openshift.io "huliu-vs15b-75tr7-master-djlxv-2" deleted
^C
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-bzd4h-2   Provisioning                          34s
huliu-vs15b-75tr7-master-djlxv-2   Deleting                              48m
huliu-vs15b-75tr7-master-gzhlk-2   Provisioning                          35s
huliu-vs15b-75tr7-master-h76sp-1   Running                               59m
huliu-vs15b-75tr7-master-wtzb7-0   Running                               70m
huliu-vs15b-75tr7-worker-gzsp9     Running                               4h44m
huliu-vs15b-75tr7-worker-vcqqh     Running                               4h44m
winworker-4cltm                    Running                               4h20m
winworker-qd4c4                    Running                               4h20m
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-bzd4h-2   Running                               38m
huliu-vs15b-75tr7-master-h76sp-1   Running                               97m
huliu-vs15b-75tr7-master-wtzb7-0   Running                               108m
huliu-vs15b-75tr7-worker-gzsp9     Running                               5h22m
huliu-vs15b-75tr7-worker-vcqqh     Running                               5h22m
winworker-4cltm                    Running                               4h57m
winworker-qd4c4                    Running                               4h57m

2. Then I changed the strategy to OnDelete, updated all 3 master machines using the OnDelete strategy, and deleted a master machine. This time both replacement machines stayed, and the control-plane-machine-set operator became Degraded.
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-hzhgq-0   Running                               137m
huliu-vs15b-75tr7-master-kj9zf-2   Running                               89m
huliu-vs15b-75tr7-master-kz6cx-1   Running                               59m
huliu-vs15b-75tr7-worker-gzsp9     Running                               7h46m
huliu-vs15b-75tr7-worker-vcqqh     Running                               7h46m
winworker-4cltm                    Running                               7h21m
winworker-qd4c4                    Running                               7h21m
liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-vs15b-75tr7-master-hzhgq-0
machine.machine.openshift.io "huliu-vs15b-75tr7-master-hzhgq-0" deleted
^C
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-hzhgq-0   Deleting                              138m
huliu-vs15b-75tr7-master-kb687-0   Provisioning                          26s
huliu-vs15b-75tr7-master-kj9zf-2   Running                               90m
huliu-vs15b-75tr7-master-kz6cx-1   Running                               60m
huliu-vs15b-75tr7-master-qn6kq-0   Provisioning                          26s
huliu-vs15b-75tr7-worker-gzsp9     Running                               7h47m
huliu-vs15b-75tr7-worker-vcqqh     Running                               7h47m
winworker-4cltm                    Running                               7h22m
winworker-qd4c4                    Running                               7h22m
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15b-75tr7-master-kb687-0   Running                               154m
huliu-vs15b-75tr7-master-kj9zf-2   Running                               4h5m
huliu-vs15b-75tr7-master-kz6cx-1   Running                               3h34m
huliu-vs15b-75tr7-master-qn6kq-0   Running                               154m
huliu-vs15b-75tr7-worker-gzsp9     Running                               10h
huliu-vs15b-75tr7-worker-vcqqh     Running                               10h
winworker-4cltm                    Running                               9h
winworker-qd4c4                    Running                               9h
liuhuali@Lius-MacBook-Pro huali-test % oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      5h13m
baremetal                                  4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
cloud-controller-manager                   4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
cloud-credential                           4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
cluster-autoscaler                         4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
config-operator                            4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
console                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      145m
control-plane-machine-set                  4.13.0-0.nightly-2023-02-13-235211   True        False         True       10h     Observed 1 updated machine(s) in excess for index 0
csi-snapshot-controller                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
dns                                        4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
etcd                                       4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
image-registry                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
ingress                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
insights                                   4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
kube-apiserver                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
kube-controller-manager                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
kube-scheduler                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
kube-storage-version-migrator              4.13.0-0.nightly-2023-02-13-235211   True        False         False      6h18m
machine-api                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
machine-approver                           4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
machine-config                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      3h59m
marketplace                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
monitoring                                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
network                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
node-tuning                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
openshift-apiserver                        4.13.0-0.nightly-2023-02-13-235211   True        False         False      145m
openshift-controller-manager               4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
openshift-samples                          4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
operator-lifecycle-manager                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
operator-lifecycle-manager-catalog         4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
operator-lifecycle-manager-packageserver   4.13.0-0.nightly-2023-02-13-235211   True        False         False      6h7m
service-ca                                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      10h
storage                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      3h57m

3. On the template cluster ipi-on-vsphere/versioned-installer-vmc7-ovn, updating all 3 master machines using the RollingUpdate strategy caused no issue, and deleting a master machine afterwards caused no issue either. I then changed the strategy to OnDelete and replaced the master machines one by one. When I deleted the last one, two replacement machines were created.
liuhuali@Lius-MacBook-Pro huali-test % oc get co
NAME                                       VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      73m
baremetal                                  4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
cloud-controller-manager                   4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
cloud-credential                           4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
cluster-autoscaler                         4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
config-operator                            4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
console                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      129m
control-plane-machine-set                  4.13.0-0.nightly-2023-02-13-235211   True        True          False      9h      Observed 1 replica(s) in need of update
csi-snapshot-controller                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
dns                                        4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
etcd                                       4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
image-registry                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      8h
ingress                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      8h
insights                                   4.13.0-0.nightly-2023-02-13-235211   True        False         False      8h
kube-apiserver                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
kube-controller-manager                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
kube-scheduler                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
kube-storage-version-migrator              4.13.0-0.nightly-2023-02-13-235211   True        False         False      3h22m
machine-api                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
machine-approver                           4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
machine-config                             4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
marketplace                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
monitoring                                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      8h
network                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
node-tuning                                4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
openshift-apiserver                        4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
openshift-controller-manager               4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
openshift-samples                          4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
operator-lifecycle-manager                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
operator-lifecycle-manager-catalog         4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
operator-lifecycle-manager-packageserver   4.13.0-0.nightly-2023-02-13-235211   True        False         False      46m
service-ca                                 4.13.0-0.nightly-2023-02-13-235211   True        False         False      9h
storage                                    4.13.0-0.nightly-2023-02-13-235211   True        False         False      77m
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15a-kjm6h-master-55s4l-1   Running                               84m
huliu-vs15a-kjm6h-master-ppc55-2   Running                               3h4m
huliu-vs15a-kjm6h-master-rqb52-0   Running                               53m
huliu-vs15a-kjm6h-worker-6nbz7     Running                               9h
huliu-vs15a-kjm6h-worker-g84xg     Running                               9h
liuhuali@Lius-MacBook-Pro huali-test % oc delete machine huliu-vs15a-kjm6h-master-ppc55-2
machine.machine.openshift.io "huliu-vs15a-kjm6h-master-ppc55-2" deleted
^C
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15a-kjm6h-master-55s4l-1   Running                               85m
huliu-vs15a-kjm6h-master-cvwzz-2   Provisioning                          27s
huliu-vs15a-kjm6h-master-ppc55-2   Deleting                              3h5m
huliu-vs15a-kjm6h-master-qp9m5-2   Provisioning                          27s
huliu-vs15a-kjm6h-master-rqb52-0   Running                               54m
huliu-vs15a-kjm6h-worker-6nbz7     Running                               9h
huliu-vs15a-kjm6h-worker-g84xg     Running                               9h
liuhuali@Lius-MacBook-Pro huali-test % oc get machine
NAME                               PHASE          TYPE   REGION   ZONE   AGE
huliu-vs15a-kjm6h-master-55s4l-1   Running                               163m
huliu-vs15a-kjm6h-master-cvwzz-2   Running                               79m
huliu-vs15a-kjm6h-master-qp9m5-2   Running                               79m
huliu-vs15a-kjm6h-master-rqb52-0   Running                               133m
huliu-vs15a-kjm6h-worker-6nbz7     Running                               10h
huliu-vs15a-kjm6h-worker-g84xg     Running                               10h
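The duplicate replacements in the listings above can be spotted mechanically. The following is a minimal sketch, not part of CPMS or any OpenShift tooling: it assumes the text of `oc get machine` is available as a string and that master machine names follow the pattern seen in this report, `<cluster>-master-<suffix>-<index>`. The function name `surplus_masters` and the regular expression are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the listings in this report:
# <cluster>-master-<random suffix>-<index>
MASTER_NAME = re.compile(r"^.+-master-[a-z0-9]+-(\d+)$")


def surplus_masters(listing: str) -> dict[str, list[str]]:
    """Return {index: [names]} for every index that has more than one
    master machine not in the Deleting phase."""
    by_index = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and bare prompts
        name, phase = fields[0], fields[1]
        match = MASTER_NAME.match(name)
        # Header, prompt, and worker lines do not match the pattern.
        if match and phase != "Deleting":
            by_index[match.group(1)].append(name)
    return {idx: names for idx, names in by_index.items() if len(names) > 1}


# Sample taken from the second listing of step 3.
sample = """\
NAME PHASE TYPE REGION ZONE AGE
huliu-vs15a-kjm6h-master-55s4l-1 Running 85m
huliu-vs15a-kjm6h-master-cvwzz-2 Provisioning 27s
huliu-vs15a-kjm6h-master-ppc55-2 Deleting 3h5m
huliu-vs15a-kjm6h-master-qp9m5-2 Provisioning 27s
huliu-vs15a-kjm6h-master-rqb52-0 Running 54m
huliu-vs15a-kjm6h-worker-6nbz7 Running 9h
"""
print(surplus_masters(sample))
```

For the sample, only index 2 is reported, with the two Provisioning machines cvwzz-2 and qp9m5-2. The machine being deleted is skipped on purpose, so a normal one-for-one replacement is not flagged.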
Actual results:
CPMS creates two replacement machines when a master machine is deleted, and both replacement machines remain for a long time.
Expected results:
CPMS should create only one replacement machine when a master machine is deleted, or quickly delete the redundant machine.
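"Quickly" is not quantified in this report. As a sketch of how the expectation could be checked, the snippet below converts AGE values from `oc get machine` into seconds and flags an index whose surplus machines have all outlived a grace period. The helper names and the 30-minute grace period are assumptions made for illustration, not values taken from CPMS; the `d` unit is included on the assumption that ages may also be printed in days, although only `h`, `m`, and `s` appear in the listings here.

```python
import re

AGE_PART = re.compile(r"(\d+)([dhms])")
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_seconds(age: str) -> int:
    """Convert an AGE value such as '3h5m', '154m' or '27s' to seconds."""
    parts = AGE_PART.findall(age)
    # Reject anything that is not fully made of <number><unit> pieces.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognised age: {age!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)


def lingering_surplus(ages: list[str], grace_seconds: int) -> bool:
    """True when more than one machine shares an index and even the
    youngest of them is older than the grace period."""
    return len(ages) > 1 and min(age_seconds(a) for a in ages) > grace_seconds


GRACE = 30 * 60  # hypothetical grace period: 30 minutes

# Index 0 in step 2: both replacements still Running at 154m.
print(lingering_surplus(["154m", "154m"], GRACE))
# Index 2 in step 3 right after the delete: both replacements are 27s old.
print(lingering_surplus(["27s", "27s"], GRACE))
```

With these inputs the first call reports a lingering surplus and the second does not, which matches the distinction drawn above between a short-lived extra machine and one that stays.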
Additional info:
Must-gather:
https://drive.google.com/file/d/1aCyFn9okNxRz7nE3Yt_8g6Kx7sPSGCg2/view?usp=sharing for the ipi-on-vsphere/versioned-installer-vmc7-ovn-winc-thin_pvc-ci template cluster
https://drive.google.com/file/d/1i0fWSP0-HqfdV5E0wcNevognLUQKecvl/view?usp=sharing for the ipi-on-vsphere/versioned-installer-vmc7-ovn template cluster
- blocks
- OCPBUGS-13888 CPMS create two replace machines when deleting a master machine on vSphere
- Closed
- is cloned by
- OCPBUGS-13888 CPMS create two replace machines when deleting a master machine on vSphere
- Closed
- links to
- RHSA-2023:5006 OpenShift Container Platform 4.14.z security update