-
Bug
-
Resolution: Done-Errata
-
Normal
-
4.13
-
None
-
No
-
SDN Sprint 234, SDN Sprint 235
-
2
-
Rejected
-
False
-
-
N/A
-
Release Note Not Required
Description of problem:
In a GCP XPN cluster, the EgressIP was NOT migrated to the correct worker after the machine it was assigned to was deleted.
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-03-29-235439
How reproducible:
Always
Steps to Reproduce:
1. Set up a GCP XPN cluster.

2. Scale up two new worker nodes.
   % oc scale --replicas=2 machineset huirwang-0331a-m4mws-worker-c -n openshift-machine-api
   machineset.machine.openshift.io/huirwang-0331a-m4mws-worker-c scaled

3. Wait for the two new worker nodes to become ready.
   % oc get machineset -n openshift-machine-api
   NAME                            DESIRED   CURRENT   READY   AVAILABLE   AGE
   huirwang-0331a-m4mws-worker-a   1         1         1       1           86m
   huirwang-0331a-m4mws-worker-b   1         1         1       1           86m
   huirwang-0331a-m4mws-worker-c   2         2         2       2           86m
   huirwang-0331a-m4mws-worker-f   0         0                             86m
   % oc get nodes
   NAME                                                          STATUS   ROLES                  AGE     VERSION
   huirwang-0331a-m4mws-master-0.c.openshift-qe.internal         Ready    control-plane,master   82m     v1.26.2+dc93b13
   huirwang-0331a-m4mws-master-1.c.openshift-qe.internal         Ready    control-plane,master   82m     v1.26.2+dc93b13
   huirwang-0331a-m4mws-master-2.c.openshift-qe.internal         Ready    control-plane,master   82m     v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-a-hfqsn.c.openshift-qe.internal   Ready    worker                 71m     v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-b-vbqf2.c.openshift-qe.internal   Ready    worker                 71m     v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal   Ready    worker                 8m22s   v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-c-wnm4r.c.openshift-qe.internal   Ready    worker                 8m22s   v1.26.2+dc93b13

4. Label one of the new worker nodes as an egress node.
   % oc label node huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal k8s.ovn.org/egress-assignable=""
   node/huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal labeled

5. Create the egressIP object.
   % oc get egressIP
   NAME         EGRESSIPS     ASSIGNED NODE                                                 ASSIGNED EGRESSIPS
   egressip-1   10.0.32.100   huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal   10.0.32.100

6. Label the second new worker node as an egress node.
   % oc label node huirwang-0331a-m4mws-worker-c-wnm4r.c.openshift-qe.internal k8s.ovn.org/egress-assignable=""
   node/huirwang-0331a-m4mws-worker-c-wnm4r.c.openshift-qe.internal labeled

7. Delete the machine of the assigned egress node.
   % oc delete machines.machine.openshift.io huirwang-0331a-m4mws-worker-c-rhbkr -n openshift-machine-api
   machine.machine.openshift.io "huirwang-0331a-m4mws-worker-c-rhbkr" deleted
   % oc get nodes
   NAME                                                          STATUS   ROLES                  AGE   VERSION
   huirwang-0331a-m4mws-master-0.c.openshift-qe.internal         Ready    control-plane,master   87m   v1.26.2+dc93b13
   huirwang-0331a-m4mws-master-1.c.openshift-qe.internal         Ready    control-plane,master   86m   v1.26.2+dc93b13
   huirwang-0331a-m4mws-master-2.c.openshift-qe.internal         Ready    control-plane,master   87m   v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-a-hfqsn.c.openshift-qe.internal   Ready    worker                 76m   v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-b-vbqf2.c.openshift-qe.internal   Ready    worker                 76m   v1.26.2+dc93b13
   huirwang-0331a-m4mws-worker-c-wnm4r.c.openshift-qe.internal   Ready    worker                 13m   v1.26.2+dc93b13

   W0331 02:48:34.917391 1 egressip_healthcheck.go:162] Could not connect to huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal (10.129.4.2:9107): context deadline exceeded
   W0331 02:48:34.917417 1 default_network_controller.go:903] Node: huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal is not ready, deleting it from egress assignment
   I0331 02:48:34.917590 1 client.go:783] "msg"="transacting operations" "database"="OVN_Northbound" "operations"="[{Op:update Table:Logical_Switch_Port Row:map[options:{GoMap:map[router-port:rtoe-GR_huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal]}] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[where column _uuid == {6efd3c58-9458-44a2-a43b-e70e669efa72}] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUIDName:}]"
   E0331 02:48:34.920766 1 egressip.go:993] Allocator error: EgressIP: egressip-1 assigned to node: huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal which is not reachable, will attempt rebalancing
   E0331 02:48:34.920789 1 egressip.go:997] Allocator error: EgressIP: egressip-1 assigned to node: huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal which is not ready, will attempt rebalancing
   I0331 02:48:34.920808 1 egressip.go:1212] Deleting pod egress IP status: {huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal 10.0.32.100} for EgressIP: egressip-1
Actual results:
The egressIP was not migrated to the correct worker. It is still shown as assigned to the deleted node:
   % oc get egressIP
   NAME         EGRESSIPS     ASSIGNED NODE                                                 ASSIGNED EGRESSIPS
   egressip-1   10.0.32.100   huirwang-0331a-m4mws-worker-c-rhbkr.c.openshift-qe.internal   10.0.32.100
Expected results:
The egressIP should be migrated from the deleted node to the correct worker.
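
The difference between the actual and expected results can be checked mechanically: an EgressIP assignment is stale when its assigned node is no longer in the node list. The following is a minimal sketch, not part of the original report. The helper name `stale_assignments` is hypothetical, and the commented `oc` commands assume the EgressIP status exposes assigned nodes under `.status.items[*].node`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the report): print every assigned node
# that does not appear in the list of existing nodes.
# Usage: stale_assignments "ASSIGNED_NODES" "EXISTING_NODES"
# Both arguments are whitespace-separated lists of node names.
stale_assignments() {
  local assigned="$1" existing="$2" node candidate found
  for node in $assigned; do
    found=0
    for candidate in $existing; do
      if [ "$node" = "$candidate" ]; then
        found=1
      fi
    done
    if [ "$found" -eq 0 ]; then
      echo "$node"
    fi
  done
}

# Possible wiring against a live cluster (assumes `oc` is logged in and
# that the status layout matches the assumption stated above):
# assigned=$(oc get egressip egressip-1 -o jsonpath='{.status.items[*].node}')
# existing=$(oc get nodes -o jsonpath='{.items[*].metadata.name}')
# stale_assignments "$assigned" "$existing"
```

With the node names from this report, passing the rhbkr node as the assigned node and the six remaining nodes as the existing list would print the rhbkr node, which matches the failure described under Actual results. An empty result would correspond to the expected behavior.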
Additional info:
- blocks
-
OCPBUGS-13127 EgressIP was NOT migrated to correct workers after deleting machine it was assigned in GCP XPN cluster.
- Closed
- is cloned by
-
OCPBUGS-13127 EgressIP was NOT migrated to correct workers after deleting machine it was assigned in GCP XPN cluster.
- Closed
- is duplicated by
-
OCPBUGS-11803 If first egressIP node is deleted, egressIP does not failover to the second available egressIP node
- Closed
- links to
-
RHEA-2023:5006 rpm