-
Bug
-
Resolution: Won't Do
-
Major
-
None
-
4.17
-
Quality / Stability / Reliability
-
False
-
-
None
-
Important
-
No
-
None
-
None
-
Rejected
Description of problem:
This issue was found while verifying bug https://issues.redhat.com/browse/OCPBUGS-38653. Because it differs from the original issue, a new bug was opened to track it.
Version-Release number of selected component (if applicable):
The build from openshift/ovn-kubernetes#2265
How reproducible:
Steps to Reproduce:
% oc get nodes
NAME STATUS ROLES AGE VERSION
huirwang-08215-tkrrg-master-0 Ready control-plane,master 154m v1.30.3
huirwang-08215-tkrrg-master-1 Ready control-plane,master 154m v1.30.3
huirwang-08215-tkrrg-master-2 Ready control-plane,master 154m v1.30.3
huirwang-08215-tkrrg-worker-a-czr4g Ready worker 144m v1.30.3
huirwang-08215-tkrrg-worker-b-hsgxf Ready worker 61s v1.30.3
huirwang-08215-tkrrg-worker-b-xd7pv Ready worker 3m10s v1.30.3
huirwang-08215-tkrrg-worker-c-7lmrf Ready worker 143m v1.30.3
huirwang-08215-tkrrg-worker-f-sgskm Ready worker 27m v1.30.3
Apply the egress label to nodes huirwang-08215-tkrrg-worker-b-hsgxf, huirwang-08215-tkrrg-worker-f-sgskm, and huirwang-08215-tkrrg-worker-c-7lmrf.
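For reference, the labeling step could be scripted roughly as follows. This is a minimal sketch: the label key k8s.ovn.org/egress-assignable is taken from the node labels shown later in this report, while the helper name label_egress_nodes is hypothetical and not part of the original reproduction.

```shell
# Hypothetical helper: mark each node passed as an argument as egress-assignable.
# The label key matches the one visible in the --show-labels output in this report.
label_egress_nodes() {
  for node in "$@"; do
    oc label node "$node" k8s.ovn.org/egress-assignable="" --overwrite
  done
}

# Example call (not executed here):
#   label_egress_nodes huirwang-08215-tkrrg-worker-b-hsgxf \
#                      huirwang-08215-tkrrg-worker-f-sgskm \
#                      huirwang-08215-tkrrg-worker-c-7lmrf
```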
Create an EgressIP object:
% oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2024-08-21T06:26:29Z"
    generation: 3
    name: egressip-2
    resourceVersion: "99711"
    uid: 082100dc-1012-47e5-95c1-e0aa4faf97d9
  spec:
    egressIPs:
    - 10.0.128.101
    - 10.0.128.100
    namespaceSelector:
      matchLabels:
        name: qe
  status:
    items:
    - egressIP: 10.0.128.100
      node: huirwang-08215-tkrrg-worker-f-sgskm
    - egressIP: 10.0.128.101
      node: huirwang-08215-tkrrg-worker-c-7lmrf
kind: List
metadata:
  resourceVersion: ""
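The manifest used to create the object is not included in this report. A sketch reconstructed from the spec fields in the output above might look like the following; the function name create_egressip is hypothetical, and the actual manifest may have contained additional fields.

```shell
# Hypothetical helper: create the EgressIP object using only the name and
# spec fields visible in the `oc get egressip -o yaml` output.
create_egressip() {
  oc apply -f - <<'EOF'
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip-2
spec:
  egressIPs:
  - 10.0.128.101
  - 10.0.128.100
  namespaceSelector:
    matchLabels:
      name: qe
EOF
}
```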
Create namespace test, create pods in it, and apply the label name=qe to the namespace.
% oc get pods -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-rc-547v8 1/1 Running 0 17m 10.130.2.9 huirwang-08215-tkrrg-worker-f-sgskm <none> <none>
test-rc-x9hd4 1/1 Running 0 17m 10.131.0.30 huirwang-08215-tkrrg-worker-a-czr4g <none> <none>
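The namespace part of this step could be scripted as below. This is only a sketch: the helper name prepare_namespace is hypothetical, and pod creation is left out because the report does not show how the test-rc pods were defined.

```shell
# Hypothetical helper: create a namespace and give it the label that the
# EgressIP namespaceSelector (name=qe) matches.
prepare_namespace() {
  ns="$1"
  oc create namespace "$ns"
  oc label namespace "$ns" name=qe
}

# Example call (not executed here):
#   prepare_namespace test
```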
% oc -n openshift-ovn-kubernetes exec ${ovn_pod} -c northd -- ovn-nbctl find logical_router_policy match="\"ip4.src == ${podip}\""
_uuid : 8a52ad1c-ffe6-4039-b4c4-59221fb9da98
action : reroute
bfd_sessions : []
external_ids : {name=egressip-2}
match : "ip4.src == 10.131.0.30"
nexthop : []
nexthops : ["100.88.0.7", "100.88.0.8"]
options : {}
priority : 100
% echo $LSP_ADDRESSES
(tstor-huirwang-08215-tkrrg-master-0) 0a:58:64:58:00:02 100.88.0.2/16
(tstor-huirwang-08215-tkrrg-master-1) 0a:58:64:58:00:03 100.88.0.3/16
(tstor-huirwang-08215-tkrrg-master-2) 0a:58:64:58:00:04 100.88.0.4/16
(tstor-huirwang-08215-tkrrg-worker-a-czr4g) 0a:58:64:58:00:05 100.88.0.5/16
(tstor-huirwang-08215-tkrrg-worker-b-hsgxf) 0a:58:64:58:00:0a 100.88.0.10/16
(tstor-huirwang-08215-tkrrg-worker-b-xd7pv) 0a:58:64:58:00:09 100.88.0.9/16
(tstor-huirwang-08215-tkrrg-worker-c-7lmrf) 0a:58:64:58:00:07 100.88.0.7/16
(tstor-huirwang-08215-tkrrg-worker-f-sgskm) 0a:58:64:58:00:08 100.88.0.8/16
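To read the nexthops of the reroute policy against this port list, a small lookup can help. The sketch below assumes the port addresses are available as text in exactly the format printed above, "(tstor-<node>) <mac> <ip>/<prefix>"; how LSP_ADDRESSES was populated is not shown in this report, and the function name nexthop_nodes is hypothetical.

```shell
# Hypothetical helper: map next-hop IPs to node names.
# stdin: lines in the form "(tstor-<node>) <mac> <ip>/<prefix>"
# args:  next-hop IPs to look up
nexthop_nodes() {
  lsp_lines=$(cat)
  for ip in "$@"; do
    printf '%s\n' "$lsp_lines" | awk -v ip="$ip" '
      { split($3, addr, "/") }            # addr[1] is the IP without the prefix length
      addr[1] == ip {
        name = $1
        gsub(/^\(tstor-|\)$/, "", name)   # strip "(tstor-" and the trailing ")"
        print ip, name
      }
    '
  done
}

# Example call (not executed here):
#   printf '%s\n' "$LSP_ADDRESSES" | nexthop_nodes 100.88.0.7 100.88.0.8
```

With the addresses listed above, 100.88.0.7 resolves to huirwang-08215-tkrrg-worker-c-7lmrf and 100.88.0.8 to huirwang-08215-tkrrg-worker-f-sgskm, which matches the two nodes in the EgressIP status before the node deletion.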
Delete one egress node
% oc delete node huirwang-08215-tkrrg-worker-c-7lmrf
node "huirwang-08215-tkrrg-worker-c-7lmrf" deleted
Result:
% oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2024-08-21T06:26:29Z"
    generation: 4
    name: egressip-2
    resourceVersion: "100274"
    uid: 082100dc-1012-47e5-95c1-e0aa4faf97d9
  spec:
    egressIPs:
    - 10.0.128.101
    - 10.0.128.100
    namespaceSelector:
      matchLabels:
        name: qe
  status:
    items:
    - egressIP: 10.0.128.100
      node: huirwang-08215-tkrrg-worker-f-sgskm
kind: List
metadata:
  resourceVersion: ""
% oc -n openshift-ovn-kubernetes exec ${ovn_pod} -c northd -- ovn-nbctl find logical_router_policy match="\"ip4.src == ${podip}\""
_uuid : 8a52ad1c-ffe6-4039-b4c4-59221fb9da98
action : reroute
bfd_sessions : []
external_ids : {name=egressip-2}
match : "ip4.src == 10.131.0.30"
nexthop : []
nexthops : ["100.88.0.8"]
options : {}
priority : 100
The EgressIP 10.0.128.101 did not fail over to the other available egress node, huirwang-08215-tkrrg-worker-b-hsgxf, even though the egress label is applied to that node:
% oc get node huirwang-08215-tkrrg-worker-b-hsgxf --show-labels
NAME STATUS ROLES AGE VERSION LABELS
huirwang-08215-tkrrg-worker-b-hsgxf Ready worker 28m v1.30.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=n2-standard-4,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,k8s.ovn.org/egress-assignable=,kubernetes.io/arch=amd64,kubernetes.io/hostname=huirwang-08215-tkrrg-worker-b-hsgxf,kubernetes.io/os=linux,node-role.kubernetes.io/worker=,node.kubernetes.io/instance-type=n2-standard-4,node.openshift.io/os_id=rhcos,topology.gke.io/zone=us-central1-b,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-b
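As a quicker check than scanning the full label list, the nodes carrying the egress label can be listed with a label selector. A minimal sketch follows; the function name egress_nodes is hypothetical.

```shell
# Hypothetical helper: list every node that has the egress-assignable label,
# regardless of the label's value.
egress_nodes() {
  oc get nodes -l k8s.ovn.org/egress-assignable -o name
}
```

Comparing this list with the nodes in the EgressIP status shows which assignable nodes are currently not hosting an egress IP.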
Actual results:
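The EgressIP did not fail over to another egress node. The CloudPrivateIPConfig for 10.0.128.101 targets huirwang-08215-tkrrg-worker-b-hsgxf, but the cloud assignment fails with IP_IN_USE_BY_ANOTHER_RESOURCE: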
% oc get CloudPrivateIPConfig 10.0.128.101 -o yaml
apiVersion: cloud.network.openshift.io/v1
kind: CloudPrivateIPConfig
metadata:
  annotations:
    k8s.ovn.org/egressip-owner-ref: egressip-2
  creationTimestamp: "2024-08-21T06:26:29Z"
  finalizers:
  - cloudprivateipconfig.cloud.network.openshift.io/finalizer
  generation: 2
  name: 10.0.128.101
  resourceVersion: "103371"
  uid: 357a3e26-254d-40af-a4b2-c95f9e2b7cee
spec:
  node: huirwang-08215-tkrrg-worker-b-hsgxf
status:
  conditions:
  - lastTransitionTime: "2024-08-21T06:33:44Z"
    message: 'Error processing cloud assignment request, err: {"errors":[{"code":"IP_IN_USE_BY_ANOTHER_RESOURCE","message":"IP
      ''10.0.128.101/32'' is already being used by another resource. "}]}'
    observedGeneration: 2
    reason: CloudResponseError
    status: "False"
    type: Assigned
  node: huirwang-08215-tkrrg-worker-b-hsgxf
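To pull only the failure reason out of this object, a jsonpath filter on the Assigned condition can be used. This is a sketch: the function name cpic_assigned_reason is hypothetical, and the field path follows the YAML shown above.

```shell
# Hypothetical helper: print the reason of the "Assigned" condition for one
# CloudPrivateIPConfig, identified by its name (the egress IP).
cpic_assigned_reason() {
  oc get CloudPrivateIPConfig "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Assigned")].reason}'
}

# Example call (not executed here):
#   cpic_assigned_reason 10.0.128.101
```

For the object above this would be expected to print CloudResponseError; replacing .reason with .message returns the full cloud error text.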
Expected results:
The EgressIP should fail over to another available egress node.
Additional info: