-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.18.z
-
None
-
False
-
-
None
-
Critical
Description of problem:
IHAC who reported an issue with MetalLB in OpenShift v4.18. The same issue was previously reported against upstream version 0.14.9 and was eventually resolved by upgrading to version 0.15.2. We now appear to be seeing the same issue in metallb-operator version metallb-operator.v4.18.0-202601201947.
- The change in PR 2470 [1] causes a node to stop announcing its address if the node is cordoned. For us this is a landmine waiting to catch us by surprise and cause an outage.
- That change was reverted in PR 2715 [2], in version 0.15.0.
[1] https://github.com/metallb/metallb/pull/2470
[2] https://github.com/metallb/metallb/pull/2715
The customer also confirmed that they have an OpenShift v4.19 cluster running metallb-operator.v4.19.0-202601120612, and the misbehavior does not occur there: after a node is cordoned, it continues to announce addresses.
Version-Release number of selected component (if applicable):
metallb-operator version metallb-operator.v4.18.0-202601201947
How reproducible:
Always
Steps to Reproduce:
1. Configure MetalLB to announce an IP address in L2 mode.
2. Cordon the node that is currently announcing the IP address.
3. Check whether the node continues to announce the address.
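The check in steps 2 and 3 could be scripted roughly as below. This is only a sketch under several assumptions: that `oc` is logged in to the cluster, that the speaker records a `nodeAssigned` event on the Service with a message of the form `announcing from node "<name>" with protocol "layer2"` (adjust the pattern if your version words it differently), and that the namespace `default` and Service name `my-lb-service` are placeholders for the real ones. A cordoned node is detected through the standard `spec.unschedulable` field on the Node object.

```python
import json
import re
import subprocess

# Assumed event wording; adjust if the speaker in your version differs.
ANNOUNCE_RE = re.compile(r'announcing from node "([^"]+)" with protocol "layer2"')


def cordoned_nodes(node_list):
    """Return the names of nodes that have spec.unschedulable set (cordoned)."""
    return [
        item["metadata"]["name"]
        for item in node_list.get("items", [])
        if item.get("spec", {}).get("unschedulable", False)
    ]


def latest_announcer(event_list):
    """Return the node named in the most recent L2 announcement event, or None."""
    latest = None
    # RFC 3339 UTC timestamps sort correctly as plain strings.
    ordered = sorted(
        event_list.get("items", []),
        key=lambda ev: ev.get("lastTimestamp") or "",
    )
    for ev in ordered:
        match = ANNOUNCE_RE.search(ev.get("message", ""))
        if match:
            latest = match.group(1)
    return latest


def oc_json(*args):
    """Run an oc command with JSON output and return the parsed result."""
    result = subprocess.run(
        ["oc", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Hypothetical namespace and Service name; replace with the real ones.
    namespace, service = "default", "my-lb-service"
    nodes = oc_json("get", "nodes")
    events = oc_json(
        "get", "events", "-n", namespace,
        "--field-selector", f"involvedObject.name={service}",
    )
    announcer = latest_announcer(events)
    print("cordoned nodes:", cordoned_nodes(nodes))
    print("last announcing node:", announcer)
```

Running it once before and once after `oc adm cordon <node>` should show whether the announcing node changes. The events only reflect what the speakers report, so checking ARP responses for the service IP from a client on the same L2 segment remains the more direct confirmation.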
Actual results:
The node stops announcing its address once it is cordoned.
Expected results:
After a node is cordoned, it should continue to announce addresses.
Additional info:
I also opened a Slack thread [3] with engineering. As discussed there, I am filing this bug so that engineering can work on the backport.
[3] https://redhat-internal.slack.com/archives/C01EH16NFPZ/p1771928354721749