Type: Bug
Resolution: Done-Errata
Priority: Undefined
Target Version: 4.16.0
Description of problem:
We are in a live migration scenario.
If a project has a networkpolicy that allows traffic from the host network (more concretely, allows traffic from the ingress controllers, and the ingress controllers run in the host network), traffic doesn't work during the live migration between any ingress controller node (migrated or not) and an already migrated application node.
I'll expand later in the description and internal comments, but the TL;DR is that the IPs of the tun0 interfaces of not-yet-migrated source nodes and the IPs of the ovn-k8s-mp0 interfaces of migrated source nodes are not added to the address sets related to the networkpolicy ACL on the target OVN-Kubernetes node, so the traffic is not allowed.
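For reference, an "allow from host network" networkpolicy of the kind described above typically looks like the sketch below. The namespace and policy names are illustrative; the namespaceSelector label is the policy-group label OpenShift applies to its synthetic host-network namespace, which is the usual way such policies select host-network traffic.

```yaml
# Illustrative example only; names are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-host-network
  namespace: my-project
spec:
  # Apply to all pods in the project.
  podSelector: {}
  ingress:
    - from:
        # Select traffic originating from the host network
        # (e.g. ingress controllers running there).
        - namespaceSelector:
            matchLabels:
              policy-group.network.openshift.io/host-network: ""
```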
Version-Release number of selected component (if applicable):
4.16.13
How reproducible:
Always
Steps to Reproduce:
1. Before the migration: have a project with a networkpolicy that allows traffic from the ingress controller, with the ingress controller in the host network. Everything must work properly at this point.
2. Start the migration.
3. During the migration, check connectivity to a pod in the project from the host network of either a migrated node or a non-migrated node. Both will fail (checking from the node where the pod runs doesn't fail).
Actual results:
The pod on the worker node is not reachable from the host network of the ingress controller node (unless the pod is on the same node as the ingress controller), which causes the ingress controller routes to return 503 errors.
Expected results:
The pod on the worker node should be reachable from the ingress controller node, even when the ingress controller node has not migrated yet and the application node has.
Additional info:
This is not a duplicate of OCPBUGS-42578: this bug covers the host-to-pod communication path, while the other one doesn't.
This is a customer issue. More details to be included in private comments for privacy.
Workaround: create a networkpolicy that explicitly allows traffic from the tun0 and ovn-k8s-mp0 interfaces. However, note that this workaround can be problematic for clusters with hundreds or thousands of projects. Another possible workaround is to temporarily delete all the networkpolicies of the projects, but again, this may be problematic (and a security risk).
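A minimal sketch of the first workaround, assuming the default cluster network CIDR of 10.128.0.0/14 (the per-node tun0 and ovn-k8s-mp0 addresses fall inside it). Namespace, policy name, and CIDR are illustrative; narrower per-node /32 ipBlock entries could be used instead of the whole cluster network:

```yaml
# Illustrative workaround only; adjust the CIDR to your cluster network.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-tun0-and-ovn-k8s-mp0
  namespace: my-project
spec:
  # Apply to all pods in the project.
  podSelector: {}
  ingress:
    - from:
        # The tun0 (openshift-sdn) and ovn-k8s-mp0 (OVN-Kubernetes)
        # interface IPs are taken from the cluster network, so an
        # ipBlock over that range explicitly allows them.
        - ipBlock:
            cidr: 10.128.0.0/14
```

As noted above, a policy like this would have to be created in every affected project, which is why it scales poorly for clusters with hundreds or thousands of projects.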
- blocks: OCPBUGS-43605 Allow from host network networkpolicies do not work during live migration (POST)
- blocks: OCPBUGS-43769 Allow from host network networkpolicies do not work during live migration (Closed)
- clones: OCPBUGS-43343 Allow from host network networkpolicies do not work during live migration (Closed)
- depends on: OCPBUGS-43343 Allow from host network networkpolicies do not work during live migration (Closed)
- is cloned by: OCPBUGS-43769 Allow from host network networkpolicies do not work during live migration (Closed)
- is duplicated by: OCPBUGS-43604 Allow from host network networkpolicies do not work during live migration (Closed)
- links to: RHBA-2024:9615 OpenShift Container Platform 4.16.z bug fix update