Bug
Resolution: Unresolved
4.18.z
Critical
Production
Customer Escalated
Description of problem:
While upgrading the cluster to 4.18 from 4.17, after the cluster network operator finishes upgrading to the 4.18 image, VMs begin to lose network connectivity through their ovn-k8s-cni-overlay localnet NADs.
Restarting the ovnkube-node pod seems to resolve the issue, as does live migrating the impacted VM. An OVN DB rebuild was not tested, because restarting the ovnkube-node pod works.
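For reference, a minimal sketch of the restart workaround. The helper name `restart_ovnkube_node` is hypothetical, and the sketch assumes the default `openshift-ovn-kubernetes` namespace and the `app=ovnkube-node` pod label; adjust if the cluster differs.

```shell
# Hypothetical helper: delete the ovnkube-node pod running on one node so
# that its DaemonSet recreates it (the workaround described above).
# Assumes namespace openshift-ovn-kubernetes and label app=ovnkube-node.
restart_ovnkube_node() {
    node="$1"
    ns="openshift-ovn-kubernetes"

    # Look up the ovnkube-node pod scheduled on the given node.
    pod="$(oc -n "$ns" get pods -l app=ovnkube-node \
        --field-selector "spec.nodeName=$node" -o name)" || return 1

    if [ -z "$pod" ]; then
        echo "no ovnkube-node pod found on $node" >&2
        return 1
    fi

    # Deleting the pod triggers a restart on that node.
    oc -n "$ns" delete "$pod"
}
```

Example: `restart_ovnkube_node worker-0`, where `worker-0` is a placeholder for the node hosting the impacted VM.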
Running
ovn-nbctl list logical-switch-port
shows that the impacted VM does not have the logical switch port for connectivity.
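A rough sketch of how the check above could be scripted. The helper names `lsp_names` and `has_lsp_for_pod` are hypothetical, and matching on the pod name as a substring of the port name is an assumption about how the ports are named, not something confirmed in this report.

```shell
# Print the value of the "name" column for every record in the output of
# `ovn-nbctl list logical-switch-port`, read from stdin.
lsp_names() {
    awk '/^name[[:space:]]*:/ {
        sub(/^name[[:space:]]*:[[:space:]]*/, "")  # drop the column label
        gsub(/"/, "")                              # drop surrounding quotes
        print
    }'
}

# Succeed if any logical switch port name contains the given pod name.
# Assumption: the port name embeds the virt-launcher pod name.
has_lsp_for_pod() {
    pod="$1"
    lsp_names | grep -F -q -- "$pod"
}

# Possible usage (pod and container names are placeholders/assumptions):
#   oc -n openshift-ovn-kubernetes exec <ovnkube-node-pod> -c nbdb -- \
#       ovn-nbctl list logical-switch-port | has_lsp_for_pod <virt-launcher-pod> \
#       || echo "logical switch port missing"
```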
To my knowledge, this has only been reported for VM pods so far; I have not heard of it happening with non-VM pods using localnet NADs.
Version-Release number of selected component (if applicable):
4.18.z
Seen so far on 4.18.27 and 4.18.28 specifically, but the affected range is likely wider.
How reproducible:
Currently unsure. Reproduction has proved difficult so far, but I am working on a potential reproducer.
Steps to Reproduce:
1. Create a 4.17.z cluster
2. Install and configure OpenShift Virt
3. Configure an ovn-k8s-cni-overlay localnet NAD
4. Create a VM using the NAD configured in step 3
5. Upgrade to 4.18.z
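For step 3, a minimal sketch of what a localnet NAD might look like. The helper `render_localnet_nad` and every name passed to it are hypothetical; the actual NAD from the affected cluster is not included in this report. The network name is assumed to match an existing OVN bridge mapping on the nodes.

```shell
# Hypothetical helper: print a localnet NetworkAttachmentDefinition manifest.
# Arguments: namespace, NAD name, network name (assumed to match the
# localnet name in the node's OVN bridge mapping).
render_localnet_nad() {
    ns="$1"
    nad="$2"
    net="$3"
    cat <<EOF
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: ${nad}
  namespace: ${ns}
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "${net}",
      "type": "ovn-k8s-cni-overlay",
      "topology": "localnet",
      "netAttachDefName": "${ns}/${nad}"
    }
EOF
}
```

Example: `render_localnet_nad default localnet-nad localnet-network | oc apply -f -`, with all three arguments being placeholders.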
Actual results:
After the cluster network operator upgrades, connectivity is lost to VMs over their localnet NADs.
Expected results:
Connectivity remains for VMs over their localnet NADs during and after the upgrade.
Additional info:
Details on specific testing will be commented.
Affected Platforms:
OpenShift Container Platform 4.18