-
Bug
-
Resolution: Done-Errata
-
Undefined
-
None
-
4.12.0
-
+
-
Important
-
No
-
CNF Network Sprint 237, CNF Network Sprint 239
-
2
-
False
-
-
Customer Escalated
-
-
7/13: The upstream PR has merged, as has the downstream PR for 4.14.
-
Description of problem:
When a new MachineConfig is applied to an MCP, the VFs on nodes that have already been drained and rebooted are not initialized until the entire pool has been updated. The sriov-network-config-daemon waits for the MCP to be ready before draining the node and creating the VFs.
Below are the events where an MC was applied on a worker MCP which has 6 nodes:
The new MC is applied to worker-2:
I0315 08:52:34.317552 1 node_controller.go:436] Pool worker: Setting node worker-2.ocp4.shiftvirt.com target to rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba
I0315 08:52:34.412815 1 node_controller.go:446] Pool worker: node worker-2.ocp4.shiftvirt.com: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba
I0315 08:52:44.585116 1 drain_controller.go:139] node worker-2.ocp4.shiftvirt.com: initiating cordon (currently schedulable: true)
I0315 08:52:44.809751 1 drain_controller.go:139] node worker-2.ocp4.shiftvirt.com: cordon succeeded (currently schedulable: false)
I0315 08:52:44.809795 1 drain_controller.go:139] node worker-2.ocp4.shiftvirt.com: initiating drain
The update of worker-2 completed at 09:00:
I0315 09:00:50.280547 1 drain_controller.go:139] node worker-2.ocp4.shiftvirt.com: uncordon succeeded (currently schedulable: true)
I0315 09:00:50.280571 1 drain_controller.go:139] node worker-2.ocp4.shiftvirt.com: operation successful; applying completion annotation
I0315 09:00:53.699594 1 node_controller.go:446] Pool worker: node worker-2.ocp4.shiftvirt.com: Completed update to rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba
The sriov-network-config-daemon started, and the generic plugin requested a node drain:
I0315 09:00:42.486107 3704 utils.go:249] NeedUpdate(): NumVfs needs update desired=5, current=0
I0315 09:00:42.486120 3704 generic_plugin.go:172] generic-plugin needDrainNode(): need drain, PF 0000:19:00.1 request update
I0315 09:00:42.486131 3704 generic_plugin.go:125] generic-plugin tryEnableIommuInKernelArgs()
I0315 09:00:42.510279 3704 daemon.go:478] nodeStateSyncHandler(): plugin generic_plugin: reqDrain true, reqReboot false
The daemon tried to pause the MCP, but the pool was still in the Updating state because the other nodes were being updated:
I0315 09:00:46.788534 3704 daemon.go:868] pauseMCP():MCP worker is not ready: [{RenderDegraded False 2023-03-14 13:22:58 +0000 UTC } {NodeDegraded False 2023-03-15 07:14:42 +0000 UTC } {Degraded False 2023-03-15 07:14:42 +0000 UTC } {Updated False 2023-03-15 08:46:34 +0000 UTC } {Updating True 2023-03-15 08:46:34 +0000 UTC All nodes are updating to rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba}], wait...
This check is at https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/f53e1d5f75f8fe2336aec82ea47ed054773810dc/pkg/daemon/daemon.go#L826
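The gate that this log message reflects can be modelled roughly as follows. This is a simplified Python sketch, not the operator's Go code: the rule that a pool counts as ready only when Updated is True and both Updating and Degraded are False is an assumption inferred from the logged conditions, and `mcp_is_ready` is a hypothetical name. The post-rollout condition values are also assumed, since the log does not print them once the pool is ready.

```python
# Simplified model of the readiness gate (assumption: not the operator's code).
def mcp_is_ready(conditions: dict[str, bool]) -> bool:
    """Return True only when the pool has finished rolling out everywhere."""
    return (
        conditions.get("Updated", False)
        and not conditions.get("Updating", True)
        and not conditions.get("Degraded", True)
    )


# Condition values as logged by pauseMCP() at 09:00:46, after worker-2 itself
# had already been uncordoned.
during_rollout = {
    "RenderDegraded": False,
    "NodeDegraded": False,
    "Degraded": False,
    "Updated": False,
    "Updating": True,
}

# Assumed values once every node in the pool has been updated.
after_rollout = {**during_rollout, "Updated": True, "Updating": False}

print(mcp_is_ready(during_rollout))
print(mcp_is_ready(after_rollout))
```

Because the gate looks at pool-wide conditions, a node that has finished its own update still waits for every other node in the pool, which matches the behaviour reported here.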
The next node is being updated:
I0315 09:01:05.504487 1 drain_controller.go:139] node work0.ocp4.shiftvirt.com: cordoning
I0315 09:01:05.504515 1 drain_controller.go:139] node work0.ocp4.shiftvirt.com: initiating cordon (currently schedulable: true)
I0315 09:01:05.562782 1 drain_controller.go:139] node work0.ocp4.shiftvirt.com: cordon succeeded (currently schedulable: false)
The MCP update completed successfully at 09:08:
I0315 09:08:00.807071 1 node_controller.go:446] Pool worker: node work0.ocp4.shiftvirt.com: Completed update to rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba
I0315 09:08:00.825989 1 status.go:90] Pool worker: All nodes are updated with rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba
The daemon was not able to pause the MCP until all nodes were updated and the MCP was ready:
I0315 09:07:46.765462 3704 daemon.go:868] pauseMCP():MCP worker is not ready: [{RenderDegraded False 2023-03-14 13:22:58 +0000 UTC } {NodeDegraded False 2023-03-15 07:14:42 +0000 UTC } {Degraded False 2023-03-15 07:14:42 +0000 UTC } {Updated False 2023-03-15 08:46:34 +0000 UTC } {Updating True 2023-03-15 08:46:34 +0000 UTC All nodes are updating to rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba}], wait...
I0315 09:08:00.905893 3704 daemon.go:828] pauseMCP(): MCP worker is ready
I0315 09:08:00.905943 3704 daemon.go:838] pauseMCP(): pause MCP worker
I0315 09:08:00.935898 3704 daemon.go:690] annotateNode(): Annotate node worker-2.ocp4.shiftvirt.com with: Draining_MCP_Paused
I0315 09:08:01.052167 3704 daemon.go:828] pauseMCP(): MCP worker is ready
I0315 09:08:01.052221 3704 daemon.go:830] pauseMCP(): stop MCP informer
I0315 09:08:01.052365 3704 daemon.go:518] nodeStateSyncHandler(): drain node
And finally setting the VFs:
I0315 09:08:07.750480 3704 utils.go:417] setSriovNumVfs(): set NumVfs for device 0000:19:00.1 to 5
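To put a number on the gap, the two relevant log timestamps for worker-2 can be subtracted. The following is a small Python sketch; `klog_time` is a hypothetical helper, and the year 2023 is taken from the MCP condition timestamps in the pauseMCP() output because the klog header itself carries no year.

```python
from datetime import datetime


def klog_time(line: str, year: int = 2023) -> datetime:
    """Parse the '<severity>mmdd hh:mm:ss.uuuuuu' prefix of a klog line."""
    mmdd, clock = line.split()[:2]
    # Drop the leading severity letter (for example 'I') before parsing.
    return datetime.strptime(f"{year}{mmdd[1:]} {clock}", "%Y%m%d %H:%M:%S.%f")


# Lines copied from the logs in this report.
node_updated = (
    "I0315 09:00:53.699594 1 node_controller.go:446] Pool worker: node "
    "worker-2.ocp4.shiftvirt.com: Completed update to "
    "rendered-worker-d1a8861c0fa4c9d7be2db1af7125b3ba"
)
vfs_set = (
    "I0315 09:08:07.750480 3704 utils.go:417] setSriovNumVfs(): "
    "set NumVfs for device 0000:19:00.1 to 5"
)

# Time worker-2 spent updated and schedulable but without its VFs.
gap = klog_time(vfs_set) - klog_time(node_updated)
print(gap)  # 0:07:14.050886
```

In this capture, worker-2 was back in service for a little over seven minutes before its VFs were created.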
So this in effect requires downtime of all SR-IOV enabled VirtualMachines for an MCP update.
The customer who reported the issue is using OpenShift Virtualization where they run SR-IOV based VMs.
Version-Release number of selected component (if applicable):
4.12.1
How reproducible:
100%
Steps to Reproduce:
1. Apply a new MachineConfig to an MCP that contains more than one node that has sriovnetwork.
2. On the rebooted nodes, the VFs are not created until all the nodes are updated where the MCP status change to r
3.
Actual results:
SR-IOV VFs are not created until all the nodes in the pools are updated
Expected results:
An MCP can be updated without downtime. Currently, the VMs are live-migrated to other nodes during the update, but they cannot be scheduled back onto the updated node because the VFs are not created after the update. The user therefore has to shut down the VMs for the nodes to complete the MCP update.
Additional info:
- blocks
-
OCPBUGS-16248 SR-IOV VFs are not created until all the nodes in the pools are updated
- Closed
- is cloned by
-
OCPBUGS-16248 SR-IOV VFs are not created until all the nodes in the pools are updated
- Closed
- links to
-
RHEA-2023:5005 rpm