Bug
Resolution: Done
Major
4.13.z, 4.12.z, 4.14.z, 4.15
Quality / Stability / Reliability
Moderate
2024-02-19: Peer review
T&PS 2024 #2
Release Note Not Required
Description of problem:
The documentation does not adequately explain that when a change to a network object depends on a DIFFERENT network object that is also defined through the nmstate operator, both must be defined in a single, unified NNCP object to avoid race conditions that result in a degraded network state.
EXAMPLE: You cannot reliably define a VLAN interface in one NNCP object and then define the routes for that VLAN in a separate NNCP object (or set of separate NNCP objects). This can and does lead to a race condition: when the node restarts, the routes may be applied before the VLAN is present, so the route NNCP objects become degraded and must be re-applied to recover. Instead, a single NNCP object should define both the VLAN AND the routes for that VLAN, so that they are applied in one shot.
KEY TAKEAWAY: There is no way to control the order in which NNCP objects are applied. If objects are interdependent, they must be defined together in the same NNCP YAML definition.
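A minimal sketch of the unified approach described above, with the VLAN interface and its routes declared in one NodeNetworkConfigurationPolicy. The policy name, node hostname, base interface, VLAN ID, and all addresses are hypothetical placeholders (the addresses come from documentation ranges) and are not taken from the affected environment:

```yaml
# Hypothetical example: one NNCP that defines BOTH the VLAN and its routes,
# so the routes are never applied without the interface they depend on.
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: vlan100-with-routes            # hypothetical policy name
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-0   # hypothetical node
  desiredState:
    interfaces:
      - name: eth1.100                 # VLAN interface (hypothetical name)
        type: vlan
        state: up
        vlan:
          base-iface: eth1             # hypothetical parent NIC
          id: 100
        ipv4:
          enabled: true
          dhcp: false
          address:
            - ip: 192.0.2.10
              prefix-length: 24
    routes:
      config:
        # Route that depends on the VLAN above; kept in the same policy.
        - destination: 198.51.100.0/24
          next-hop-address: 192.0.2.1
          next-hop-interface: eth1.100
```

Because the interface and the routes share one desired state, they are applied together and the route cannot be evaluated while the VLAN is still absent.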
Version-Release number of selected component (if applicable):
all currently supported releases
How reproducible:
every time
Steps to Reproduce:
1. Define an NNCP object that sets up a VLAN interface.
2. Define a separate NNCP object that defines routes for that interface.
3. Observe that on first application, the routes are assigned successfully.
4. Reboot the node and observe that the route NNCP object is degraded.
5. See that the error indicates that the VLAN did not exist or was not ready.
6. Observe that the VLAN is UP/READY on the node (but was not when the route configuration was attempted).
Actual results:
network configuration may become degraded
Expected results:
Networking definitions should not move to a degraded state unless there is an underlying hardware issue or a misconfiguration, and not because of the pitfall described above.
Additional info:
see KCS: https://access.redhat.com/solutions/7032141