-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
4.14
-
Quality / Stability / Reliability
-
False
-
-
None
-
Critical
Description of problem:
In a customer environment, the serial numbers of the messages that NetworkManager sends over D-Bus reach the maximum value of an unsigned 32-bit integer. As a result, NetworkManager is terminated, and the customer has to drain and reboot the node to restore functionality.

Serial overflow is a known issue, and Jira tickets already exist for RHEL and NetworkManager:
https://issues.redhat.com/browse/RHEL-4139
https://issues.redhat.com/browse/NMT-1884

In RHEL-4139, we notified the team that this issue is causing problems in OpenShift, and they are planning to backport the fix to all past RHEL versions used in OpenShift RHCOS.

We need help:
1. To arrange for the fix to be incorporated into OpenShift once it is backported
2. To understand whether any existing OpenShift release is based on a fixed RHEL version
3. To find a less disruptive workaround than rebooting the node:
- NetworkManager restart? (changes the D-Bus peer ID from the default 1.3 to a new one)
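For illustration only, the overflow condition can be sketched as follows. This is not the NetworkManager or GLib code; the function names are hypothetical. It assumes only that a D-Bus message serial is an unsigned 32-bit value and that the D-Bus specification does not allow a serial of zero.

```python
UINT32_MAX = 2**32 - 1  # 4294967295, the largest unsigned 32-bit value


def naive_next_serial(serial: int) -> int:
    """Hypothetical counter that simply wraps at 32 bits."""
    return (serial + 1) & UINT32_MAX


def safe_next_serial(serial: int) -> int:
    """Hypothetical counter that skips 0 when it wraps."""
    nxt = (serial + 1) & UINT32_MAX
    return nxt if nxt != 0 else 1


# After the last valid serial, the naive counter yields 0, which is not
# a valid D-Bus serial; the wrap-aware counter restarts at 1.
print(naive_next_serial(UINT32_MAX))
print(safe_next_serial(UINT32_MAX))
```

The actual fix is tracked in the tickets linked above and may differ from this sketch.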
Version-Release number of selected component (if applicable):
OpenShift 4.14
How reproducible:
The issue occurs whenever the serial number reaches the limit. It is not practical to reproduce manually.
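As a rough sketch of why manual reproduction is impractical, the time needed to exhaust a 32-bit serial counter can be estimated from a message rate. The rates below are hypothetical examples, not measurements from the customer environment, and the function name is illustrative.

```python
UINT32_MAX = 2**32 - 1  # largest unsigned 32-bit serial
SECONDS_PER_DAY = 86_400


def days_until_overflow(messages_per_second: float, start_serial: int = 0) -> float:
    """Estimate the days until the serial counter reaches UINT32_MAX."""
    if messages_per_second <= 0:
        raise ValueError("messages_per_second must be positive")
    remaining = UINT32_MAX - start_serial
    return remaining / messages_per_second / SECONDS_PER_DAY


# Hypothetical sustained rates, in messages per second.
for rate in (10, 100, 1000):
    print(f"{rate} msg/s: about {days_until_overflow(rate):.0f} days")
```

Even at a sustained 1000 messages per second, the counter would take weeks to reach the limit under these assumptions.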