- Bug
- Resolution: Unresolved
- Normal
- 4.16.0
Description of problem:
PTP events lose connectivity between producer and consumer when the external interface is lost.
The SNO OAM interface is used for PTP communication with the GM. When that link is pulled and then restored, the events that the server side-car in the PTP daemon pod tries to send for the link pull are not delivered for as long as the link remains down. Once the link is restored, the events associated with the link restore are delivered fine. Given that the server and client side-cars are on the same node, I would have expected this to be node-internal communication rather than being tied to the OAM interface, and I would not have expected any events to the client to be dropped.
While the OAM link is down (previous test), a disconcerting message shows up in the server side-car log: "not responding, waiting 6 times before marking to delete subscriber". This seems to imply that a communication error will cause the subscription to be removed. In that case, how would the client know that its subscription was removed and that it will no longer get events? However, even though that message is logged, the subscription does not appear to have been removed, and there is nothing about a retry in the log either.
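To make the question about that log message concrete, here is a minimal sketch of the behavior the message seems to describe: count consecutive delivery failures per subscriber and mark the subscriber for deletion once the count reaches 6. This is only my reading of the log text, not the actual side-car implementation; the names `Subscriber`, `Publisher`, `record_delivery`, and `MAX_FAILURES` are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumption: the threshold is taken from the "waiting 6 times" log text.
MAX_FAILURES = 6


@dataclass
class Subscriber:
    endpoint: str
    failures: int = 0               # consecutive failed deliveries
    marked_for_delete: bool = False


@dataclass
class Publisher:
    # Hypothetical in-memory subscription table keyed by subscription ID.
    subscribers: dict = field(default_factory=dict)

    def subscribe(self, sub_id: str, endpoint: str) -> None:
        self.subscribers[sub_id] = Subscriber(endpoint)

    def record_delivery(self, sub_id: str, ok: bool) -> Subscriber:
        """Update the failure counter after one delivery attempt."""
        sub = self.subscribers[sub_id]
        if ok:
            # A successful delivery resets the consecutive-failure count.
            sub.failures = 0
            return sub
        sub.failures += 1
        if sub.failures >= MAX_FAILURES:
            # The consumer gets no notification here, which is the concern
            # raised above: it would silently stop receiving events.
            sub.marked_for_delete = True
        return sub


# Usage: simulate the link-down window with six failed deliveries.
pub = Publisher()
pub.subscribe("sub-1", "http://localhost:9043/event")  # hypothetical endpoint
for _ in range(MAX_FAILURES):
    state = pub.record_delivery("sub-1", ok=False)
```

If the side-car works roughly like this, the open questions are whether the counter is reset once the link comes back and whether the consumer is told when it is marked for deletion.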
The SNO OAM interface is the physical NIC link connected to the management network they are using. The management/OAM network connects to their management system back in the central office.
How reproducible:
Steps to Reproduce:
1. Bring the management interface down, which pauses the network.
2. Check that no events are received.
Actual results:
Expected results:
Additional info:
- blocks
-
OCPBUGS-39427 PTP events loses connectivity between producer and consumer when external interface is lost
- Closed
- is cloned by
-
OCPBUGS-39427 PTP events loses connectivity between producer and consumer when external interface is lost
- Closed
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update