-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.15
-
Quality / Stability / Reliability
-
True
-
-
None
-
Important
-
None
-
-
None
-
None
-
CNF RAN Sprint 277
-
1
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Description of problem:
Version-Release number of selected component (if applicable):
PTP operator version 4.15.0-202505132237.
How reproducible:
Systemic, but random.
Steps to Reproduce:
1.install ptp operator, upgrade openshift from 4.14 to 4.15 2.pod can get stuck in an error state and not refresh itself. Needs monitoring so that it will automatically cycle.
Actual results:
This morning we had an outage that blocked trading activities due to clock sync errors, the root cause of which was that the linux PTP pods scheduled by the PTP operator were not working correctly. The pods were in healthy state according to OCP, while printing the following lines few times a second: ``` 2025-06-02 00:33:25.106 phc2sys[150296.077]: [ptp4l.0.config] Waiting for ptp4l... 2025-06-02 00:33:25.546 I0602 00:33:25.546967 393813 daemon.go:745] Starting ptp4l... 2025-06-02 00:33:25.546 I0602 00:33:25.546939 393813 daemon.go:844] Recreating ptp4l... 2025-06-02 00:33:25.547 I0602 00:33:25.546974 393813 daemon.go:746] ptp4l cmd: /bin/chrt -f 10 /usr/sbin/ptp4l -f /var/run/ptp4l.0.config -s -m 2025-06-02 00:33:25.547 2025-06-02 00:33:25.547 I0602 00:33:25.547052 393813 daemon.go:674] ptp4l[1748824405]:[ptp4l.0.config] PTP_PROCESS_STATUS:1 2025-06-02 00:33:25.548 failed to parse configuration file /var/run/ptp4l.0.config 2025-06-02 00:33:25.548 line 72: missing table_id 2025-06-02 00:33:25.548 E0602 00:33:25.548539 393813 daemon.go:829] CmdRun() error waiting for ptp4l: exit status 254 2025-06-02 00:33:25.548 2025-06-02 00:33:25.548 I0602 00:33:25.548583 393813 daemon.go:674] ptp4l[1748824405]:[ptp4l.0.config] PTP_PROCESS_STATUS:0 2025-06-02 00:33:26.106 phc2sys[150297.077]: [ptp4l.0.config] Waiting for ptp4l... 2025-06-02 00:33:26.548 I0602 00:33:26.548933 393813 daemon.go:745] Starting ptp4l... 2025-06-02 00:33:26.548 I0602 00:33:26.548904 393813 daemon.go:844] Recreating ptp4l... 2025-06-02 00:33:26.548 I0602 00:33:26.548942 393813 daemon.go:746] ptp4l cmd: /bin/chrt -f 10 /usr/sbin/ptp4l -f /var/run/ptp4l.0.config -s -m 2025-06-02 00:33:26.549 2025-06-02 00:33:26.549 I0602 00:33:26.548998 393813 daemon.go:674] ptp4l[1748824406]:[ptp4l.0.config] PTP_PROCESS_STATUS:1 2025-06-02 00:33:26.550 failed to parse configuration file /var/run/ptp4l.0.config 2025-06-02 00:33:26.550 line 72: missing table_id ```
This was fixed by a simple pod restart - configurations were not changed, whatever the problem was, was likely a temporary condition caused by the upgrade. Another node in the same cluster with the same pod and same configurations never had this issue.
Expected results:
Pod should automatically know it is erred and restart.
Additional info: