OpenShift Bugs / OCPBUGS-3828

On an SNO with Telco DU profile, linuxptp-daemon reports "timed out while polling for tx timestamp" errors


    • None
    • CNF RAN Sprint 229
    • 1
    • Rejected
    • False
      12/14: The issue is not persistent; it correlates with an SRIOV driver issue. The ptp4l logs containing the error need to be reviewed to understand the severity.
      12/13: The system test lab was shared, and the issue does not occur in that lab. Following up with Marius for access to the specific machine that was problematic.
      Rel Note for Telco: Not Required

      Description of problem:

      On an SNO with Telco DU profile, linuxptp-daemon reports "timed out while polling for tx timestamp" errors:
      
      oc -n openshift-ptp logs linuxptp-daemon-9fdbz -c linuxptp-daemon-container | grep 'timed'
      ptp4l[46803.358]: [ptp4l.0.config] timed out while polling for tx timestamp
      
      phc2sys[46803.194]: [ptp4l.0.config] CLOCK_REALTIME rms    4 max    4 freq  -9153 +/-   0 delay   493 +/-   0
      ptp4l[46803.227]: [ptp4l.0.config] master offset          3 s2 freq  -14064 path delay      1084
      ptp4l[46803.358]: [ptp4l.0.config] timed out while polling for tx timestamp
      ptp4l[46803.358]: [ptp4l.0.config] port 1: clearing fault immediately
      ptp4l[46803.358]: [ptp4l.0.config] increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
      ptp4l[46803.358]: [ptp4l.0.config] port 1: send delay request failed
      ptp4l[46803.400]: [ptp4l.0.config] port 1: SLAVE to LISTENING on INIT_COMPLETE
      ptp4l[46803.434]: [ptp4l.0.config] port 1: new foreign master b47af1.fffe.7b20e2-1
      ptp4l[46803.684]: [ptp4l.0.config] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
      ptp4l[46803.728]: [ptp4l.0.config] master offset         10 s2 freq  -14052 path delay      1085
      ptp4l[46803.728]: [ptp4l.0.config] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
      ptp4l[46803.791]: [ptp4l.0.config] master offset         -1 s2 freq  -14070 path delay      1085
      ptp4l[46803.853]: [ptp4l.0.config] master offset         12 s2 freq  -14048 path delay      1084
      ptp4l[46803.916]: [ptp4l.0.config] master offset          0 s2 freq  -14067 path delay      1084
      ptp4l[46803.978]: [ptp4l.0.config] master offset         12 s2 freq  -14047 path delay      1084
      ptp4l[46804.041]: [ptp4l.0.config] master offset         -4 s2 freq  -14073 path delay      1084
      ptp4l[46804.103]: [ptp4l.0.config] master offset         -7 s2 freq  -14078 path delay      1084
      ptp4l[46804.166]: [ptp4l.0.config] master offset          4 s2 freq  -14060 path delay      1086
      phc2sys[46804.194]: [ptp4l.0.config] port b49691.fffe.a57b06-1 changed state
      phc2sys[46804.194]: [ptp4l.0.config] port b49691.fffe.a57b06-1 changed state
      phc2sys[46804.194]: [ptp4l.0.config] port b49691.fffe.a57b06-1 changed state
      phc2sys[46804.194]: [ptp4l.0.config] reconfiguring after port state change
      phc2sys[46804.194]: [ptp4l.0.config] selecting CLOCK_REALTIME for synchronization
      phc2sys[46804.195]: [ptp4l.0.config] selecting ens2f2 as the master clock
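
To help judge severity, the excerpt above can be reduced to the time between each timeout and the port's return to the SLAVE state. The following is only a minimal sketch, assuming the log has been saved as plain text in the format shown; `recovery_times` and `SAMPLE` are hypothetical names for illustration, not part of linuxptp or the PTP operator.

```python
import re

# Matches ptp4l lines in the format shown in this report, e.g.
#   ptp4l[46803.358]: [ptp4l.0.config] timed out while polling for tx timestamp
LINE_RE = re.compile(r"^ptp4l\[(\d+\.\d+)\]: \[[^\]]+\] (.*)$")

def recovery_times(lines):
    """Return seconds from each tx timestamp timeout until the port is SLAVE again."""
    durations = []
    fault_at = None  # timestamp of the timeout currently being tracked, if any
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # phc2sys lines and anything else are ignored
        ts, msg = float(m.group(1)), m.group(2)
        if fault_at is None and msg.startswith("timed out while polling for tx timestamp"):
            fault_at = ts
        elif fault_at is not None and "UNCALIBRATED to SLAVE" in msg:
            durations.append(round(ts - fault_at, 3))
            fault_at = None
    return durations

# Lines copied from the excerpt in this report.
SAMPLE = """\
ptp4l[46803.358]: [ptp4l.0.config] timed out while polling for tx timestamp
ptp4l[46803.400]: [ptp4l.0.config] port 1: SLAVE to LISTENING on INIT_COMPLETE
ptp4l[46803.684]: [ptp4l.0.config] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[46803.728]: [ptp4l.0.config] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
"""

print(recovery_times(SAMPLE.splitlines()))
```

A timeout that never reaches SLAVE again within the captured log is not reported by this sketch, so a missing entry should be checked by hand.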
      

      Version-Release number of selected component (if applicable):

      4.12.0-rc.0
      ptp-operator.4.11.0-202210262118

      How reproducible:

      Most of the time. The issue does not reproduce immediately but appears while running system tests.

      Steps to Reproduce:

      1. Deploy and configure an SNO with Telco DU profile
      2. Run system tests that create various workloads and reboot the system several times
      3. Check linuxptp-daemon logs for timeouts
      
      oc -n openshift-ptp logs linuxptp-daemon-9fdbz -c linuxptp-daemon-container | grep 'timed' 
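
When the check in step 3 is repeated across several test runs, counting the matches in a script may be easier than reading the grep output. This is a sketch under the assumption that the `oc logs` output has been saved as text in the format shown in this report; `find_tx_timeouts` and `SAMPLE_LOG` are hypothetical names used only for illustration.

```python
import re

# Matches only the ptp4l timeout line, capturing its timestamp and config name.
TIMEOUT_RE = re.compile(
    r"^ptp4l\[(?P<ts>\d+\.\d+)\]: \[(?P<cfg>[^\]]+)\] "
    r"timed out while polling for tx timestamp"
)

def find_tx_timeouts(lines):
    """Return a (timestamp, config_name) tuple for every tx timestamp timeout line."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            hits.append((float(m.group("ts")), m.group("cfg")))
    return hits

# Lines copied from the log excerpt in this report.
SAMPLE_LOG = """\
ptp4l[46803.227]: [ptp4l.0.config] master offset          3 s2 freq  -14064 path delay      1084
ptp4l[46803.358]: [ptp4l.0.config] timed out while polling for tx timestamp
ptp4l[46803.358]: [ptp4l.0.config] port 1: send delay request failed
"""

hits = find_tx_timeouts(SAMPLE_LOG.splitlines())
print(f"{len(hits)} timeout(s) found")
for ts, cfg in hits:
    print(f"  {ts:.3f} {cfg}")
```

An empty result corresponds to the expected outcome of no timeouts.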

      Actual results:

      One or more occurrences

      Expected results:

      No timeouts

      Additional info:

      The PTP master runs on a RHEL bare metal server via ptp4l. The SNO is connected to the same switch as that server.
      
      Attaching must-gather and PTPConfig.

        1. ptp4l.conf
          2 kB
        2. ptpconfig.yaml
          3 kB

              josricha@redhat.com Joseph Richard
              mcornea@redhat.com Marius Cornea
              Ofer Bochan Ofer Bochan