RHEL-32907

[ptp4l] Sync fails in a heavily loaded, lossy network environment

    • Bug
    • Resolution: Done
    • Normal
    • rhel-8.10
    • linuxptp
    • rhel-sst-cs-stacks
    • ssg_core_services
    • x86_64

      What were you trying to do that didn't work?

      Used tc (netem) to simulate a heavily loaded, lossy network. ptp4l failed to stay synchronized.
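A minimal sketch of the netem setup used in the steps to reproduce, wrapped in small helpers so the impairment can be inspected and removed between runs. The names `run`, `netem_add`, `netem_show`, `netem_del`, `DRY_RUN`, and `IFACE` are hypothetical and not part of iproute2; only the underlying `tc qdisc` commands are real, and the interface name and 50% loss value are taken from this report.

```shell
#!/bin/sh
# Hypothetical helpers around the tc/netem commands used in this report.
# IFACE defaults to the interface named in the report.
IFACE=${IFACE:-ens3f0}

# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add random packet loss on egress (default 50%, as in step 3).
netem_add()  { run tc qdisc add dev "$IFACE" root netem loss "${1:-50%}"; }

# Show the qdisc currently attached to the interface.
netem_show() { run tc qdisc show dev "$IFACE"; }

# Remove the root qdisc again, restoring the default.
netem_del()  { run tc qdisc del dev "$IFACE" root; }
```

Setting `DRY_RUN=1` before calling a helper prints the command for review; without it the commands need root privileges. Note that `netem loss` discards packets on transmit rather than adding traffic, so it models a lossy link more than a busy one.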

      Please provide the package NVR for which bug is seen:

      distro: RHEL-8.10.0-20240412.55

      linuxptp-4.2-1.el8.x86_64

      # ethtool -i ens3f0
        driver: ice
        version: 4.18.0-552.rt7.341.el8_10.x86_6
        firmware-version: 4.20 0x8001778c 1.3346.0
        expansion-rom-version: 
        bus-info: 0000:13:00.0
        supports-statistics: yes
        supports-test: yes
        supports-eeprom-access: yes
        supports-register-dump: yes
        supports-priv-flags: yes

      How reproducible: 100%

      Steps to reproduce

      1. Start ptp4l as an ordinary clock (OC) on the server side

      ptp4l -f /usr/share/doc/linuxptp/configs/default.cfg -EH2mi ens3f0 --domainNumber 30 --priority1=112 --priority2=255 --clockClass=248 --clockAccuracy=0xFF --offsetScaledLogVariance=0x0000

      2. Start ptp4l as the grandmaster (GM) on the client side

      ptp4l -f /usr/share/doc/linuxptp/configs/default.cfg -EH2mi ens3f0 --domainNumber 30 --priority1=1 --priority2=255 --clockClass=248 --clockAccuracy=0xFF --offsetScaledLogVariance=0x0000

      3. Add 50% packet loss on the server side

      tc qdisc add dev ens3f0 root netem loss 50%

      4. Observe that sync fails on the server side

      ptp4l[115658.289]: master offset          6 s2 freq    +151 path delay      2228
      ptp4l[115659.289]: master offset         16 s2 freq    +163 path delay      2229
      ptp4l[115660.289]: master offset         16 s2 freq    +168 path delay      2227
      ptp4l[115661.289]: master offset        -41 s2 freq    +115 path delay      2227
      ptp4l[115662.289]: master offset          6 s2 freq    +150 path delay      2227
      ptp4l[115663.289]: master offset         37 s2 freq    +183 path delay      2226
      ptp4l[115664.289]: master offset        -37 s2 freq    +120 path delay      2226
      ptp4l[115665.289]: master offset         26 s2 freq    +172 path delay      2227
      ptp4l[115666.289]: master offset        -28 s2 freq    +126 path delay      2227
      ptp4l[115667.289]: master offset         23 s2 freq    +168 path delay      2227
      ptp4l[115668.289]: master offset          3 s2 freq    +155 path delay      2228
      ptp4l[115669.289]: master offset          8 s2 freq    +161 path delay      2227
      ptp4l[115670.289]: master offset          6 s2 freq    +162 path delay      2227
      ptp4l[115671.289]: master offset        -18 s2 freq    +139 path delay      2234
      ptp4l[115672.289]: master offset        -14 s2 freq    +138 path delay      2234
      ptp4l[115673.234]: timed out while polling for tx timestamp
      ptp4l[115673.235]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
      ptp4l[115673.235]: port 1 (ens3f0): send delay request failed
      ptp4l[115673.235]: port 1 (ens3f0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
      ptp4l[115689.262]: port 1 (ens3f0): FAULTY to LISTENING on INIT_COMPLETE
      ptp4l[115689.289]: port 1 (ens3f0): new foreign master 40a6b7.fffe.3ea560-1
      ptp4l[115693.289]: port 1 (ens3f0): LISTENING to UNCALIBRATED on RS_SLAVE
      ptp4l[115694.289]: master offset        542 s2 freq    +690 path delay      2232
      ptp4l[115694.289]: port 1 (ens3f0): UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
      ptp4l[115695.289]: master offset         44 s2 freq    +354 path delay      2237
      ptp4l[115695.680]: timed out while polling for tx timestamp
      ptp4l[115695.680]: increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
      ptp4l[115695.680]: port 1 (ens3f0): send delay request failed
      ptp4l[115695.680]: port 1 (ens3f0): SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
      ptp4l[115711.707]: port 1 (ens3f0): FAULTY to LISTENING on INIT_COMPLETE
      ptp4l[115713.289]: port 1 (ens3f0): new foreign master 40a6b7.fffe.3ea560-1
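The log above suggests raising `tx_timestamp_timeout`, a ptp4l option given in milliseconds. As a hedged sketch of how that suggestion could be tried, the helper below prints the step 1 command line with the option added. The function name `ptp4l_oc_cmd` and the default of 100 ms are assumptions made for illustration, not values from this report. Whether a longer timeout helps here is unknown: if netem discards the delay request before it reaches the driver, no transmit timestamp would be produced at all, and a longer timeout would only delay the fault.

```shell
#!/bin/sh
# Hypothetical helper: print the step 1 command with a custom
# tx_timestamp_timeout (milliseconds). 100 is an arbitrary default.
ptp4l_oc_cmd() {
    timeout_ms=${1:-100}
    printf '%s\n' "ptp4l -f /usr/share/doc/linuxptp/configs/default.cfg -EH2mi ens3f0 --tx_timestamp_timeout=${timeout_ms} --domainNumber 30 --priority1=112 --priority2=255 --clockClass=248 --clockAccuracy=0xFF --offsetScaledLogVariance=0x0000"
}

# Review the printed command, then run it as root on the server side, e.g.:
#   $(ptp4l_oc_cmd 100)
```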
       

      Expected results

      ptp4l should report a large path delay, but it should not lose synchronization.

      Actual results

      The OC fails to stay synchronized with the GM.
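To quantify the failure, a ptp4l log can be summarized with a short awk filter. This is a minimal sketch, assuming the log format shown in the steps to reproduce; `ptp4l_fault_summary` is a hypothetical helper and not part of linuxptp.

```shell
#!/bin/sh
# Hypothetical helper: read a ptp4l log on stdin, count tx timestamp
# timeouts, and report how long the port stayed FAULTY after each fault.
ptp4l_fault_summary() {
    awk '
        # Turn "ptp4l[115673.235]:" into the number 115673.235.
        function ts(field) {
            gsub(/^ptp4l\[|\]:$/, "", field)
            return field + 0
        }
        /timed out while polling for tx timestamp/ { timeouts++ }
        /SLAVE to FAULTY/ { fault_start = ts($1); faults++ }
        /FAULTY to LISTENING/ {
            if (fault_start > 0) {
                printf "fault %d lasted %.3f s\n", faults, ts($1) - fault_start
                fault_start = 0
            }
        }
        END { printf "timeouts=%d faults=%d\n", timeouts, faults }
    '
}
```

Usage: `ptp4l_fault_summary < ptp4l.log`, where `ptp4l.log` is a placeholder file name. In the log from this report, each fault lasts about 16 seconds, which matches ptp4l's default `fault_reset_interval` of 4 (2^4 seconds).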

              rhn-support-mlichvar Miroslav Lichvar
              mhou@redhat.com Minxi Hou
              Miroslav Lichvar
              Yalin Li