  1. OpenShift Bugs
  2. OCPBUGS-62906

Missing PTP process status from metrics


    • Bug
    • Resolution: Unresolved
    • Normal
    • 4.18, 4.19, 4.20
    • Networking / ptp
    • Quality / Stability / Reliability
    • False
    • Important
    • All
    • 2025-10-16: In progress

      Description of problem:

      The linuxptp-daemon metrics are missing the process status metric (openshift_ptp_process_status) for the ptp4l process.

      Version-Release number of selected component (if applicable):

      OCP 4.20.0-rc.3 + ptp_operator 4.20.0-202509080953
      OCP 4.20.0-ec.4 + ptp_operator 4.20.0-202507221345
      
      OCP 4.19.14 + ptp_operator 4.19.0-202509230113
      OCP 4.19.12 + ptp_operator 4.19.0-202509111607
      OCP 4.19.5 + ptp_operator 4.19.0-202507232110
      
      OCP 4.18.21 + ptp_operator 4.18.0-202507211933
      
      

      How reproducible:

      Intermittent; occurs at random.

      Steps to Reproduce:

          1. Deploy the spoke cluster
          2. Run the test_ptp.sh script --> https://gitlab.cee.redhat.com/ran/ran-integration/-/blob/master/scripts/test_ptp.sh?ref_type=heads 
      
          

      Actual results:

      The linuxptp-daemon pod does not expose the ptp4l process status metric. Neither openshift_ptp_process_status nor openshift_ptp_process_restart_count appears in the output:
      [kni@registry.kni-qe-23 ~]$ oc -n openshift-ptp exec linuxptp-daemon-rzsc5 -- curl http://localhost:9091/metrics | grep ptp4l
      Defaulted container "cloud-event-proxy" out of: cloud-event-proxy, kube-rbac-proxy, linuxptp-daemon-container
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  4741    0  4741    0     0  4629k      0 --:--:-- --:--:-- --:--:-- 4629k
      openshift_ptp_clock_class{node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 6
      openshift_ptp_clock_state{iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_delay_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 360
      openshift_ptp_frequency_adjustment_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 11
      openshift_ptp_interface_role{iface="ens1f0",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f1",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="ens1f2",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f3",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_max_offset_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2.71339e+06
      openshift_ptp_offset_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1

      Expected results:

      The expected result is obtained after restarting the pod. The output should look like this:
      [kni@registry.kni-qe-23 ~]$ oc -n openshift-ptp exec linuxptp-daemon-4vdvs -- curl http://localhost:9091/metrics | grep ptp4l
      Defaulted container "cloud-event-proxy" out of: cloud-event-proxy, kube-rbac-proxy, linuxptp-daemon-container
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  5391    0  5391    0     0  5264k      0 --:--:-- --:--:-- --:--:-- 5264k
      openshift_ptp_clock_class{node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 6
      openshift_ptp_clock_state{iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_delay_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 366
      openshift_ptp_frequency_adjustment_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 7
      openshift_ptp_interface_role{iface="ens1f0",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f1",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="ens1f2",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f3",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_max_offset_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 116
      openshift_ptp_offset_ns{from="master",iface="ens1fx",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_process_restart_count{config="ptp4l.0.config",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="phc2sys"} 1
      openshift_ptp_process_restart_count{config="ptp4l.0.config",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_process_status{config="ptp4l.0.config",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="phc2sys"} 1
      openshift_ptp_process_status{config="ptp4l.0.config",node="sno.kni-qe-67.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
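
      The difference between the two outputs above can be checked mechanically. The following is a minimal sketch, not part of test_ptp.sh or the PTP operator: has_process_status is a hypothetical helper name, <pod-name> is a placeholder, and curl -s is added only to suppress the progress meter. It matches on the metric name and the process label, because a plain grep ptp4l also matches phc2sys samples whose config label is ptp4l.0.config.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not an existing tool).
# Reads Prometheus text-format metrics on stdin and returns 0 only if an
# openshift_ptp_process_status sample with process="<name>" is present.
has_process_status() {
  local proc="$1"
  grep -Eq "^openshift_ptp_process_status[{][^}]*process=\"${proc}\"[^}]*[}] "
}

# Example usage against a running pod (replace <pod-name>):
#   oc -n openshift-ptp exec <pod-name> -- curl -s http://localhost:9091/metrics \
#     | has_process_status ptp4l || echo "ptp4l process status metric is missing"
```

      The helper only reports presence or absence of the sample; it does not inspect the metric value.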

      Additional info:

      I have encountered this problem at random across different environments, architectures (aarch64 and x86_64), and combinations of OCP and ptp_operator versions.
      
      This is the must-gather from the latest deployment where I saw this happen:
      
      https://drive.google.com/file/d/1XYR5kI0k2ccExnHS7g7Hg20fs5nvc2_2/view?usp=sharing    

              josricha@redhat.com Joseph Richard
              amendiol@redhat.com Alaitz Mendiola
              Yang Liu