OpenShift Bugs / OCPBUGS-62719

T-TSC/T-BC Holdover: Not all clock state metrics degrade after losing upstream clock

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.20
    • Networking / ptp
    • None
    • Quality / Stability / Reliability
    • False
    • None
    • Important
    • None
    • x86_64
    • 10/30 - Waiting on QE to verify.
    • None
    • Agent Sprint 277, CNF RAN Sprint 278, CNF RAN Sprint 279
    • 3
    • Proposed
    • Known Issue
    • Release note text:
      *Cause*: The T-BC state changes to "unlocked" after the holdover budget is exhausted.
      *Consequence*: The ptp4l process metric of the TR port shows "locked" while the T-BC state is "unlocked".
      *Fix*: Add ptp4l metric alignment functionality.
      *Result*: The ptp4l process metric of the TR port is aligned with the T-BC state.

      Proposed text:
      Due to a bug in the clock degradation logic, the loss of the upstream clock connection does not trigger the degradation of all clock state metrics. Consequently, some clock state metrics do not degrade as expected after the upstream clock is lost, leading to inconsistent time synchronization. To work around this issue, disconnect and reconnect the upstream clock. Clock metrics might not degrade as expected during disconnection, but should recover after reconnection.
    • None
    • None
    • None
    • None
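
      The release note above describes the fix only as "ptp4l metric alignment". As a rough illustration of what such a rule could look like, here is a minimal Python sketch. It is not the actual linuxptp-daemon implementation: the function name and the rule itself are hypothetical, and it assumes that "unlocked" corresponds to FREERUN (0) in the openshift_ptp_clock_state metric.

```python
from enum import IntEnum


class ClockState(IntEnum):
    # Values taken from the openshift_ptp_clock_state HELP text.
    FREERUN = 0
    LOCKED = 1
    HOLDOVER = 2


def aligned_ptp4l_state(tbc_state, ptp4l_state):
    """Return the state to publish for the TR port's ptp4l metric.

    Hypothetical rule: if the T-BC is no longer LOCKED but ptp4l still
    reports LOCKED, publish the T-BC state instead so the two agree.
    Otherwise keep the state ptp4l reported.
    """
    if tbc_state != ClockState.LOCKED and ptp4l_state == ClockState.LOCKED:
        return tbc_state
    return ptp4l_state
```

      With this rule, a T-BC in FREERUN and a ptp4l metric still showing LOCKED would be published as FREERUN, which matches the expected results in this report.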

      Description of problem:

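          Not all clock state metrics degrade after losing the upstream clock. On ens2fx, the T-BC and dpll metrics drop to FREERUN (0), but the ptp4l and ts2phc metrics still report LOCKED (1).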

      Version-Release number of selected component (if applicable):

          4.20.0-202510021807

      How reproducible:

          100%

      Steps to Reproduce:

          1. Lose the connection to the upstream clock: ip link set ens2f3 down
          2. Wait for the clock to degrade, then get the metrics: watch 'oc -n openshift-ptp exec ds/linuxptp-daemon -c cloud-event-proxy -- curl -s localhost:9091/metrics | grep clock'
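
      Step 2 can also be scripted instead of watched by hand. The following is a minimal Python sketch, not part of the original report. It assumes `oc` is on the PATH and logged in to the cluster, reuses the namespace, daemon set, container, and port from the command above, and filters the output the way `grep clock` does. The function names and the injectable `run` parameter are hypothetical.

```python
import subprocess

# Same command as step 2, without the `watch` wrapper and the `grep` filter.
METRICS_CMD = [
    "oc", "-n", "openshift-ptp", "exec", "ds/linuxptp-daemon",
    "-c", "cloud-event-proxy", "--",
    "curl", "-s", "localhost:9091/metrics",
]


def clock_lines(metrics_text):
    """Keep only the lines that contain 'clock', like `grep clock`."""
    return [line for line in metrics_text.splitlines() if "clock" in line]


def fetch_clock_metrics(run=subprocess.run):
    """Run the metrics command once and return the clock-related lines.

    `run` is injectable so the function can be exercised without a cluster.
    """
    result = run(METRICS_CMD, capture_output=True, text=True, check=True)
    return clock_lines(result.stdout)
```

      Calling `fetch_clock_metrics()` in a loop with a short sleep approximates what `watch` does in the original command.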

      Actual results:

      # HELP openshift_ptp_clock_class 6 = Locked, 7 = PRC unlocked in-spec, 52/187 = PRC unlocked out-of-spec, 135 = T-BC holdover in-spec, 165 = T-BC holdover out-of-spec, 248 = Default, 255 = Slave Only Clock
      # TYPE openshift_ptp_clock_class gauge
      openshift_ptp_clock_class{node="helix65.lab.eng.rdu2.redhat.com",process="ptp4l"} 248
      # HELP openshift_ptp_clock_state 0 = FREERUN, 1 = LOCKED, 2 = HOLDOVER
      # TYPE openshift_ptp_clock_state gauge
      openshift_ptp_clock_state{iface="CLOCK_REALTIME",node="helix65.lab.eng.rdu2.redhat.com",process="phc2sys"} 1
      openshift_ptp_clock_state{iface="ens1fx",node="helix65.lab.eng.rdu2.redhat.com",process="dpll"} 1
      openshift_ptp_clock_state{iface="ens1fx",node="helix65.lab.eng.rdu2.redhat.com",process="ts2phc"} 1
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="T-BC"} 0
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="dpll"} 0
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="ts2phc"} 1    
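
      In the output above, the processes on ens2fx disagree: T-BC and dpll report 0 while ptp4l and ts2phc report 1. A mismatch like this can be detected mechanically. The sketch below is a heuristic written for this report, not an existing tool: it groups the openshift_ptp_clock_state samples by interface and flags any interface whose processes report more than one distinct state. The function names are hypothetical, and the parser only handles integer sample values as shown in this listing.

```python
import re
from collections import defaultdict

# Matches e.g. openshift_ptp_clock_state{iface="ens2fx",node="...",process="ptp4l"} 1
STATE_LINE = re.compile(
    r'^openshift_ptp_clock_state\{(?P<labels>[^}]*)\}\s+(?P<value>\d+)\s*$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def states_by_iface(metrics_text):
    """Return {iface: {process: state}} for openshift_ptp_clock_state samples."""
    grouped = defaultdict(dict)
    for line in metrics_text.splitlines():
        match = STATE_LINE.match(line.strip())
        if not match:
            continue  # skip HELP/TYPE comments and other metrics
        labels = dict(LABEL.findall(match.group("labels")))
        if "iface" not in labels or "process" not in labels:
            continue  # not a per-interface, per-process sample
        grouped[labels["iface"]][labels["process"]] = int(match.group("value"))
    return dict(grouped)


def inconsistent_ifaces(metrics_text):
    """Return the interfaces whose processes report more than one state."""
    return {
        iface: procs
        for iface, procs in states_by_iface(metrics_text).items()
        if len(set(procs.values())) > 1
    }
```

      Applied to the actual results, only ens2fx is flagged. Disagreement between processes is not always a defect, so the result is a prompt for inspection rather than a verdict.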

      Expected results:

      # HELP openshift_ptp_clock_class 6 = Locked, 7 = PRC unlocked in-spec, 52/187 = PRC unlocked out-of-spec, 135 = T-BC holdover in-spec, 165 = T-BC holdover out-of-spec, 248 = Default, 255 = Slave Only Clock
      # TYPE openshift_ptp_clock_class gauge
      openshift_ptp_clock_class{node="helix65.lab.eng.rdu2.redhat.com",process="ptp4l"} 248
      # HELP openshift_ptp_clock_state 0 = FREERUN, 1 = LOCKED, 2 = HOLDOVER
      # TYPE openshift_ptp_clock_state gauge
      openshift_ptp_clock_state{iface="CLOCK_REALTIME",node="helix65.lab.eng.rdu2.redhat.com",process="phc2sys"} 1
      openshift_ptp_clock_state{iface="ens1fx",node="helix65.lab.eng.rdu2.redhat.com",process="dpll"} 1
      openshift_ptp_clock_state{iface="ens1fx",node="helix65.lab.eng.rdu2.redhat.com",process="ts2phc"} 1
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="T-BC"} 0
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="dpll"} 0
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="ptp4l"} 0
      openshift_ptp_clock_state{iface="ens2fx",node="helix65.lab.eng.rdu2.redhat.com",process="ts2phc"} 0
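
      Comparing the two listings, only the ptp4l and ts2phc series on ens2fx differ: both are 1 in the actual results and 0 in the expected results. For verification, the same comparison can be automated. The following minimal Python sketch, with hypothetical function names, treats each line as a series (metric name plus labels) followed by a value and reports the series whose values differ.

```python
def parse_samples(text):
    """Map each series (metric name plus labels) to its value string."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # not a "series value" pair
        series, value = parts
        samples[series] = value
    return samples


def diff_samples(actual_text, expected_text):
    """Return {series: (actual, expected)} for series whose values differ."""
    actual = parse_samples(actual_text)
    expected = parse_samples(expected_text)
    return {
        series: (actual.get(series), expected[series])
        for series in expected
        if actual.get(series) != expected[series]
    }
```

      An empty result from `diff_samples` would indicate that the collected metrics match the expected listing.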

      Additional info:

          

              vgrinber@redhat.com Vitaly Grinberg
              rh-ee-dpopsuev Daniel Popsuevich
              None
              None
              Daniel Popsuevich
              Lluis Cavalle
              Votes: 0
              Watchers: 8
