OpenShift Bugs / OCPBUGS-36235

PTP metrics show slave interfaces as master when sync quality is low.


    • 8/16: Reported by QE; waiting for a response from QE. Looks like a PTP switch configuration issue.

      Description of problem:

      Metrics show slave interfaces as master when sync quality is low.    

      Version-Release number of selected component (if applicable):

          ptp-operator.v4.16.0-202406200537

      How reproducible:

      100%    

      Steps to Reproduce:

          1. Deploy 4.16 SNO with ptp-operator and BC/OC ptpconfigs
          2. Check metrics for role and clock_state
          

      Actual results:

      The boundary ptpconfig has two slave interfaces configured (masterOnly 0), ens1f3 and ens2f3:

      [kni@registry.kni-qe-73 ~]$ oc get ptpconfigs.ptp.openshift.io -n openshift-ptp boundary -o yaml 
      [...]
      spec:
        profile:
        - name: bc2
          phc2sysOpts: ""
          ptp4lConf: |
            # The interface name is hardware-specific
            [ens2f3]
            masterOnly 0
            [ens2f1]
            masterOnly 1
            [ens2f2]
            masterOnly 1
            [ens2f0]
            masterOnly 1
      [...]
        - name: bc1
          phc2sysOpts: ""
          ptp4lConf: |
            # The interface name is hardware-specific
            [ens1f3]
            masterOnly 0
            [ens1f1]
            masterOnly 1
            [ens1f2]
            masterOnly 1
            [ens1f0]
            masterOnly 1
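
The interfaces expected to act as slaves can be read straight from the ptp4lConf text. The following is a minimal sketch, not part of the operator: the helper name slave_capable_ports is hypothetical, and it assumes only the section and option layout shown in the profile above.

```python
def slave_capable_ports(ptp4l_conf: str) -> list[str]:
    """Return the interface sections whose masterOnly option is 0."""
    ports = []
    current = None
    for raw in ptp4l_conf.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # new section, e.g. [ens1f3]
            continue
        parts = line.split()
        if (current and current != "global"
                and parts == ["masterOnly", "0"]):
            ports.append(current)
    return ports


# Profile bc1 as shown in the ptpconfig output.
BC1 = """\
# The interface name is hardware-specific
[ens1f3]
masterOnly 0
[ens1f1]
masterOnly 1
[ens1f2]
masterOnly 1
[ens1f0]
masterOnly 1
"""

print(slave_capable_ports(BC1))  # ['ens1f3']
```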
            
      

      The metrics show these interfaces as master (role 2):

      [kni@registry.kni-qe-73 ~]$ oc exec -it ds/linuxptp-daemon  -n openshift-ptp -c linuxptp-daemon-container -- curl -s localhost:9091/metrics | grep role
      # HELP openshift_ptp_interface_role 0 = PASSIVE, 1 = SLAVE, 2 = MASTER, 3 = FAULTY, 4 = UNKNOWN, 5 = LISTENING
      # TYPE openshift_ptp_interface_role gauge
      openshift_ptp_interface_role{iface="eno12399np0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="eno12409np1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="ens1f0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens1f2",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens1f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens2f0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens2f1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens2f2",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens2f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
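
The mismatch above can be checked mechanically. This is a rough sketch that assumes the metrics text has already been captured with the curl command shown; the function names parse_roles and mismatched are hypothetical, and the role codes come from the metric's HELP line.

```python
import re

# Role codes as listed in the HELP line of openshift_ptp_interface_role.
ROLE_NAMES = {0: "PASSIVE", 1: "SLAVE", 2: "MASTER",
              3: "FAULTY", 4: "UNKNOWN", 5: "LISTENING"}
ROLE_LINE = re.compile(r'^openshift_ptp_interface_role\{([^}]*)\}\s+(\d+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_roles(metrics_text):
    """Map interface name -> numeric role from Prometheus text output."""
    roles = {}
    for line in metrics_text.splitlines():
        match = ROLE_LINE.match(line.strip())
        if match:
            labels = dict(LABEL.findall(match.group(1)))
            roles[labels["iface"]] = int(match.group(2))
    return roles


def mismatched(roles, expected_slaves):
    """Return expected-slave interfaces that do not report role 1 (SLAVE)."""
    return {iface: ROLE_NAMES.get(roles.get(iface), "MISSING")
            for iface in expected_slaves if roles.get(iface) != 1}


# Three lines copied from the metrics output in this report.
SAMPLE = '''\
# TYPE openshift_ptp_interface_role gauge
openshift_ptp_interface_role{iface="ens1f0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
openshift_ptp_interface_role{iface="ens1f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
openshift_ptp_interface_role{iface="ens2f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
'''

print(mismatched(parse_roles(SAMPLE), ["ens1f3", "ens2f3"]))
# {'ens1f3': 'MASTER', 'ens2f3': 'MASTER'}
```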
      

      No clock_state is shown for these interfaces.
      If I change the clock_class_threshold from 135 to 248, the interfaces do show up as slaves and clock_state is reported:

      [kni@registry.kni-qe-73 ~]$ oc get ptpconfigs.ptp.openshift.io -n openshift-ptp boundary -o yaml | grep -i threshold
            clock_class_threshold 248
            step_threshold 2.0
            first_step_threshold 0.00002
            clock_class_threshold 248
            step_threshold 2.0
            first_step_threshold 0.00002

      # TYPE openshift_ptp_interface_role gauge
      openshift_ptp_interface_role{iface="eno12399np0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="eno12409np1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="ens1f0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens1f1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens1f2",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens1f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_interface_role{iface="ens2f0",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 2
      openshift_ptp_interface_role{iface="ens2f1",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens2f2",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 3
      openshift_ptp_interface_role{iface="ens2f3",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      # TYPE openshift_ptp_clock_state gauge
      openshift_ptp_clock_state{iface="CLOCK_REALTIME",node="helix66.lab.eng.rdu2.redhat.com",process="phc2sys"} 1
      openshift_ptp_clock_state{iface="eno12399npx",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_clock_state{iface="eno12409npx",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_clock_state{iface="ens1fx",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
      openshift_ptp_clock_state{iface="ens2fx",node="helix66.lab.eng.rdu2.redhat.com",process="ptp4l"} 1
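
The effect of the threshold change is consistent with ptp4l rejecting a master whose advertised clockClass is greater than clock_class_threshold, which is what the "greater than configured" log message in this report suggests. The sketch below illustrates that comparison only; it is not the ptp4l implementation, and the received clockClass of 248 is a hypothetical value because the report does not state what the switch actually advertised.

```python
def accepts_master(received_clock_class: int, clock_class_threshold: int) -> bool:
    """Assumed rule: a master is ignored only when its clockClass
    is strictly greater than the configured threshold."""
    return received_clock_class <= clock_class_threshold


RECEIVED = 248  # hypothetical; the actual value is not given in the report

for threshold in (135, 248):
    verdict = "accepted" if accepts_master(RECEIVED, threshold) else "ignored"
    print(f"threshold={threshold}: master {verdict}")
# threshold=135: master ignored
# threshold=248: master accepted
```

Under this assumption, any received clockClass above 135 and at most 248 would reproduce both observations.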
      

      Expected results:

      Interfaces configured as slave should always show slave role in metrics.    

      Additional info:
      linuxptp-daemon log errors:

      [kni@registry.kni-qe-73 ~]$ oc logs -n openshift-ptp linuxptp-daemon-ls5fj linuxptp-daemon-container --since=5m | grep ens1f3
      ptp4l[10311.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10371.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10431.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10491.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10551.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
      
      [kni@registry.kni-qe-73 ~]$ oc logs -n openshift-ptp linuxptp-daemon-ls5fj linuxptp-daemon-container --since=5m | grep ens2f3
      ptp4l[10311.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10371.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10431.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10491.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
      ptp4l[10551.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
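
To see which ports are rejecting their master, the daemon log can be summarized per interface. This is a small sketch that assumes the log text was saved from the oc logs command above; the function name ignored_master_counts is hypothetical and the pattern matches only the message format seen in this report.

```python
import re
from collections import Counter

IGNORED = re.compile(
    r"port \d+ \((?P<iface>[^)]+)\): "
    r"Master clock quality received is greater than configured"
)


def ignored_master_counts(log_text):
    """Count 'ignoring master' messages per interface."""
    return Counter(m.group("iface") for m in IGNORED.finditer(log_text))


# Two lines copied from the log excerpts in this report.
LOG = """\
ptp4l[10311.003]: [ptp4l.0.config:3] port 1 (ens1f3): Master clock quality received is greater than configured, ignoring master!
ptp4l[10311.003]: [ptp4l.1.config:3] port 1 (ens2f3): Master clock quality received is greater than configured, ignoring master!
"""

print(dict(ignored_master_counts(LOG)))
# {'ens1f3': 1, 'ens2f3': 1}
```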
      

            aputtur@redhat.com Aneesh Puttur
            bblock@redhat.com Bonnie Block
            Bonnie Block Bonnie Block