OCPBUGS-12361

PTP metrics - Unexpected metrics for old phc2sys appear in metrics after modifying ptpconfigs

      Description of problem:

      Outdated phc2sys metrics remain in the ptp metrics after ptpconfigs are modified. A ptpconfig change can cause phc2sys to be associated with a different ptp4l config.
      
      When that happens, both the old and the new phc2sys metrics are reported, and the old ones show the process status as down.
      
      The metrics for the old phc2sys should not appear in the ptp metrics.

      Version-Release number of selected component (if applicable):

      4.12

      How reproducible:

      100% (after phc2sys moves from one ptp4l config to another)

      Steps to Reproduce:

      1. Configure at least 2 ptp profiles
      2. Modify ptpconfigs to change ptp thresholds
      3. Observe that the phc2sys process is restarted and moves from one ptp4l config (ptp4l.2.config) to another (ptp4l.1.config).
      4. Check ptp metrics
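
      Step 4 can be partly automated. The following is a minimal sketch, not part of the daemon or any official tooling: it assumes the metrics text has already been fetched from the node and uses a simple heuristic (a phc2sys series reported as DOWN while another phc2sys series on the same node is UP) to flag likely stale entries. The function names are hypothetical.

```python
import re
from collections import defaultdict

# One labelled sample line in Prometheus text format: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, value) for each labelled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m:
            labels = dict(LABEL_RE.findall(m.group("labels")))
            yield m.group("name"), labels, float(m.group("value"))


def stale_phc2sys_configs(text):
    """Return sorted (node, config) pairs whose phc2sys status is DOWN (0)
    while another phc2sys series on the same node is UP (1).
    This is a heuristic chosen for this sketch, not the daemon's own logic."""
    status = defaultdict(dict)  # node -> {config: status value}
    for name, labels, value in parse_samples(text):
        if name == "openshift_ptp_process_status" and labels.get("process") == "phc2sys":
            status[labels.get("node", "")][labels.get("config", "")] = value
    stale = []
    for node, by_config in status.items():
        if any(v == 1 for v in by_config.values()):
            stale.extend((node, cfg) for cfg, v in by_config.items() if v == 0)
    return sorted(stale)


# Sample lines taken from the metrics output in this report.
SAMPLE = """\
# TYPE openshift_ptp_process_status gauge
openshift_ptp_process_status{config="ptp4l.1.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1
openshift_ptp_process_status{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 0
openshift_ptp_process_status{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
"""

if __name__ == "__main__":
    for node, config in stale_phc2sys_configs(SAMPLE):
        print(f"stale phc2sys series: node={node} config={config}")
```

      A phc2sys process that is genuinely down with no replacement running is not flagged by this heuristic, since no UP series exists to compare against.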

      Actual results:

      The old phc2sys metrics associated with ptp4l.2.config remain in the ptp metrics.

      # HELP openshift_ptp_process_restart_count
      # TYPE openshift_ptp_process_restart_count counter
      openshift_ptp_process_restart_count{config="ptp4l.0.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 2
      openshift_ptp_process_restart_count{config="ptp4l.1.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1
      openshift_ptp_process_restart_count{config="ptp4l.1.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 2
      openshift_ptp_process_restart_count{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1
      openshift_ptp_process_restart_count{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 2
      openshift_ptp_process_restart_count{config="ptp4l.3.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 2
      # HELP openshift_ptp_process_status 0 = DOWN, 1 = UP
      # TYPE openshift_ptp_process_status gauge
      openshift_ptp_process_status{config="ptp4l.0.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
      openshift_ptp_process_status{config="ptp4l.1.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 1
      openshift_ptp_process_status{config="ptp4l.1.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
      openshift_ptp_process_status{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="phc2sys"} 0
      openshift_ptp_process_status{config="ptp4l.2.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
      openshift_ptp_process_status{config="ptp4l.3.config",node="helix49.ptp.lab.eng.bos.redhat.com",process="ptp4l"} 1
      # HELP openshift_ptp_threshold 
      # TYPE openshift_ptp_threshold gauge
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="bc1",threshold="HoldOverTimeout"} 5
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="bc1",threshold="MaxOffsetThreshold"} 100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="bc1",threshold="MinOffsetThreshold"} -100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave1",threshold="HoldOverTimeout"} 5
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave1",threshold="MaxOffsetThreshold"} 100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave1",threshold="MinOffsetThreshold"} -100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave2",threshold="HoldOverTimeout"} 5
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave2",threshold="MaxOffsetThreshold"} 100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave2",threshold="MinOffsetThreshold"} -100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave3",threshold="HoldOverTimeout"} 5
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave3",threshold="MaxOffsetThreshold"} 100
      openshift_ptp_threshold{node="helix49.ptp.lab.eng.bos.redhat.com",profile="slave3",threshold="MinOffsetThreshold"} -100
      # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
      # TYPE promhttp_metric_handler_requests_in_flight gauge
      promhttp_metric_handler_requests_in_flight 1
      # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
      # TYPE promhttp_metric_handler_requests_total counter
      promhttp_metric_handler_requests_total{code="200"} 36
      promhttp_metric_handler_requests_total{code="500"} 0
      promhttp_metric_handler_requests_total{code="503"} 0
      [root@helix49 /]# 
      [root@helix49 /]# ps -ef | grep phc
      root       75182   10104  0 00:03 ?        00:00:00 /usr/sbin/phc2sys -s eno12399 -w -r -m -n 24 -N 8 -R 16 -z /var/run/ptp4l.1.socket -t [ptp4l.1.config]
      root      115217  109441  0 00:09 pts/0    00:00:00 grep --color=auto phc
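
      The ps output above can be cross-checked against the metrics. The sketch below is illustrative only: it assumes each phc2sys command line carries a "-t [<config>]" tag as in the output above, and the helper names are hypothetical. It reports metric configs that have no matching running phc2sys process.

```python
import re

# Assumption: the running phc2sys command line ends with "-t [<config>]",
# as seen in the ps output in this report.
TAG_RE = re.compile(r'/phc2sys\s.*-t \[([^\]]+)\]')


def running_phc2sys_configs(ps_output):
    """Return the set of config tags found on running phc2sys command lines."""
    configs = set()
    for line in ps_output.splitlines():
        m = TAG_RE.search(line)
        if m:
            configs.add(m.group(1))
    return configs


def unexpected_configs(metric_configs, ps_output):
    """Return metric configs (sorted) with no running phc2sys process."""
    return sorted(set(metric_configs) - running_phc2sys_configs(ps_output))


# ps output copied from this report.
PS_OUTPUT = """\
root       75182   10104  0 00:03 ?        00:00:00 /usr/sbin/phc2sys -s eno12399 -w -r -m -n 24 -N 8 -R 16 -z /var/run/ptp4l.1.socket -t [ptp4l.1.config]
root      115217  109441  0 00:09 pts/0    00:00:00 grep --color=auto phc
"""

if __name__ == "__main__":
    # Configs that carry a process="phc2sys" label in the metrics above.
    in_metrics = ["ptp4l.1.config", "ptp4l.2.config"]
    print(unexpected_configs(in_metrics, PS_OUTPUT))
```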
       

      Expected results:

      The old phc2sys metrics associated with ptp4l.2.config should not appear in the ptp metrics.

      Additional info:

      Workaround: restarting the ptp linux daemon pod returns the metrics to a clean state.

              aputtur@redhat.com Aneesh Puttur
              rhn-support-yliu1 Yang Liu
              Ofer Bochan Ofer Bochan