  1. OCP Technical Release Team
  2. TRT-987

Investigate worker CPU metrics in Prometheus

    • Type: Story
    • Resolution: Done

      Following the fallout from OCPBUGS-11591, I am wondering whether we can identify the worker CPU metrics, load up some of the bad runs linked in that Jira, and check what the worker CPU usage was on a node that was going bad. Bad nodes can be identified by looking for the message "Unreasonably long xxxxxxms poll interval".
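
      A minimal sketch of how that message could be searched for, assuming the node logs from a run are available as plain text lines. The function name `find_long_polls` is hypothetical, and the pattern assumes the "xxxxxx" placeholder stands for a run of digits:

```python
import re
from typing import Iterable, List, Tuple

# Pattern built from the message quoted in this issue; the digits stand in
# for the "xxxxxx" placeholder (an assumption about the exact log format).
POLL_RE = re.compile(r"Unreasonably long (\d+)ms poll interval")


def find_long_polls(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (line_number, interval_ms) for every line containing the message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = POLL_RE.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits
```

      Running this over each node's log would give a rough list of candidate bad nodes, along with how long each reported poll interval was.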

      Then compare those metrics with what they look like now on good runs from this job: https://sippy.dptools.openshift.org/sippy-ng/jobs/4.14/runs?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22name%22%2C%22operatorValue%22%3A%22equals%22%2C%22value%22%3A%22periodic-ci-openshift-release-master-ci-4.14-e2e-gcp-ovn-upgrade%22%7D%5D%7D&sortField=timestamp&sort=desc
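
      One possible way to get a comparable number for good and bad runs is to build the same PromQL expression for each node and evaluate it against each run's Prometheus data. This is only a sketch: `worker_cpu_query` is a hypothetical helper, `node_cpu_seconds_total` is the standard node_exporter CPU counter, and matching on the `instance` label is an assumption about how these jobs label their metrics:

```python
def worker_cpu_query(node: str, window: str = "5m") -> str:
    """Build a PromQL expression for the busy-CPU fraction of one node."""
    # 1 minus the average idle rate across the node's CPUs gives the fraction
    # of CPU time spent busy. The "instance" label match is an assumption.
    return (
        "1 - avg by (instance) ("
        f'rate(node_cpu_seconds_total{{mode="idle",instance="{node}"}}[{window}])'
        ")"
    )
```

      The resulting string can be pasted into the Prometheus query UI for a bad run and for a good run, so that both are charted with the same expression.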

        1. image-2023-05-08-10-17-34-414.png
          130 kB
          Justin Pierce
        2. Screen Shot 2023-05-04 at 8.31.20 AM.png
          466 kB
          Dennis Periquet
        3. Screen Shot 2023-05-09 at 10.09.06 AM.png
          134 kB
          Dennis Periquet

              dperique@redhat.com Dennis Periquet
              rhn-engineering-dgoodwin Devan Goodwin