OpenShift Bugs / OCPBUGS-10387

Infra is not usually labeled in capacity_cpu_core

Details

    • Bug
    • Resolution: Done-Errata
    • Critical
    • 4.14.0
    • 4.12
    • Monitoring
    • Important
    • No
    • MON Sprint 237
    • 1
    • Rejected
    • False
    • * Before this update, for the `cluster:capacity_cpu_cores:sum` metric, nodes with the `infra` role but not the `master` role were not assigned a value of `infra` for the `label_node_role_kubernetes_io` label. With this update, nodes with the `infra` role but not the `master` role are correctly labeled as `infra` for this metric. link:https://issues.redhat.com/browse/OCPBUGS-10387[OCPBUGS-10387]
    • Bug Fix
    • Done

    Description

      Description of problem:

      In the metric `cluster:capacity_cpu_cores:sum`, the label `label_node_role_kubernetes_io` takes the value `infra` or `master`. There is no value for `worker`. If the infra nodes are missing this label, their cores get added into the "unlabeled" worker nodes.
      
      For example:
      This cluster has all three types: `cluster:capacity_cpu_cores:sum{_id="0702a3b1-c2d8-427f-865d-3ce7dc3a2be7"}`
      
      But this cluster has infra and worker merged: `cluster:capacity_cpu_cores:sum{_id="0e60ac76-d61a-4e6d-a4f3-269110b6b1f9"}`
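
A minimal sketch of the merging described above, assuming that a series without `label_node_role_kubernetes_io` is read as worker capacity. The sample series, the core counts, and the helper `cores_by_role` are hypothetical and are not taken from either cluster in the report.

```python
from collections import defaultdict

ROLE_LABEL = "label_node_role_kubernetes_io"


def cores_by_role(samples):
    """Sum CPU cores per role; a series with no role label is counted as 'worker'."""
    totals = defaultdict(int)
    for labels, cores in samples:
        role = labels.get(ROLE_LABEL, "worker")
        totals[role] += cores
    return dict(totals)


# Hypothetical cluster where infra capacity carries the infra label.
correctly_labeled = [
    ({ROLE_LABEL: "master"}, 12),
    ({ROLE_LABEL: "infra"}, 24),
    ({}, 48),  # unlabeled series: workers
]

# Same hypothetical cluster, but the infra cores have no role label,
# so they end up in the unlabeled series together with the workers.
missing_infra_label = [
    ({ROLE_LABEL: "master"}, 12),
    ({}, 72),  # 48 worker cores + 24 infra cores
]

print(cores_by_role(correctly_labeled))
print(cores_by_role(missing_infra_label))
```

In the second case the total is unchanged, but the infra share can no longer be separated from the worker share.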
      
      
      If I count clusters that have sockets with infra but capacity_cpu without infra, I get 7,617 clusters for 2023-03-15.
      
      If I count clusters that have sockets with infra and capacity_cpu with infra, I get 2,015 clusters for 2023-03-15.
      
      That means that there are 5,602 clusters (7,617 − 2,015) that are missing the infra label.
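
A rough sketch of how such a comparison could be expressed as a set difference. It assumes that the cluster IDs reporting infra in the sockets metric and those reporting infra in `cluster:capacity_cpu_cores:sum` are available as two collections. The cluster IDs and the helper `clusters_missing_infra` are hypothetical. The last line only checks the subtraction quoted in the report.

```python
def clusters_missing_infra(sockets_infra_ids, capacity_infra_ids):
    """Return cluster IDs that report infra sockets but no infra CPU capacity series."""
    return set(sockets_infra_ids) - set(capacity_infra_ids)


# Hypothetical cluster IDs, for illustration only.
sockets_infra = {"cluster-a", "cluster-b", "cluster-c"}
capacity_infra = {"cluster-a"}

print(sorted(clusters_missing_infra(sockets_infra, capacity_infra)))

# Arithmetic from the report for 2023-03-15.
assert 7617 - 2015 == 5602
```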
      
      This metric is used to identify the vCPU/CPU count that is used in TeleSense. This is presented to the Sales teams and upper management. If there is another metric we should use, please let me know. Otherwise, this needs to be fixed. 
      
      

      Version-Release number of selected component (if applicable):

       

      How reproducible:

       

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

      Refer to the Slack thread: https://redhat-internal.slack.com/archives/C0VMT03S5/p1678967355450719

            People

              jfajersk@redhat.com Jan Fajerski
              josh-6 Josh Wilson
              Junqi Zhao
              Brian Burt
              Votes: 1
              Watchers: 15