OCPBUGS-33581

Infra is not usually labeled in capacity_cpu_core

    • Bug
    • Resolution: Done-Errata
    • Critical
    • 4.13.z
    • 4.12
    • Monitoring
    • Important
    • No
    • MON Sprint 253
    • 1
    • False
    • Before this update, for the `cluster:capacity_cpu_cores:sum` metric, nodes with the `infra` role but not the `master` role were not assigned a value of `infra` for the `label_node_role_kubernetes_io` label. With this update, nodes with the `infra` role but not the `master` role are now correctly labeled as `infra` for this metric. link:https://issues.redhat.com/browse/OCPBUGS-33581[OCPBUGS-33581]
    • Bug Fix
    • In Progress

      This is a clone of issue OCPBUGS-10387. The following is the description of the original issue:

      Description of problem:

      The metric `cluster:capacity_cpu_cores:sum` carries a label, `label_node_role_kubernetes_io`, whose value is `infra` or `master`. There is no value for `worker`, so worker nodes appear without the label. If infra nodes are missing this label, they are counted together with the unlabeled worker nodes.
      
      For example:
      This cluster reports all three types: `cluster:capacity_cpu_cores:sum{_id="0702a3b1-c2d8-427f-865d-3ce7dc3a2be7"}`

      But this cluster has infra and worker merged: `cluster:capacity_cpu_cores:sum{_id="0e60ac76-d61a-4e6d-a4f3-269110b6b1f9"}`
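
The grouping described above can be sketched in Python. This is only an illustration of the intended labeling, not the actual recording rule: the node records, the field names `roles` and `cpu_cores`, and the helper functions are all hypothetical. It assumes `master` takes precedence over `infra`, and that plain workers stay unlabeled (represented here as `None`).

```python
from collections import defaultdict


def role_label(roles):
    """Return the label value for a node, or None when the node stays unlabeled."""
    if "master" in roles:
        return "master"
    if "infra" in roles:
        # Infra but not master: this is the case the fix is meant to cover.
        return "infra"
    return None  # plain workers carry no value for this label


def capacity_by_role(nodes):
    """Sum CPU cores per label value, mimicking a sum grouped by role."""
    totals = defaultdict(int)
    for node in nodes:
        totals[role_label(node["roles"])] += node["cpu_cores"]
    return dict(totals)


# Hypothetical nodes, for illustration only.
nodes = [
    {"name": "m0", "roles": {"master"}, "cpu_cores": 8},
    {"name": "i0", "roles": {"infra", "worker"}, "cpu_cores": 16},
    {"name": "w0", "roles": {"worker"}, "cpu_cores": 32},
]
totals = capacity_by_role(nodes)
print(totals)
```

If the `infra` branch were skipped, the 16 cores of `i0` would fall into the unlabeled bucket alongside `w0`, which is the merged behavior reported for the second cluster.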
      
      
      If I count clusters that have sockets with infra but capacity_cpu without infra, I get 7,617 clusters for 2023-03-15.
      
      If I count clusters that have sockets with infra and capacity_cpu with infra, I get 2,015 clusters for 2023-03-15.
      
      That means that there are 5,602 clusters that are missing the infra label.
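
A comparison of this kind can be sketched as a set difference in Python. The cluster IDs below are hypothetical, and the sketch assumes each count is taken over a set of cluster IDs: one set for clusters whose socket metric reports infra nodes, one for clusters whose capacity_cpu metric carries the `infra` label. The last lines only check the subtraction behind the 5,602 figure quoted above.

```python
def clusters_missing_infra(sockets_with_infra, capacity_with_infra):
    """Clusters that report infra via sockets but not via capacity_cpu."""
    return sockets_with_infra - capacity_with_infra


# Hypothetical cluster IDs, for illustration only.
sockets_with_infra = {"cluster-a", "cluster-b", "cluster-c"}
capacity_with_infra = {"cluster-a"}

missing = clusters_missing_infra(sockets_with_infra, capacity_with_infra)
print(sorted(missing))

# Difference between the two counts quoted for 2023-03-15.
reported_gap = 7617 - 2015
print(reported_gap)
```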
      
      This metric is used to identify the vCPU/CPU count that is used in TeleSense. This is presented to the Sales teams and upper management. If there is another metric we should use, please let me know. Otherwise, this needs to be fixed. 
      
      

      Additional info:

      Refer to the Slack thread: https://redhat-internal.slack.com/archives/C0VMT03S5/p1678967355450719

            Jan Fajerski (jfajersk@redhat.com)
            OpenShift Prow Bot (openshift-crt-jira-prow)
            Junqi Zhao
            Brian Burt