OpenShift Virtualization / CNV-44918

Gaps in metrics graph when kubevirt-cluster-critical is deleted and recreated

      Sprint: CNV I/U Operators Sprint 258, CNV I/U Operators Sprint 262

      Description of problem:

      Updating the priority_class component value often fails.
      In T2 we have tests that verify the metric "kubevirt_hco_out_of_band_modifications_total{component_name="priorityclass/kubevirt-cluster-critical"}",
      and some failures have been seen recently because the metric shows a None value.

      "kubevirt_hco_out_of_band_modifications_total" is a counter, so it cannot go back to zero or to "None" (see attached image).

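      The counter property described above can be checked mechanically. The following is a minimal sketch, assuming the metric has been sampled into a list in which a missing sample is represented as None. The function name and the sample data are hypothetical and are not taken from the T2 tests.

```python
from typing import List, Optional, Tuple


def find_counter_anomalies(
    samples: List[Optional[float]],
) -> Tuple[List[int], List[int]]:
    """Return (gap_indexes, reset_indexes) for a sampled counter series.

    A gap is a sample with no value (None). A reset is a sample that is
    lower than the last value that was actually observed.
    """
    gaps: List[int] = []
    resets: List[int] = []
    last_seen: Optional[float] = None
    for index, value in enumerate(samples):
        if value is None:
            gaps.append(index)
            continue
        if last_seen is not None and value < last_seen:
            resets.append(index)
        last_seen = value
    return gaps, resets


# Hypothetical samples: the series disappears twice, then restarts from 1.
gaps, resets = find_counter_anomalies([1, 2, None, None, 1, 2])
print("gaps at:", gaps, "resets at:", resets)
```

      A healthy counter series would produce two empty lists; any entry in either list corresponds to the kind of gap or drop this issue reports.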
       

      Version-Release number of selected component (if applicable):

      CNV 4.17, 4.16, 4.15, 4.14

      How reproducible:

      around 50% of the test attempts 

      Steps to Reproduce:

      Following the test "test_metric_invalid_change" in T2:
      
      1. updated_resource_with_invalid_label, with:
      "priority_class": {
          "resource_info": {
              "comp_name": "priorityclass/kubevirt-cluster-critical",
              "name": "kubevirt-cluster-critical",
              "resource": PriorityClass,
              "count": COUNT_TWO,
          },
      },
      
      
      2. Check the metric: kubevirt_hco_out_of_band_modifications_total{component_name="priorityclass/kubevirt-cluster-critical"}
      or check the operator log: oc logs hco-operator-77cf9b5794-9z75c -n openshift-cnv

      Alternatively, simply run the test "test_metric_invalid_change" multiple times (2-3 times) on PSI.
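
      For step 2, a "None" reading may correspond to the query returning no series at all. The following is a minimal sketch, assuming the metric is read through the Prometheus HTTP API instant-query endpoint (/api/v1/query) and that the caller already holds the decoded JSON response. The helper name counter_value and both sample payloads are hypothetical.

```python
from typing import Any, Dict, Optional

# The query from step 2, kept here so it can be passed to whatever client is used.
QUERY = (
    "kubevirt_hco_out_of_band_modifications_total"
    '{component_name="priorityclass/kubevirt-cluster-critical"}'
)


def counter_value(response: Dict[str, Any]) -> Optional[float]:
    """Extract the value from a Prometheus instant-query response.

    Returns None when the result vector is empty, i.e. the series does
    not exist at the evaluation time.
    """
    result = response.get("data", {}).get("result", [])
    if not result:
        return None
    # Each vector entry carries "value": [<unix timestamp>, "<value as string>"].
    return float(result[0]["value"][1])


# Hypothetical payloads: one with the series present, one with no series.
present = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"component_name": "priorityclass/kubevirt-cluster-critical"},
                "value": [1721663988, "2"],
            }
        ],
    },
}
empty = {"status": "success", "data": {"resultType": "vector", "result": []}}

print(counter_value(present), counter_value(empty))
```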

       

      Actual results:

      The graph shows periods during which the metric reports a "None" value.

       

      Expected results:

      kubevirt_hco_out_of_band_modifications_total{component_name="priorityclass/kubevirt-cluster-critical"} should not show a None value after the resource is updated.

      Additional info:

      oc logs hco-operator-77cf9b5794-9z75c -n openshift-cnv
      {"level":"error","ts":"2024-07-22T15:59:48Z","logger":"controller_hyperconverged","msg":"failed to ensure an operand","Request.Namespace":"openshift-cnv","Request.Name":"kubevirt-hyperconverged","error":"PriorityClass.scheduling.k8s.io \"kubevirt-cluster-critical\" not found","stacktrace":"github.com/kubevirt/hyperconverged-cluster-operator/controllers/operands.(*OperandHandler).Ensure\n\t/remote-source/app/controllers/operands/operandHandler.go:150\ngithub.com/kubevirt/hyperconverged-cluster-operator/controllers/hyperconverged.(*ReconcileHyperConverged).EnsureOperandAndComplete\n\t/remote-source/app/controllers/hyperconverged/hyperconverged_controller.go:516\ngithub.com/kubevirt/hyperconverged-cluster-operator/controllers/hyperconverged.(*ReconcileHyperConverged).doReconcile\n\t/remote-source/app/controllers/hyperconverged/hyperconverged_controller.go:465\ngithub.com/kubevirt/hyperconverged-cluster-operator/controllers/hyperconverged.(*ReconcileHyperConverged).Reconcile\n\t/remote-source/app/controllers/hyperconverged/hyperconverged_controller.go:330\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
      {"level":"info","ts":"2024-07-22T15:59:48Z","logger":"controller_hyperconverged","msg":"setting the Upgradeable operator condition","Request.Namespace":"openshift-cnv","Request.Name":"kubevirt-hyperconverged","requested status":true}
      {"level":"info","ts":"2024-07-22T15:59:48Z","logger":"controller_hyperconverged","msg":"Reconciling for *v1.PriorityClass"}
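
      The operator log above consists of one JSON object per line, so the relevant errors can be filtered programmatically. The following is a minimal sketch, assuming the log lines have already been captured as text. The helper name operand_errors is hypothetical, and the sample lines are shortened copies of the entries above.

```python
import json
from typing import Iterable, List


def operand_errors(lines: Iterable[str]) -> List[dict]:
    """Return the parsed error entries whose msg is 'failed to ensure an operand'."""
    entries: List[dict] = []
    for line in lines:
        line = line.strip()
        # Skip anything that is not a JSON object, e.g. a command echo.
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if (
            entry.get("level") == "error"
            and entry.get("msg") == "failed to ensure an operand"
        ):
            entries.append(entry)
    return entries


# Shortened sample lines based on the log excerpt in this issue.
sample = [
    '{"level":"error","ts":"2024-07-22T15:59:48Z","logger":"controller_hyperconverged",'
    '"msg":"failed to ensure an operand",'
    '"error":"PriorityClass.scheduling.k8s.io \\"kubevirt-cluster-critical\\" not found"}',
    '{"level":"info","ts":"2024-07-22T15:59:48Z","logger":"controller_hyperconverged",'
    '"msg":"Reconciling for *v1.PriorityClass"}',
]

for entry in operand_errors(sample):
    print(entry["ts"], entry["error"])
```

      Comparing the timestamps of these errors with the gaps in the metric graph is one way to check whether the two coincide.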
       

            rh-ee-ahafe Ahmad Hafi
            rh-ee-ahafe Ahmad Hafi
            Natalie Gavrielov Natalie Gavrielov