  1. OpenShift Virtualization
  2. CNV-62847

Frequent alerts triggering: HCO and its secondary resources are in a critical state.


    • CNV I/U Operators Sprint 274, CNV I/U Operators Sprint 275, CNV I/U Operators Sprint 276, CNV I/U Operators Sprint 277
    • Important
    • Customer Reported
    • None

      Description of problem:

      The critical alert below fires frequently and resolves on its own in the cluster after the upgrade.
      
      ~~~
       Name: 
      HCOOperatorConditionsUnhealthy 
      
      Description: 
      HCO and its secondary resources are in a critical state due to system error.
      
      Summary: 
      HCO and its secondary resources are in a critical state.
      
      Runbook: 
      https://github.com/openshift/runbooks/blob/master/alerts/openshift-virtualization-operator/HCOOperatorConditionsUnhealthy.md
      ~~~
      
      
      oc get svc | grep hyperconverged
      hyperconverged-cluster-cli-download                  ClusterIP   172.30.106.3     <none>        8080/TCP   166d
      kubevirt-hyperconverged-operator-metrics             ClusterIP   172.30.72.142    <none>        8383/TCP   166d                       <<===
      
      oc get ep | grep hyperconverged
      hyperconverged-cluster-cli-download                  10.129.0.87:8080                                                   166d
      kubevirt-hyperconverged-operator-metrics             10.129.0.85:8383                                                   166d                                              <<===
      
      oc get pods -n openshift-cnv | grep hco
      hco-operator-fbfd47849-lkwkq                          1/1     Running   0          6d
      hco-webhook-d4ccf9588-wl8n8                           1/1     Running   0          6d
      
      The HCO conditions are healthy, and the pods show no restarts (age 6d).
      
      oc get hco kubevirt-hyperconverged -n openshift-cnv -o yaml | yq .status.conditions
      [
        {
          "lastTransitionTime": "2025-05-21T07:54:01Z",
          "message": "Reconcile completed successfully",
          "observedGeneration": 11,
          "reason": "ReconcileCompleted",
          "status": "True",
          "type": "ReconcileComplete"
        },
        {
          "lastTransitionTime": "2025-05-29T04:18:53Z",
          "message": "Reconcile completed successfully",
          "observedGeneration": 11,
          "reason": "ReconcileCompleted",
          "status": "True",
          "type": "Available"
        },
        {
          "lastTransitionTime": "2025-05-29T04:18:53Z",
          "message": "Reconcile completed successfully",
          "observedGeneration": 11,
          "reason": "ReconcileCompleted",
          "status": "False",
          "type": "Progressing"
        },
        {
          "lastTransitionTime": "2025-05-29T04:18:53Z",
          "message": "Reconcile completed successfully",
          "observedGeneration": 11,
          "reason": "ReconcileCompleted",
          "status": "False",
          "type": "Degraded"
        },
        {
          "lastTransitionTime": "2025-05-29T04:18:53Z",
          "message": "Reconcile completed successfully",
          "observedGeneration": 11,
          "reason": "ReconcileCompleted",
          "status": "True",
          "type": "Upgradeable"
        }
      ]
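
The manual check of the conditions above could be scripted roughly as follows. This is only a sketch: the `EXPECTED` mapping is an assumption based on the usual operator-condition convention (it is not taken from the HCO source or the runbook), and `unhealthy_conditions` and `SAMPLE` are hypothetical names introduced for illustration. `SAMPLE` is a trimmed copy of the output pasted in this report.

```python
import json

# Assumed healthy value for each condition type. This mapping is an
# assumption for illustration, not something confirmed by the HCO code.
EXPECTED = {
    "ReconcileComplete": "True",
    "Available": "True",
    "Progressing": "False",
    "Degraded": "False",
    "Upgradeable": "True",
}


def unhealthy_conditions(conditions):
    """Return (type, status, reason) for each condition that deviates from EXPECTED."""
    deviations = []
    for cond in conditions:
        expected = EXPECTED.get(cond.get("type"))
        # Condition types that are not in EXPECTED are ignored.
        if expected is not None and cond.get("status") != expected:
            deviations.append((cond["type"], cond["status"], cond.get("reason", "")))
    return deviations


# Trimmed copy of the .status.conditions output from this report.
SAMPLE = json.loads("""
[
  {"type": "ReconcileComplete", "status": "True", "reason": "ReconcileCompleted"},
  {"type": "Available", "status": "True", "reason": "ReconcileCompleted"},
  {"type": "Progressing", "status": "False", "reason": "ReconcileCompleted"},
  {"type": "Degraded", "status": "False", "reason": "ReconcileCompleted"},
  {"type": "Upgradeable", "status": "True", "reason": "ReconcileCompleted"}
]
""")

if __name__ == "__main__":
    # An empty list means every listed condition matches the assumed healthy state.
    print(unhealthy_conditions(SAMPLE))
```

Running a check like this at the moment the alert fires, rather than afterwards, would show whether any condition briefly leaves its healthy value.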
       

      Version-Release number of selected component (if applicable):

      kubevirt-hyperconverged-operator.v4.18.3   OpenShift Virtualization         4.18.3                  kubevirt-hyperconverged-operator.v4.17.7   Succeeded
      

      How reproducible:

      100%

      Steps to Reproduce:

      1. Upgrade the cluster to 4.18.3. The alerts appear after the upgrade.
      

      Actual results:

      The alerts fire every day, sometimes twice.

      Expected results:

       

      Additional info:

      The alerts also appear in our test environment, though less frequently than in the customer's environment. We are still monitoring.

              sradco Shirly Radco
              rhn-support-ymotiyel Yash Motiyele
              Ohad Revah Ohad Revah