
[release-4.19] IngressOperator not exposing some metrics for existing IngressController after Operator restart

    • Moderate
    • Rejected
    • NI&D Sprint 280, NI&D Sprint 281
    • 2
    • In Progress
    • Bug Fix
      Before this update, restarting the Cluster Ingress Operator pod while existing `IngressController` resources were in `Available` or `Degraded` status caused the `ingress_controller_conditions` metric to disappear from the Operator's `/metrics` endpoint. As a result, users were unable to monitor the `IngressController` status following a pod restart. With this release, the `ingress_controller_conditions` metric is set during every reconciliation cycle, regardless of whether an `IngressController` status update occurred, ensuring reliable and continuous monitoring of `IngressController` health. (link:https://issues.redhat.com/browse/OCPBUGS-65723[OCPBUGS-65723])
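The behavior change described in the release note can be illustrated with a small, self-contained model. This is only a sketch, not the operator's code (the real operator is not written in Python). The names `ConditionsGauge`, `reconcile`, `restart_and_reconcile`, and the `set_metric_always` flag are hypothetical. The "before" behavior (metric set only when the status is updated) is inferred from the release note wording.

```python
# Simplified model of the bug and the fix; all names here are hypothetical.


class ConditionsGauge:
    """In-memory stand-in for a gauge vector; its samples are lost on restart."""

    def __init__(self):
        self.samples = {}

    def set(self, name, condition, value):
        self.samples[(name, condition)] = value


def reconcile(gauge, ingress, set_metric_always):
    """Run one reconciliation pass over a single IngressController-like dict."""
    computed = dict(ingress["desired_conditions"])
    status_changed = computed != ingress["status_conditions"]
    if status_changed:
        ingress["status_conditions"] = computed
    # Assumed old behavior: the metric is touched only on a status update.
    # Fixed behavior: the metric is set on every pass.
    if status_changed or set_metric_always:
        for condition, is_true in computed.items():
            gauge.set(ingress["name"], condition, 1.0 if is_true else 0.0)
    return status_changed


def restart_and_reconcile(ingress, set_metric_always):
    """Model a pod restart: a fresh, empty gauge followed by one reconcile."""
    gauge = ConditionsGauge()
    reconcile(gauge, ingress, set_metric_always)
    return gauge.samples
```

With an `IngressController` whose stored status already matches the computed conditions, `restart_and_reconcile(ingress, set_metric_always=False)` returns an empty dict, which corresponds to the missing metric. The same call with `set_metric_always=True` returns one sample per condition.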

      This is a clone of issue OCPBUGS-65664. The following is the description of the original issue:

      Description of problem:

      When the Ingress Operator pod is restarted while IngressController resources already exist in Available or Degraded status, the ingress_controller_conditions metric is missing from the Operator's /metrics endpoint.

      Version-Release number of selected component (if applicable):

          

      How reproducible:

      Create a cluster with at least the default IngressController.
      Restart the Ingress Operator pod (this can happen if you need to force the cloud provider credentials to be updated after changing something in the IAM role).

      Steps to Reproduce:

          1. Create a cluster with the default IngressController.
          2. Perform an operation that causes the cloud provider credentials to be updated.
          3. Restart the Operator pod to force the Operator to use the new credentials immediately.
          

      Actual results:

      The ingress_controller_conditions metric is not available.

      Expected results:

      The ingress_controller_conditions metric is available.    
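One way to tell the actual result from the expected result is to scan the text returned by the Operator's /metrics endpoint for sample lines of the metric. The helper below is a minimal sketch that assumes the endpoint returns the Prometheus text exposition format. The function name `metric_series` is hypothetical, and how the metrics text is fetched (port, authentication) is deliberately left out.

```python
def metric_series(metrics_text, metric_name):
    """Return the sample lines for metric_name from Prometheus text output."""
    series = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "name{labels} value" or "name value".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric_name:
            series.append(line)
    return series
```

An empty list from `metric_series(text, "ingress_controller_conditions")` matches the actual result reported here. A non-empty list matches the expected result.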

      Additional info:

      This is a HyperShift cluster, but that should not matter.

              dsalerno@redhat.com Davide Salerno
              jbranham.openshift Josh Branham
              Shudi Li Shudi Li