OpenShift Bugs / OCPBUGS-45058

PodStartupStorageOperationsFailing alert in RHOCP web console


    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 4.14.z, 4.15.z
    • Storage
    • Important
    • None
    • False

      Description of problem:

      The PodStartupStorageOperationsFailing alert fires for one node in the RHOCP web console.
      However, no pods are stuck in ContainerCreating on that node or on any other node in the cluster.
      ~~~
      # oc get pod -A -o wide | grep -i ContainerCreating
      #
      ~~~
      The cluster looks healthy.
      
      Silencing the alert worked, as described in the documentation: https://docs.openshift.com/container-platform/4.14/observability/monitoring/managing-alerts.html#silencing-alerts_managing-alerts

      The purpose of this bug is to understand why this alert is being generated.

      Version-Release number of selected component (if applicable):

          

      Actual results:

      The PodStartupStorageOperationsFailing alert is triggered for one node.

      Expected results:

      PodStartupStorageOperationsFailing should not be triggered for any node.

      Additional info:

      The expression for this alert is:
      increase(storage_operation_duration_seconds_count{operation_name=~"volume_attach|volume_mount",status!="success"}[5m]) > 0 and ignoring (status) increase(storage_operation_duration_seconds_count{operation_name=~"volume_attach|volume_mount",status="success"}[5m]) == 0
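
To make the condition easier to reason about, here is a minimal Python sketch of how the expression above can be read. It is an illustration under simplifying assumptions, not how Prometheus evaluates the rule: the 5-minute `increase(...)` values are given as plain numbers, and the label names used as keys (node, operation_name, volume_plugin) and the node names are hypothetical. The point it shows is that the condition depends only on the operation counters: a label set fires when failed attach/mount operations increased in the window while the matching success series exists and did not increase. Pod phase does not appear in it.

```python
from typing import Dict, List, Tuple

# Hypothetical label set (status excluded, mirroring `ignoring (status)`):
# (node, operation_name, volume_plugin)
Labels = Tuple[str, str, str]
# Increase of storage_operation_duration_seconds_count over the last 5 minutes.
Increases = Dict[Labels, float]


def firing_label_sets(failed: Increases, succeeded: Increases) -> List[Labels]:
    """Return label sets for which the alert condition holds.

    `failed` models the status!="success" side, `succeeded` the
    status="success" side of the expression.
    """
    firing = []
    for labels, failed_increase in failed.items():
        # PromQL `and` keeps a left-hand sample only when a matching
        # right-hand sample exists, so a missing success series does not fire.
        if failed_increase > 0 and succeeded.get(labels) == 0:
            firing.append(labels)
    return sorted(firing)


# Example with made-up numbers: worker-1 had failures and no successes,
# worker-2 had a failure but also successful mounts in the same window.
example_failed = {
    ("worker-1", "volume_mount", "kubernetes.io/csi"): 2.0,
    ("worker-2", "volume_mount", "kubernetes.io/csi"): 1.0,
}
example_succeeded = {
    ("worker-1", "volume_mount", "kubernetes.io/csi"): 0.0,
    ("worker-2", "volume_mount", "kubernetes.io/csi"): 3.0,
}
result = firing_label_sets(example_failed, example_succeeded)
```

In this made-up example only the worker-1 label set satisfies the condition, because worker-2 recorded successful mounts in the same window.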
      

              rh-ee-mpatlaso Maxim Patlasov
              rhn-support-sdharma Suruchi Dharma
              Junqi Zhao Junqi Zhao