OpenShift Bugs / OCPBUGS-57001

Prometheus volume affinity conflict resulting in pending pod

      Description of problem:

          A pod, specifically prometheus-k8s-0 within the openshift-monitoring namespace, is unable to start and remains in a Pending state. The primary error message is: "0/3 nodes are available: 3 node(s) had volume node affinity conflict."
      
      Investigation reveals that the PersistentVolumeClaim (PVC) prometheus-data-prometheus-k8s-0 is annotated with volume.kubernetes.io/selected-node: ip-10-51-2-129.us-west-2.compute.internal. However, this node (ip-10-51-2-129.us-west-2.compute.internal) no longer exists in the cluster.
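      The staleness of the selected-node annotation can be checked mechanically. A minimal sketch, using plain dicts standing in for the objects returned by `oc get pvc -o json` / `oc get nodes` (the surviving node name in the example is hypothetical; only the deleted node's name comes from this report):

      ```python
      # Detect a PVC whose volume.kubernetes.io/selected-node annotation
      # points at a node that no longer exists in the cluster.

      SELECTED_NODE = "volume.kubernetes.io/selected-node"

      def orphaned_selected_node(pvc, node_names):
          """Return the stale node name if the PVC's selected-node annotation
          references a node absent from node_names, else None."""
          annotations = pvc.get("metadata", {}).get("annotations", {})
          selected = annotations.get(SELECTED_NODE)
          if selected is not None and selected not in node_names:
              return selected
          return None

      # Shaped like this bug report:
      pvc = {
          "metadata": {
              "name": "prometheus-data-prometheus-k8s-0",
              "annotations": {
                  SELECTED_NODE: "ip-10-51-2-129.us-west-2.compute.internal",
              },
          }
      }
      current_nodes = {"ip-10-51-3-10.us-west-2.compute.internal"}  # hypothetical survivor
      print(orphaned_selected_node(pvc, current_nodes))
      # → ip-10-51-2-129.us-west-2.compute.internal (annotation is stale)
      ```

      A None result would mean the annotation still points at a live node and the scheduling failure has some other cause.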
      
      The associated PersistentVolume (PV pvc-37f5792d-eff9-4304-8dcc-90215ff84f0f), an AWS EBS volume, carries a node affinity requirement tied to the Availability Zone in which the volume was provisioned (the zone of the now-deleted node). Consequently, none of the three currently available worker nodes, all located in us-west-2c, can satisfy the volume's node affinity requirement, and scheduling fails. This effectively means the EBS volume is "leaked" or orphaned from the perspective of the active cluster nodes.
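      The scheduling failure itself follows from the scheduler's volume node affinity check: a node is eligible only if its labels satisfy at least one nodeSelectorTerm of the PV's spec.nodeAffinity.required. A minimal sketch of that check; the zone value pinned on the PV below (us-west-2a) is an assumption for illustration, since the report only states that all three surviving nodes are in us-west-2c:

      ```python
      # Sketch of the scheduler's PV node affinity check:
      # nodeSelectorTerms are ORed; matchExpressions within a term are ANDed.

      def node_satisfies_pv_affinity(node_labels, pv_node_affinity):
          """True if the node's labels satisfy the PV's required node affinity."""
          terms = pv_node_affinity.get("required", {}).get("nodeSelectorTerms", [])
          if not terms:
              return True  # a PV with no required affinity fits any node
          for term in terms:
              if all(
                  expr["operator"] == "In"
                  and node_labels.get(expr["key"]) in expr["values"]
                  for expr in term.get("matchExpressions", [])
              ):
                  return True
          return False

      # EBS PV pinned (hypothetically) to us-west-2a:
      pv_affinity = {
          "required": {
              "nodeSelectorTerms": [{
                  "matchExpressions": [{
                      "key": "topology.kubernetes.io/zone",
                      "operator": "In",
                      "values": ["us-west-2a"],
                  }]
              }]
          }
      }
      worker = {"topology.kubernetes.io/zone": "us-west-2c"}
      print(node_satisfies_pv_affinity(worker, pv_affinity))  # → False
      ```

      With all three workers carrying the us-west-2c zone label, every node fails this check, which matches the observed "3 node(s) had volume node affinity conflict" event.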

      Version-Release number of selected component (if applicable):

          4.14.30

      How reproducible:

          Unknown

      Steps to Reproduce:

          1. Unknown
          

      Actual results:

          The Prometheus pod is stuck in Pending because of what appears to be a previously leaked PVC bound to a volume no existing node can satisfy.

      Expected results:

          The Prometheus pod is able to schedule; the PVC's volume resides in an Availability Zone with available nodes.

      Additional info:

      Must-gather attached (see comments). This could be related to CAPI's handling of node deletions (skipping volumes at some point), but that remains unclear: the issue appears to be infrequent, and testing was unable to reproduce it.
      
      WORKAROUND:
      - Delete the PVC and the pod so that a new PVC is created and provisioned on an existing node.

              Assignee: Unassigned
              Reporter: Claudio Busse (cbusse.openshift)
              QA Contact: Yu Li