OCPBUGS-17853

expectedVGCount / readyVGCount mismatch in MultiNode Clusters fails Readiness


    • Quality / Stability / Reliability
    • False
    • 1
    • Low
    • No
    • None
    • None
    • OCP VE Sprint 240, OCP VE Sprint 241
    • 2
    • In Progress
    • Release Note Not Required

      Description of problem:

      When an LVMCluster is created on a cluster with multiple nodes, each with multiple devices attached, the LVMCluster never becomes Ready.
      
      In this case all worker nodes have 2 loop devices with 3 GB of block storage attached. All the VolumeGroupNodeStatus objects show as Ready, yet the LVMCluster state remains Progressing:
      
      apiVersion: lvm.topolvm.io/v1alpha1
      kind: LVMCluster
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"lvm.topolvm.io/v1alpha1","kind":"LVMCluster","metadata":{"annotations":{},"name":"my-lvmcluster","namespace":"openshift-storage"},"spec":{"storage":{"deviceClasses":[{"default":true,"fstype":"xfs","name":"vg1","thinPoolConfig":{"name":"thin-pool-1","overprovisionRatio":10,"sizePercent":90}}]}}}
        creationTimestamp: "2023-08-16T08:24:37Z"
        finalizers:
        - lvmcluster.topolvm.io
        generation: 1
        name: my-lvmcluster
        namespace: openshift-storage
        resourceVersion: "46967"
        uid: fd5c50cd-c2d0-453d-828e-37dae45ddc38
      spec:
        storage:
          deviceClasses:
          - default: true
            fstype: xfs
            name: vg1
            thinPoolConfig:
              name: thin-pool-1
              overprovisionRatio: 10
              sizePercent: 90
      status:
        deviceClassStatuses:
        - name: vg1
          nodeStatus:
          - devices:
            - /dev/loop0
            - /dev/loop1
            node: ip-10-0-144-179.us-east-2.compute.internal
            status: Ready
          - devices:
            - /dev/loop0
            - /dev/loop1
            node: ip-10-0-168-219.us-east-2.compute.internal
            status: Ready
          - devices:
            - /dev/loop0
            - /dev/loop1
            node: ip-10-0-240-151.us-east-2.compute.internal
            status: Ready
        state: Progressing
      
      

      Version-Release number of selected component (if applicable):

      4.13-4.15
      

      How reproducible:

      100%
      

      Steps to Reproduce:

      1. Create a Multi Node Cluster with 3 worker nodes
      2. Attach Loop Storage to them: 
      3. Create an LVMCluster as shown above, with xfs as the filesystem type and no deviceSelector (greedy device lookup).
      4. Observe that the LVMCluster does not become Ready even though all components are Ready.
      

      Actual results:

      
      The LVMCluster does not become Ready.
      When a log message is injected into the readiness check, it shows that the failure comes from the VG count comparison:
      {"level":"info","ts":"2023-08-16T08:24:40Z","logger":"lvmcluster-controller","msg":"Verifying readiness","Request.Name":"my-lvmcluster","Request.Namespace":"openshift-storage","expectedVGCount":6,"readyVGCount":3}   
      
      expectedVGCount is 6 while readyVGCount is only 3, so the readiness check never passes. The cause of the mismatch is not yet known.
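The two numbers in the log line can be reproduced from the status shown in the description. The Go sketch below is only an illustration and is not the operator's code: the types `NodeStatus` and `DeviceClassStatus` and the functions `countReadyVGs`, `countDevices` and `reportedStatus` are hypothetical names introduced here. It shows that counting one volume group per device class and node gives 3, while counting every device entry gives 6. That 6 equals the reported expectedVGCount may be a coincidence; it is a hypothesis worth checking, not a confirmed root cause.

```go
package main

import "fmt"

// NodeStatus mirrors one entry under status.deviceClassStatuses[].nodeStatus
// in the LVMCluster from the description (hypothetical type, for illustration).
type NodeStatus struct {
	Node    string
	Status  string
	Devices []string
}

// DeviceClassStatus mirrors one entry under status.deviceClassStatuses
// (hypothetical type, for illustration).
type DeviceClassStatus struct {
	Name       string
	NodeStatus []NodeStatus
}

// countReadyVGs counts one volume group per (device class, node) pair
// whose status is Ready.
func countReadyVGs(classes []DeviceClassStatus) int {
	ready := 0
	for _, dc := range classes {
		for _, ns := range dc.NodeStatus {
			if ns.Status == "Ready" {
				ready++
			}
		}
	}
	return ready
}

// countDevices counts every device entry across all device classes and nodes.
func countDevices(classes []DeviceClassStatus) int {
	devices := 0
	for _, dc := range classes {
		for _, ns := range dc.NodeStatus {
			devices += len(ns.Devices)
		}
	}
	return devices
}

// reportedStatus returns the status section exactly as listed in the report.
func reportedStatus() []DeviceClassStatus {
	loops := []string{"/dev/loop0", "/dev/loop1"}
	return []DeviceClassStatus{{
		Name: "vg1",
		NodeStatus: []NodeStatus{
			{Node: "ip-10-0-144-179.us-east-2.compute.internal", Status: "Ready", Devices: loops},
			{Node: "ip-10-0-168-219.us-east-2.compute.internal", Status: "Ready", Devices: loops},
			{Node: "ip-10-0-240-151.us-east-2.compute.internal", Status: "Ready", Devices: loops},
		},
	}}
}

func main() {
	status := reportedStatus()
	fmt.Printf("ready VGs (per device class and node): %d\n", countReadyVGs(status))
	fmt.Printf("device entries (per device class, node and device): %d\n", countDevices(status))
}
```

If the expected count is meant to be one volume group per device class and node, both sides of the comparison would be 3 for this cluster.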
      
      

      Expected results:

      
      The LVMCluster becomes Ready, and expectedVGCount matches readyVGCount.
      
      

      Additional info:

      
      

              rh-ee-jmoller Jakob Moeller (Inactive)