  1. OpenShift Bugs
  2. OCPBUGS-56610

Cinder CSI driver cannot create a PVC from a snapshot when there are multiple (matching) compute and volume AZs


    • Quality / Stability / Reliability
    • False
    • None
    • Important
    • Yes
    • None
    • None
    • Rejected
    • ShiftStack Sprint 272, ShiftStack Sprint 273, ShiftStack Sprint 274, ShiftStack Sprint 275, ShiftStack Sprint 277, ShiftStack Sprint 278
    • 6

      Description of problem:

          [shift-on-stack] The Cinder CSI driver cannot create PVCs from volume snapshots when there are multiple (matching) volume and compute AZs:
      "Failed to CreateVolume: Expected HTTP response code [202] when accessing [POST https://overcloud.redhat.local:13776/v3/668259885d1d4e538fb1f071f52a6309/volumes], but got 400 instead:
      {"badRequest": {"code": 400, "message": "Invalid input received: Volume must be in the same availability zone as the snapshot"}}"

      Version-Release number of selected component (if applicable):

          4.19.0-0.nightly-2025-05-17-191114

      How reproducible:

          Always

      Steps to Reproduce:

      > OpenStack deployment (17.1.5) with multiple (matching) volume and compute AZs:

      $ openstack hypervisor list
      +--------------------------------------+------------------------+-----------------+--------------+-------+
      | ID                                   | Hypervisor Hostname    | Hypervisor Type | Host IP      | State |
      +--------------------------------------+------------------------+-----------------+--------------+-------+
      | ef78d8de-7023-4e07-8974-fe56ac2db96c | compute-1.redhat.local | QEMU            | 172.17.1.180 | up    |
      | d06b5964-098b-4ec2-a09e-6b51c76f210d | compute-2.redhat.local | QEMU            | 172.17.1.149 | up    |
      | c32155e6-2f7f-447d-abb8-092e370b293e | compute-0.redhat.local | QEMU            | 172.17.1.140 | up    |
      +--------------------------------------+------------------------+-----------------+--------------+-------+
      
      $ openstack availability zone list --compute --long
      +-----------+-------------+---------------+---------------------------+----------------+----------------------------------------+
      | Zone Name | Zone Status | Zone Resource | Host Name                 | Service Name   | Service Status                         |
      +-----------+-------------+---------------+---------------------------+----------------+----------------------------------------+
      | AZ-0      | available   |               | compute-0.redhat.local    | nova-compute   | enabled :-) 2025-05-22T11:09:20.000000 |
      | AZ-1      | available   |               | compute-1.redhat.local    | nova-compute   | enabled :-) 2025-05-22T11:09:19.000000 |
      | AZ-2      | available   |               | compute-2.redhat.local    | nova-compute   | enabled :-) 2025-05-22T11:09:21.000000 |
      | ...                                                                                                                           |
      +-----------+-------------+---------------+---------------------------+----------------+----------------------------------------+
      
      $ openstack availability zone list --volume
      +-----------+-------------+
      | Zone Name | Zone Status |
      +-----------+-------------+
      | nova      | available   |
      | AZ-0      | available   |
      | AZ-1      | available   |
      | AZ-2      | available   |
      +-----------+-------------+
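
Whether the compute and volume zone lists really match can be checked mechanically. The sketch below is an illustration, not part of the reproducer: `unmatched_azs` is a hypothetical helper name, and it expects the two lists as newline-separated zone names (for example, the "Zone Name" column of the two listings above).

```shell
# unmatched_azs COMPUTE_ZONES VOLUME_ZONES
# Prints each compute AZ that has no volume AZ with the same name.
# (Hypothetical helper; both arguments are newline-separated zone names.)
unmatched_azs() {
  printf '%s\n' "$1" | sort -u | while IFS= read -r az; do
    [ -n "$az" ] || continue
    printf '%s\n' "$2" | grep -Fxq -- "$az" || printf '%s\n' "$az"
  done
}

# Zone names copied from the listings above.
compute_zones='AZ-0
AZ-1
AZ-2'
volume_zones='nova
AZ-0
AZ-1
AZ-2'

# Prints nothing when every compute AZ has a matching volume AZ.
unmatched_azs "$compute_zones" "$volume_zones"
```

For this deployment the helper prints nothing, which is consistent with the "matching" precondition; the extra `nova` volume zone is ignored because only compute zones are checked.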
      
      $ openstack server list --os-cloud overcloud --all-projects  --fit-width                                                                                                                                    
      +--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------------------------------------+--------------------------+--------+
      | ID                                   | Name                        | Status | Networks                                                                                               | Image                    | Flavor |  
      +--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------------------------------------+--------------------------+--------+  
      | 6980d9f8-c1fd-403e-887d-4409c1748011 | ostest-jwtj5-master-nqx6f-2 | ACTIVE | StorageNFS=172.17.5.206; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.0.228 | N/A (booted from volume) | master |  
      | 92727cee-0294-4200-a84b-30c98f14e686 | ostest-jwtj5-master-wdnvp-1 | ACTIVE | StorageNFS=172.17.5.199; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.1.164 | N/A (booted from volume) | master |  
      | 0b78f2fe-9c5d-40b3-9ed8-0b71ebce2514 | ostest-jwtj5-master-mzvps-0 | ACTIVE | StorageNFS=172.17.5.210; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.2.225 | N/A (booted from volume) | master |  
      | 09f8fe19-8b21-4737-afe4-9986e4be236b | ostest-jwtj5-worker-0-tlmdh | ACTIVE | StorageNFS=172.17.5.234; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.1.36  | N/A (booted from volume) | worker |  
      | 1c0fd021-e5ad-445e-b491-1a65fefd5cb3 | ostest-jwtj5-worker-2-ffzsm | ACTIVE | StorageNFS=172.17.5.193; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.1.39  | N/A (booted from volume) | worker |  
      | cd30159a-59bb-4c8b-ab28-ad04598e55fb | ostest-jwtj5-worker-1-z79sv | ACTIVE | StorageNFS=172.17.5.246; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.0.147 | N/A (booted from volume) | worker |  
      | 62bb4959-5b27-47ad-86e7-0edfc69ba675 | ostest-jwtj5-worker-0-sm8gz | ACTIVE | StorageNFS=172.17.5.211; k8s-clusterapi-cluster-openshift-cluster-api-guests-ostest-jwtj5=10.196.0.148 | N/A (booted from volume) | worker |  
      +--------------------------------------+-----------------------------+--------+--------------------------------------------------------------------------------------------------------+--------------------------+--------+
      
      $ oc -n openshift-cluster-csi-drivers get cm cloud-conf -o json | jq .data.enable_topology
      "true"
      

      > Create a project, a PVC and a deployment that mounts the PVC:

      $ cat <<EOF | oc apply -f -
      ---
      apiVersion: project.openshift.io/v1
      kind: Project
      metadata:
        name: cinderpvc
        labels:
          kubernetes.io/metadata.name: cinderpvc
      ---
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: "pvc-1"
        namespace: "cinderpvc"
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        #storageClassName: (default)
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: demo-1
        namespace: "cinderpvc"
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: demo-1
        template:
          metadata:
            labels:
              app: demo-1
          spec:
            containers:
            - name: demo
              image: quay.io/kuryr/demo
              ports:
              - containerPort: 80
                protocol: TCP
              volumeMounts:
                - mountPath: /var/lib/www/data
                  name: mydata
            volumes:
              - name: mydata
                persistentVolumeClaim:
                  claimName: pvc-1
                  readOnly: false
      EOF
      
      $ oc -n cinderpvc get pod -o wide
      NAME                      READY   STATUS    RESTARTS   AGE   IP             NODE                          NOMINATED NODE   READINESS GATES
      demo-1-587bcc46b7-gjfhp   1/1     Running   0          49s   10.130.3.190   ostest-jwtj5-worker-0-tlmdh   <none>           <none>
      
      $ oc -n cinderpvc get pvc
      NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
      pvc-1   Bound    pvc-42649476-9084-4a4d-b50f-002fdf903d72   1Gi        RWO            standard-csi   <unset>                 57s
      
      $ openstack volume list                                                                                                                                                                                   
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      | ID                                   | Name                                     | Status    | Size | Attached to                                          |
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      | a34f33c4-5877-4ed6-ae9f-d2a4956d36fb | pvc-42649476-9084-4a4d-b50f-002fdf903d72 | in-use    |    1 | Attached to ostest-jwtj5-worker-0-tlmdh on /dev/vdd  |
      ...
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      

      > Create a VolumeSnapshotClass:

      $ cat <<EOF | oc apply -f -
      ---
      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshotClass
      metadata:
        name: cinderpvc-snapclass
        namespace: cinderpvc
      driver: cinder.csi.openstack.org
      deletionPolicy: Delete
      parameters:
        force-create: "true"
      EOF
      
      $ oc get VolumeSnapshotClass
      NAME                         DRIVER                     DELETIONPOLICY   AGE
      cinderpvc-snapclass          cinder.csi.openstack.org   Delete           13s
      csi-manila-standard          manila.csi.openstack.org   Delete           4d7h
      standard-csi                 cinder.csi.openstack.org   Delete           4d7h
      

      > Create a VolumeSnapshot:

      $ cat <<EOF | oc apply -f -
      ---
      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshot
      metadata:
        name: pvc-1-snap
        namespace: cinderpvc
      spec:
        volumeSnapshotClassName: cinderpvc-snapclass
        source:
          persistentVolumeClaimName: pvc-1
      EOF
      
      $ oc -n cinderpvc get volumesnapshot
      NAME         READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS         SNAPSHOTCONTENT                                    CREATIONTIME   AGE
      pvc-1-snap   true         pvc-1                               1Gi           cinderpvc-snapclass   snapcontent-b87b592e-9bfc-4698-a737-95275fd8a17a   9s             10s
      
      
      $ oc -n cinderpvc get volumesnapshotcontent
      NAME                                               READYTOUSE   RESTORESIZE   DELETIONPOLICY   DRIVER                     VOLUMESNAPSHOTCLASS   VOLUMESNAPSHOT   VOLUMESNAPSHOTNAMESPACE   AGE
      snapcontent-b87b592e-9bfc-4698-a737-95275fd8a17a   true         1073741824    Delete           cinder.csi.openstack.org   cinderpvc-snapclass   pvc-1-snap       cinderpvc                 29s
      
      
      $ oc -n cinderpvc describe volumesnapshot pvc-1-snap
      Name:         pvc-1-snap
      Namespace:    cinderpvc
      Labels:       <none>
      Annotations:  <none>
      API Version:  snapshot.storage.k8s.io/v1
      Kind:         VolumeSnapshot
      Metadata:
        Creation Timestamp:  2025-05-22T09:28:28Z
        Finalizers:
          snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
          snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
        Generation:        1
        Resource Version:  1956985
        UID:               b87b592e-9bfc-4698-a737-95275fd8a17a
      Spec:
        Source:
          Persistent Volume Claim Name:  pvc-1
        Volume Snapshot Class Name:      cinderpvc-snapclass
      Status:
        Bound Volume Snapshot Content Name:  snapcontent-b87b592e-9bfc-4698-a737-95275fd8a17a
        Creation Time:                       2025-05-22T09:28:29Z
        Ready To Use:                        true
        Restore Size:                        1Gi
      Events:
        Type    Reason            Age   From                 Message
        ----    ------            ----  ----                 -------
        Normal  CreatingSnapshot  40s   snapshot-controller  Waiting for a snapshot cinderpvc/pvc-1-snap to be created by the CSI driver.
        Normal  SnapshotCreated   38s   snapshot-controller  Snapshot cinderpvc/pvc-1-snap was successfully created by the CSI driver.
        Normal  SnapshotReady     38s   snapshot-controller  Snapshot cinderpvc/pvc-1-snap is ready to use.
      

       

      > Create a new PVC from the pvc-1-snap snapshot:

      $ cat <<EOF | oc apply -f -
      ---
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: pvc-1-restore
        namespace: cinderpvc
      spec:
        #storageClassName: (default)
        dataSource:
          name: pvc-1-snap 
          kind: VolumeSnapshot 
          apiGroup: snapshot.storage.k8s.io 
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
      EOF
      
      $ oc get pvc
      NAME            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
      pvc-1           Bound     pvc-42649476-9084-4a4d-b50f-002fdf903d72   1Gi        RWO            standard-csi   <unset>                 7m35s
      pvc-1-restore   Pending                                                                        standard-csi   <unset>                 3s
      
      $ openstack volume list                                                                                                                                                                                   
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      | ID                                   | Name                                     | Status    | Size | Attached to                                          |
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      | a34f33c4-5877-4ed6-ae9f-d2a4956d36fb | pvc-42649476-9084-4a4d-b50f-002fdf903d72 | in-use    |    1 | Attached to ostest-jwtj5-worker-0-tlmdh on /dev/vdd  |
      ...
      +--------------------------------------+------------------------------------------+-----------+------+------------------------------------------------------+
      

      > Attach the new PVC pvc-1-restore to the deployment and detach the previous one, pvc-1:

      $ oc -n cinderpvc edit deployment.apps/demo-1
      
      Replace:
            volumes:
            - name: mydata
              persistentVolumeClaim:
                claimName: pvc-1
      
      by:
            volumes:
            - name: mydata
              persistentVolumeClaim:
                claimName: pvc-1-restore
                
      $ oc get pod
      NAME                      READY   STATUS    RESTARTS   AGE
      demo-1-587bcc46b7-gjfhp   1/1     Running   0          8m52s
      demo-1-659c55749d-5xq2c   0/1     Pending   0          23s
      
      $ oc get pvc
      NAME            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
      pvc-1           Bound     pvc-42649476-9084-4a4d-b50f-002fdf903d72   1Gi        RWO            standard-csi   <unset>                 9m38s
      pvc-1-restore   Pending                                                                        standard-csi   <unset>                 2m6s
      
      
      $ oc get pvc pvc-1-restore -o yaml
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-1-restore","namespace":"cinderpvc"},"spec":{"accessModes":["ReadWriteOnce"],"dataSource":{"apiGroup":"snapshot.storage.k8s.io","kind":"VolumeSnapshot","name":"pvc-1-snap"},"resources":{"requests":{"storage":"1Gi"}}}}
          volume.beta.kubernetes.io/storage-provisioner: cinder.csi.openstack.org
          volume.kubernetes.io/selected-node: ostest-jwtj5-worker-1-z79sv
          volume.kubernetes.io/storage-provisioner: cinder.csi.openstack.org
        creationTimestamp: "2025-05-22T09:30:22Z"
        finalizers:
        - kubernetes.io/pvc-protection
        name: pvc-1-restore
        namespace: cinderpvc
        resourceVersion: "1958116"
        uid: 85a6a216-dccf-42d1-9724-d0eafcbc7602
      spec:
        accessModes:
        - ReadWriteOnce
        dataSource:
          apiGroup: snapshot.storage.k8s.io
          kind: VolumeSnapshot
          name: pvc-1-snap
        dataSourceRef:
          apiGroup: snapshot.storage.k8s.io
          kind: VolumeSnapshot
          name: pvc-1-snap
        resources:
          requests:
            storage: 1Gi
        storageClassName: standard-csi
        volumeMode: Filesystem
      status:
        phase: Pending
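
The `volume.kubernetes.io/selected-node` annotation shows which node the scheduler picked, and with topology enabled the driver appears to request the new volume in that node's AZ. A minimal sketch for comparing that AZ with the AZ of the snapshot's source volume follows. It assumes the nodes carry the `topology.cinder.csi.openstack.org/zone` label; `compare_az` is a hypothetical helper, and the two lookups are shown only as comments because they need a live cluster and cloud.

```shell
# Possible lookups (assumptions; run against the cluster and the cloud):
#   node_az=$(oc get node ostest-jwtj5-worker-1-z79sv \
#     -o jsonpath='{.metadata.labels.topology\.cinder\.csi\.openstack\.org/zone}')
#   volume_az=$(openstack volume show a34f33c4-5877-4ed6-ae9f-d2a4956d36fb \
#     -f value -c availability_zone)

# compare_az VOLUME_AZ NODE_AZ
# Succeeds when both zones are equal; otherwise reports the mismatch
# and returns a non-zero status. (Hypothetical helper.)
compare_az() {
  if [ "$1" = "$2" ]; then
    printf 'same AZ: %s\n' "$1"
  else
    printf 'AZ mismatch: source volume in %s, selected node in %s\n' "$1" "$2"
    return 1
  fi
}
```

Calling `compare_az "$volume_az" "$node_az"` with the looked-up values would show whether the restore request targets a different AZ than the snapshot, which is the condition Cinder rejects.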
      

      Logs from cinder-csi-driver-controller:

      I0522 11:35:52.694711       1 util.go:151] detected AZ from the topology: AZ-1                                                                                                                                                               
      E0522 11:35:52.900683       1 controllerserver.go:192] Failed to CreateVolume: Expected HTTP response code [202] when accessing [POST https://overcloud.redhat.local:13776/v3/668259885d1d4e538fb1f071f52a6309/volumes], but got 400 instead:
      {"badRequest": {"code": 400, "message": "Invalid input received: Volume must be in the same availability zone as the snapshot"}}                                                                                                             
      E0522 11:35:52.900750       1 utils.go:95] [ID:29719] GRPC error: rpc error: code = Internal desc = CreateVolume failed with error Expected HTTP response code [202] when accessing [POST https://overcloud.redhat.local:13776/v3/668259885d1d4e538fb1f071f52a6309/volumes], but got 400 instead: {"badRequest": {"code": 400, "message": "Invalid input received: Volume must be in the same availability zone as the snapshot"}}
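
To pull the two relevant facts out of a longer controller log (the AZ the driver chose and the reason Cinder rejected the request), a small filter can help. This is a sketch only: `detected_az` and `cinder_error` are hypothetical helper names, and the patterns assume the exact message wording shown in the log lines above.

```shell
# Print the AZ the driver derived from the topology requirement.
detected_az() {
  sed -n 's/.*detected AZ from the topology: \([^[:space:]]*\).*/\1/p'
}

# Print the "message" field of a Cinder badRequest body.
cinder_error() {
  sed -n 's/.*"message": "\([^"]*\)".*/\1/p'
}

# Sample lines copied from the log above.
sample='I0522 11:35:52.694711       1 util.go:151] detected AZ from the topology: AZ-1
{"badRequest": {"code": 400, "message": "Invalid input received: Volume must be in the same availability zone as the snapshot"}}'

printf '%s\n' "$sample" | detected_az
printf '%s\n' "$sample" | cinder_error
```

Against a real cluster, the same functions can read the controller pod's log output from standard input instead of the sample text.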
      

      Actual results:

          The PVC stays Pending; the PV and the Cinder volume are not created.

      Expected results:

          The PVC, the PV and the Cinder volume are created.

      Additional info:

      Cinder-csi tests failing:

              sfinucan@redhat.com Stephen Finucane
              juriarte@redhat.com Jon Uriarte
              None
              None
              Jon Uriarte Jon Uriarte