Red Hat OpenStack Services on OpenShift: OSPRH-13736

Unable to attach persistent volume to an instance

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • Component: nova-operator
    • Severity: Moderate

      After an upgrade from OSP16 to OSP17.1.4, the customer is unable to attach persistent volumes to any of their instances.

      nova/nova-compute.log:2025-02-03 10:01:06.538 2 ERROR nova.virt.block_device [instance: 18b1af24-39f2-4055-af9a-ef6f2b5b9faa]     raise libvirtError('virDomainAttachDeviceFlags() failed')
      nova/nova-compute.log:2025-02-03 10:01:06.538 2 ERROR nova.virt.block_device [instance: 18b1af24-39f2-4055-af9a-ef6f2b5b9faa] libvirt.libvirtError: Requested operation is not valid: target sdf already exists
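
      The collision can be confirmed on the compute node by listing the target names that the libvirt domain already defines. A diagnostic sketch; the domain name below is illustrative:

      ~~~
      # List the target device name and backing source for every disk of the
      # domain. The domain name is illustrative; look it up with `virsh list --all`.
      virsh -c qemu:///system domblklist instance-00000a4b

      # Expected shape of the output when sdf is held by the config-drive cdrom:
      #  Target   Source
      # ------------------------------------------------------------------
      #  sda      vms/bde2657d-e117-49dc-8f2f-42f4410132de_disk
      #  ...
      #  sdf      vms/bde2657d-e117-49dc-8f2f-42f4410132de_disk.config
      ~~~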

      The mentioned volume is in the `available` state:

      (oscar14) [stack@oscar14dir001 ~]$ openstack volume show 691f1a27-def8-4323-bcc9-e13b96038067 --fit
      +--------------------------------+----------------------------------------------------------+
      | Field                          | Value                                                    |
      +--------------------------------+----------------------------------------------------------+
      | attachments                    | []                                                       |
      | availability_zone              | nova                                                     |
      | bootable                       | false                                                    |
      | consistencygroup_id            | None                                                     |
      | created_at                     | 2021-11-30T12:08:11.000000                               |
      | description                    | Created by OpenStack Cinder CSI driver                   |
      | encrypted                      | False                                                    |
      | id                             | 691f1a27-def8-4323-bcc9-e13b96038067                     |
      | migration_status               | None                                                     |
      | multiattach                    | False                                                    |
      | name                           | pvc-07fb9a16-b6a5-4fe4-93d9-38d0d18636a7                 |
      | os-vol-host-attr:host          | hostgroup@tripleo_ceph#tripleo_ceph                      |
      | os-vol-mig-status-attr:migstat | None                                                     |
      | os-vol-mig-status-attr:name_id | None                                                     |
      | os-vol-tenant-attr:tenant_id   | 37ec3de21f554480bfbca365bb9b849a                         |
      | properties                     | cinder.csi.openstack.org/cluster='prod-ocpcarc-01-qxcfk' |
      | replication_status             | None                                                     |
      | size                           | 26                                                       |
      | snapshot_id                    | None                                                     |
      | source_volid                   | None                                                     |
      | status                         | available                                                |
      | type                           | tripleo                                                  |
      | updated_at                     | 2025-02-03T12:16:50.000000                               |
      | user_id                        | 6acc6269aaf04b278df8e568b717eaa6                         |
      +--------------------------------+----------------------------------------------------------+ 

      Meanwhile, the output of `openstack server show` for the instance still lists the mentioned volume under `volumes_attached`, even though cinder reports it as available:

      | volumes_attached                    | delete_on_termination='False', id='6421414f-caf9-4b31-bdeb-a92d98b0ec93'                                                 |
      |                                     | delete_on_termination='False', id='c333227a-beb4-4f85-826a-c1ce6d1a1aab'                                                 |
      |                                     | delete_on_termination='False', id='6321eb81-0e29-4551-84e4-c6f6bad8ff7e'                                                 |
      |                                     | delete_on_termination='False', id='a41a61ba-cde4-4dfa-90e1-4f815f26ac6f'                                                 |
      |                                     | delete_on_termination='False', id='691f1a27-def8-4323-bcc9-e13b96038067'                                                 |
      +-------------------------------------+------------------------------------------------------------------------------------------------------------------ 

       

      The sdf device changed its bus type from `scsi` to `sata` between 16.x and 17.1.4, and the `virsh dumpxml` output disagrees with nova's records. In the output of `openstack server show`, the `OS-EXT-SRV-ATTR:root_device_name` field contains only /dev/sda and no /dev/sdf, so nova is not aware of /dev/sdf. Despite that, KVM still holds the last device on the list (/dev/sdf), as it is still part of the instance definition:
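
      Nova's view can be checked directly; a sketch using the instance UUID from the log above:

      ~~~
      # Show only nova's idea of the root device and the attached volumes.
      openstack server show 18b1af24-39f2-4055-af9a-ef6f2b5b9faa \
          -c OS-EXT-SRV-ATTR:root_device_name -c volumes_attached
      # root_device_name reports only /dev/sda, so nova has no record of sdf.
      ~~~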

       

      sda is the ephemeral Ceph disk:
      ~~~
          <disk type='network' device='disk'>
            <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
            <auth username='openstack'>
              <secret type='ceph' uuid='e3f27f3b-b76c-4505-9c35-fccd4a0e99a0'/>
            </auth>
            <source protocol='rbd' name='vms/bde2657d-e117-49dc-8f2f-42f4410132de_disk'>
              <host name='192.168.9.188' port='6789'/>
              <host name='192.168.10.207' port='6789'/>
              <host name='192.168.11.227' port='6789'/>
            </source>
            <target dev='sda' bus='scsi'/>
            <address type='drive' controller='0' bus='0' target='0' unit='0'/>
          </disk>
      ~~~
      
      sdf is a cdrom backed by the instance's config-drive image (the `_disk.config` RBD image that sits next to the ephemeral disk):
      ~~~
          <disk type='network' device='cdrom'>
            <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
            <auth username='openstack'>
              <secret type='ceph' uuid='e3f27f3b-b76c-4505-9c35-fccd4a0e99a0'/>
            </auth>
            <source protocol='rbd' name='vms/bde2657d-e117-49dc-8f2f-42f4410132de_disk.config'>
              <host name='192.168.9.188' port='6789'/>
              <host name='192.168.10.207' port='6789'/>
              <host name='192.168.11.227' port='6789'/>
            </source>
            <target dev='sdf' bus='sata'/>
            <readonly/>
            <address type='drive' controller='0' bus='0' target='0' unit='5'/>
          </disk>
      ~~~
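
      To see where nova's records and the domain XML diverge, nova's block_device_mapping table can be compared against the targets above. A sketch; the exact mysql invocation depends on the deployment, and the table and column names come from nova's schema:

      ~~~
      # Rows nova tracks for this instance; the config-drive cdrom is not a BDM,
      # so no row should map device_name /dev/sdf.
      mysql -D nova -e "SELECT device_name, source_type, destination_type \
          FROM block_device_mapping \
          WHERE instance_uuid='18b1af24-39f2-4055-af9a-ef6f2b5b9faa' AND deleted=0;"
      # If /dev/sdf is absent here while the domain XML defines target sdf,
      # the cdrom occupies a device name that nova believes is free.
      ~~~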
       

      Sosreports with nova debug logging enabled are attached to the case, along with the output of `openstack server event list`, `openstack server show`, `openstack volume show`, etc.

       

      Please note this is a production environment.

      The device view from inside the affected instance (node-004, an OpenShift node):

      [root@node-004 ~]# ls /dev/
      autofs           fb0           mcelog        rtc0  sg0              tty    tty20  tty33  tty46  tty59    uhid         vcsa1  vga_arbiter
      block            fd            mem           sda   sg1              tty0   tty21  tty34  tty47  tty6     uinput       vcsa2  vhci
      bsg              full          mqueue        sda1  sg2              tty1   tty22  tty35  tty48  tty60    urandom      vcsa3  vhost-net
      bus              fuse          net           sdb   sg3              tty10  tty23  tty36  tty49  tty61    usbmon0      vcsa4  vhost-vsock
      cdrom            hidraw0       null          sdb1  sg4              tty11  tty24  tty37  tty5   tty62    usbmon1      vcsa5  watchdog
      char             hpet          nvme-fabrics  sdb2  sg5              tty12  tty25  tty38  tty50  tty63    userfaultfd  vcsa6  watchdog0
      console          hugepages     nvram         sdb3  shm              tty13  tty26  tty39  tty51  tty7     vcs          vcsu   zero
      core             hwrng         port          sdb4  snapshot         tty14  tty27  tty4   tty52  tty8     vcs1         vcsu1
      cpu              initctl       ppp           sdc   snd              tty15  tty28  tty40  tty53  tty9     vcs2         vcsu2
      cpu_dma_latency  input         ptmx          sdc1  sr0              tty16  tty29  tty41  tty54  ttyS0    vcs3         vcsu3
      cuse             kmsg          pts           sdd   stderr           tty17  tty3   tty42  tty55  ttyS1    vcs4         vcsu4
      disk             log           random        sdd1  stdin            tty18  tty30  tty43  tty56  ttyS2    vcs5         vcsu5
      dma_heap         loop-control  rfkill        sde   stdout           tty19  tty31  tty44  tty57  ttyS3    vcs6         vcsu6
      dri              mapper        rtc           sde1  termination-log  tty2   tty32  tty45  tty58  udmabuf  vcsa         vfio
      [root@node-004 ~]# lsblk
      NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
      sda      8:0    0  450G  0 disk
      └─sda1   8:1    0  450G  0 part /var/lib/containers
      sdb      8:16   0   60G  0 disk
      ├─sdb1   8:17   0    1M  0 part
      ├─sdb2   8:18   0  127M  0 part
      ├─sdb3   8:19   0  384M  0 part /boot
      └─sdb4   8:20   0 59.5G  0 part /var
                                      /sysroot/ostree/deploy/rhcos/var
                                      /usr
                                      /etc
                                      /
                                      /sysroot
      sdc      8:32   0   30G  0 disk
      └─sdc1   8:33   0   30G  0 part /var/lib/fluentd
      sdd      8:48   0   30G  0 disk
      └─sdd1   8:49   0   30G  0 part /var/log
      sde      8:64   0  100G  0 disk
      └─sde1   8:65   0  100G  0 part /var/lib/kubelet/pods/0c389be0-4b18-4d37-8a71-babe250ac5d8/volume-subpaths/entrypoint/collector/15
                                      /var/lib/kubelet/pods/b36876e4-acc3-4eb6-b0b9-50166908907b/volume-subpaths/pullcerts/twistlock-defender/8
                                      /var/lib/kubelet
      sr0     11:0    1  492K  0 rom
      [root@node-004 ~]# 
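
      Inside the guest the cdrom appears as sr0. If it is the config drive that holds target sdf in the domain XML, it should carry the conventional config-2 label; a quick check:

      ~~~
      # Config drives are conventionally labelled config-2.
      blkid /dev/sr0
      # Expected, if sr0 is the config-drive cdrom:
      #   /dev/sr0: ... LABEL="config-2" TYPE="iso9660"
      ~~~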

       

      The corresponding event on the OpenShift side:
      ~~~
         Warning   FailedAttachVolume     pod/splunk-forwarder-app-0              AttachVolume.Attach failed for volume "pvc-07fb9a16-b6a5-4fe4-93d9-38d0d18636a7" : rpc error: code = Internal desc = [ControllerPublishVolume] failed to attach volume: Volume "691f1a27-def8-4323-bcc9-e13b96038067" failed to be attached within the alloted time
      ~~~
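
      From the OpenShift side, the stuck attachment can also be inspected through the VolumeAttachment objects the CSI driver creates; a sketch using the PV name from the event above:

      ~~~
      # VolumeAttachment objects record the CSI attach state per node.
      oc get volumeattachments | grep pvc-07fb9a16-b6a5-4fe4-93d9-38d0d18636a7
      # Describe the matching object to see the attach error reported by the
      # cinder CSI controller (name placeholder is from the previous command):
      oc describe volumeattachment <name-from-previous-command>
      ~~~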

      Assignee: melanie witt (mwitt@redhat.com)
      Reporter: Matsvei Hauryliuk (rhn-support-mhauryli)
      Group: rhos-dfg-compute