Type: Bug
Resolution: Not a Bug
Priority: Major
Affects Version: 4.15
Sprint: OCPEDGE Sprint 250
Description of problem:
Found during CNV testing: we create a LUN disk, and it leaves LVMS dysfunctional.
akalenyu's guess is that LVMS picks up the faulty test disks as PVs.
The PVC stays Pending and the PV is never created.
$ oc describe pvc prime-9819eed5-e3be-4ae7-bc3e-31a4da5e5240
....
Events:
  Type     Reason                Age                From                         Message
  ----     ------                ----               ----                         -------
  Normal   WaitForFirstConsumer  79s (x2 over 79s)  persistentvolume-controller  waiting for first consumer to be created before binding
  Normal   WaitForPodScheduled   79s                persistentvolume-controller  waiting for pod importer-prime-9819eed5-e3be-4ae7-bc3e-31a4da5e5240 to be scheduled
  Normal   ExternalProvisioning  11s (x7 over 77s)  persistentvolume-controller  Waiting for a volume to be created either by the external provisioner 'topolvm.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
  Normal   Provisioning          11s (x7 over 77s)  topolvm.io_topolvm-controller-765c99856c-tk546_2c2d5fc1-2e78-4a05-a0ba-b22ac720e26f  External provisioner is provisioning volume for claim "default/prime-9819eed5-e3be-4ae7-bc3e-31a4da5e5240"
  Warning  ProvisioningFailed    10s (x7 over 77s)  topolvm.io_topolvm-controller-765c99856c-tk546_2c2d5fc1-2e78-4a05-a0ba-b22ac720e26f  failed to provision volume with StorageClass "lvms-vg1": rpc error: code = Internal desc = exit status 5
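The provisioner-side detail has to be pulled from the pod that invokes LVM on the node. A sketch of how to get at it (the namespace is where LVMS typically runs, and <topolvm-node-pod> is a placeholder; pick the pod on the affected node):
$ oc -n openshift-storage get pods
$ oc -n openshift-storage logs <topolvm-node-pod>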
In the log, we see:
{"level":"info","ts":"2024-02-28T16:19:58Z","msg":"invoking LVM command","controller":"logicalvolume","controllerGroup":"topolvm.io","controllerKind":"LogicalVolume","LogicalVolume":{"name":"pvc-8a23930c-1d92-4c9a-844e-2b06af48ba4f"},"namespace":"","name":"pvc-8a23930c-1d92-4c9a-844e-2b06af48ba4f","reconcileID":"1809a914-4e99-4861-b538-35807bad326e","args":["lvcreate","-T","vg1/thin-pool-1","-n","041071d3-0074-45ac-b6d0-49558f97d225","-V","1073741824b","-W","y","-y"]} WARNING: Couldn't find device with uuid ohBUUc-R3Xh-oAbA-yGsl-rev7-Xmuv-4EeQJl. WARNING: Couldn't find device with uuid qfXeIs-aBc8-0eol-XPT9-BK45-GP1b-WNGusC. WARNING: Couldn't find device with uuid jWPS0x-N955-flpA-sjdR-ufvF-dSgg-0ajGaF. WARNING: Couldn't find device with uuid K7M9QZ-zEvp-hwJE-zRG2-4A2K-4lrw-l3ILE2. WARNING: Couldn't find device with uuid oMZ5Lm-xA03-IDe9-BIlM-JNcx-0n1M-RzTRj0. WARNING: Couldn't find device with uuid Zxzmcs-6SiS-tIUr-Gu4u-llk8-iG2N-qrxZ8u. WARNING: Couldn't find device with uuid zPV3eW-lf2M-7dWm-hyic-3CQt-f53d-FL1cG3. WARNING: Couldn't find device with uuid 4UAniO-KL3o-u1x5-ojsm-QVC8-tuBs-igI6ZF. WARNING: VG vg1 is missing PV ohBUUc-R3Xh-oAbA-yGsl-rev7-Xmuv-4EeQJl (last written to [unknown]). WARNING: VG vg1 is missing PV qfXeIs-aBc8-0eol-XPT9-BK45-GP1b-WNGusC (last written to [unknown]). WARNING: VG vg1 is missing PV jWPS0x-N955-flpA-sjdR-ufvF-dSgg-0ajGaF (last written to [unknown]). WARNING: VG vg1 is missing PV K7M9QZ-zEvp-hwJE-zRG2-4A2K-4lrw-l3ILE2 (last written to [unknown]). WARNING: VG vg1 is missing PV oMZ5Lm-xA03-IDe9-BIlM-JNcx-0n1M-RzTRj0 (last written to [unknown]). WARNING: VG vg1 is missing PV Zxzmcs-6SiS-tIUr-Gu4u-llk8-iG2N-qrxZ8u (last written to [unknown]). WARNING: VG vg1 is missing PV zPV3eW-lf2M-7dWm-hyic-3CQt-f53d-FL1cG3 (last written to [unknown]). WARNING: VG vg1 is missing PV 4UAniO-KL3o-u1x5-ojsm-QVC8-tuBs-igI6ZF (last written to [unknown]). Cannot change VG vg1 while PVs are missing. See vgreduce --removemissing and vgextend --restoremissing. Cannot process volume group vg1
After running 'vgreduce --removemissing', the cluster is back to normal and PVCs get Bound.
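For reference, the recovery steps, run from a node debug shell (a sketch; <node-name> is a placeholder for the affected node):
$ oc debug node/<node-name>
sh-5.1# chroot /host
sh-5.1# vgreduce --removemissing vg1   # drop the missing PVs the LUN test left behind
sh-5.1# vgs vg1                        # confirm the VG is consistent again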
However, we see that the reported free storage is smaller than expected:
sh-5.1# vgs
  Devices file sys_wwid naa.60014051bbdbe3f239940cf91bc66e8e PVID ohBUUcR3XhoAbAyGslrev7Xmuv4EeQJl last seen on /dev/sdc not found.
  Devices file sys_wwid naa.3333333000008ca0 PVID jkAlTcIBAKq69CHiK21ryYhIXU0SjUTO last seen on /dev/sdc not found.
  Devices file sys_wwid naa.6001405f04919315d434c29b851e594e PVID nyx5Vm49UAAtHXT0Cv0ydl1662naI30q last seen on /dev/sdc not found.
  VG  #PV #LV #SN Attr   VSize   VFree
  vg1   1  15   0 wz--n- 446.62g <43.57g
sh-5.1# lvs
  Devices file sys_wwid naa.60014051bbdbe3f239940cf91bc66e8e PVID ohBUUcR3XhoAbAyGslrev7Xmuv4EeQJl last seen on /dev/sdc not found.
  Devices file sys_wwid naa.3333333000008ca0 PVID jkAlTcIBAKq69CHiK21ryYhIXU0SjUTO last seen on /dev/sdc not found.
  Devices file sys_wwid naa.6001405f04919315d434c29b851e594e PVID nyx5Vm49UAAtHXT0Cv0ydl1662naI30q last seen on /dev/sdc not found.
  LV                                   VG  Attr       LSize    Pool        Origin Data%  Meta%  Move Log Cpy%Sync Convert
  048f1af2-8ea3-4562-94e5-e1853c949dd6 vg1 Vwi-a-tz--   30.00g thin-pool-1        16.67
  51de4f0f-e4d1-42db-8695-63f81284fd0c vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  6835a099-2d94-4639-8179-f40e90040262 vg1 Vwi-a-tz--   30.00g thin-pool-1        33.33
  6eca9314-4619-454e-bfdc-de9280dd7146 vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  73db819f-e97b-458a-9707-64edc8a93a99 vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  75eaa9fe-a61d-4a67-a8c9-bd111323c5fe vg1 Vwi-a-tz--  512.00m thin-pool-1         0.00
  84793682-1ed5-449a-97fd-6f157975e364 vg1 Vwi-a-tz--   30.00g thin-pool-1        33.33
  863bbc33-571c-47f5-a446-f6962bcecca8 vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  88db478b-1183-4d69-8c2d-0baa6597a49c vg1 Vwi-a-tz--   30.00g thin-pool-1        33.33
  a8f8cbb6-358b-4ab8-8865-60c48a9c8f5d vg1 Vwi-a-tz--   30.00g thin-pool-1        33.33
  ad21e499-da7e-4f62-ad65-f9f8ab63de93 vg1 Vwi-a-tz--   30.00g thin-pool-1        33.33
  c9423aa9-103d-4da7-b962-aa582907a8f0 vg1 Vwi-a-tz--   30.00g thin-pool-1        26.67
  d14b3e7c-96bb-429b-90e1-b9ed4d94151d vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  db2dcf3f-73e6-419f-be9a-312ade2bc79c vg1 Vwi-a-tz--    1.00g thin-pool-1         8.13
  thin-pool-1                          vg1 twi-aotz-- <402.66g                    15.77  12.35
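A plausible explanation for the low free space: vgreduce --removemissing shrank vg1 to the single surviving PV, and thin-pool-1 still reserves <402.66g of the 446.62g that PV provides, leaving roughly the <43.57g shown as VFree. A quick way to check this (standard LVM reporting fields):
sh-5.1# pvs -o pv_name,vg_name,pv_size,pv_free        # how many PVs back vg1 now
sh-5.1# vgs -o vg_name,pv_count,vg_size,vg_free vg1   # VG capacity after the removal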
Also, we see that after the LUN creation the empty-lvmdevices fix is not honored, so this is a regression of https://issues.redhat.com/browse/OCPBUGS-5223:
sh-5.1# cat /etc/lvm/devices/system.devices
# LVM uses devices listed in this file.
# Created by LVM command vgextend pid 3989733 at Thu Feb 29 13:11:30 2024
VERSION=1.1.38
IDTYPE=sys_wwid IDNAME=naa.62cea7f05051440026bda6e52583db0c DEVNAME=/dev/sdb PVID=2919x7klrcW0YR6AfZ56vYHnSzkhmmdd
IDTYPE=sys_wwid IDNAME=naa.60014051bbdbe3f239940cf91bc66e8e DEVNAME=/dev/sdc PVID=ohBUUcR3XhoAbAyGslrev7Xmuv4EeQJl
IDTYPE=sys_wwid IDNAME=naa.3333333000008ca0 DEVNAME=/dev/sdc PVID=jkAlTcIBAKq69CHiK21ryYhIXU0SjUTO
IDTYPE=sys_wwid IDNAME=naa.6001405f04919315d434c29b851e594e DEVNAME=/dev/sdc PVID=nyx5Vm49UAAtHXT0Cv0ydl1662naI30q
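If the immediate goal is only to drop the stale /dev/sdc entries the LUN left behind, they can be removed one PVID at a time (a manual cleanup sketch; the PVIDs are the three stale ones listed above, and /dev/sdb is presumably the real vg1 PV):
sh-5.1# lvmdevices --delpvid ohBUUcR3XhoAbAyGslrev7Xmuv4EeQJl
sh-5.1# lvmdevices --delpvid jkAlTcIBAKq69CHiK21ryYhIXU0SjUTO
sh-5.1# lvmdevices --delpvid nyx5Vm49UAAtHXT0Cv0ydl1662naI30q
sh-5.1# lvmdevices    # verify that only the /dev/sdb entry remains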
LVMS team, please feel free to open another bug if these turn out to need two separate fixes.
Version-Release number of selected component (if applicable):
4.15
ClusterID: f3901f31-d4e9-4a25-b90c-95d4e7a1ce88
ClusterVersion: Stable at "4.15.0-rc.7"
ClusterOperators: All healthy and stable
Steps to Reproduce:
Create a LUN disk
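For context, "LUN disk" here means a KubeVirt/CNV disk exposed to the guest as a SCSI LUN. A minimal sketch of such a VMI (the VMI name, PVC name, and sizes are made-up placeholders, not taken from this cluster):
$ oc apply -f - <<'EOF'
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: lun-test-vmi              # hypothetical name
spec:
  domain:
    devices:
      disks:
      - name: lundisk
        lun: {}                   # present the volume as a SCSI LUN rather than a plain disk
    resources:
      requests:
        memory: 1Gi
  volumes:
  - name: lundisk
    persistentVolumeClaim:
      claimName: lun-pvc          # hypothetical block-mode PVC backing the LUN
EOF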
Actual results:
The LVMS provisioner breaks for no obvious reason, and the steps to fix it can only be found in the log (which is itself quite long).
Expected results:
LUN disks should not break the provisioner.
Additional info:
relates to: OCPSTRAT-1298 LVMS: make device discovery policy configurable