Bug
Resolution: Unresolved
rhel-9.8
rhel-storage-lvm
After the following test scenario was used to set up the environment and verify RHEL-117154|RHEL-108163, the machine was left in a state where sanlock could not be started again. Is there a way to clean up from this without a reboot?
kernel-5.14.0-625.el9 BUILT: Wed Oct 15 11:32:28 AM CEST 2025
lvm2-2.03.33-1.el9 BUILT: Tue Sep 30 02:15:40 PM CEST 2025
lvm2-libs-2.03.33-1.el9 BUILT: Tue Sep 30 02:15:40 PM CEST 2025
lvm2-lockd-2.03.33-1.el9 BUILT: Tue Sep 30 02:15:40 PM CEST 2025
sanlock-4.1.0-1.el9 BUILT: Thu Oct 9 02:00:39 PM CEST 2025
sanlock-lib-4.1.0-1.el9 BUILT: Thu Oct 9 02:00:39 PM CEST 2025
# Scenario that set up this state:
SCENARIO - force_remove_shared_vdo_vg_wo_global_lock_wo_daemons_running: Test the new force lockopt remove option when no global lock exists (RHEL-117154|RHEL-108163)
Present shared storage view and enable locking on other nodes
Setting use_lvmlockd to enable
Setting lvmlocal.conf host_id to 990
(virt-495.cluster-qe.lab.eng.brq.redhat.com): systemctl start sanlock
(virt-495.cluster-qe.lab.eng.brq.redhat.com): systemctl start lvmlockd
adding entry to the devices file for /dev/sda on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sda
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sda
pvcreate --yes -ff --nolock /dev/sda
Physical volume "/dev/sda" successfully created.
adding entry to the devices file for /dev/sdb on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sdb
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sdb
pvcreate --yes -ff --nolock /dev/sdb
Physical volume "/dev/sdb" successfully created.
adding entry to the devices file for /dev/sdc on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sdc
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sdc
pvcreate --yes -ff --nolock /dev/sdc
Physical volume "/dev/sdc" successfully created.
adding entry to the devices file for /dev/sdd on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sdd
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sdd
pvcreate --yes -ff --nolock /dev/sdd
Physical volume "/dev/sdd" successfully created.
adding entry to the devices file for /dev/sde on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sde
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sde
pvcreate --yes -ff --nolock /dev/sde
Physical volume "/dev/sde" successfully created.
adding entry to the devices file for /dev/sdf on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --adddev /dev/sdf
creating PV on virt-495.cluster-qe.lab.eng.brq.redhat.com using device /dev/sdf
pvcreate --yes -ff --nolock /dev/sdf
Physical volume "/dev/sdf" successfully created.
creating VG on virt-495.cluster-qe.lab.eng.brq.redhat.com using PV(s) /dev/sda
vgcreate --shared vdo_sanity_global /dev/sda
Enabling sanlock global lock
Logical volume "lvmlock" created.
Volume group "vdo_sanity_global" successfully created
VG vdo_sanity_global starting sanlock lockspace
Starting locking. Waiting until locks are ready...
creating VG on virt-495.cluster-qe.lab.eng.brq.redhat.com using PV(s) /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
vgcreate --shared vdo_sanity_force_remove /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
Logical volume "lvmlock" created.
Volume group "vdo_sanity_force_remove" successfully created
VG vdo_sanity_force_remove starting sanlock lockspace
Starting locking. Waiting until locks are ready...
lvcreate --yes --type vdo -n vdo_lv -aey -L 25G vdo_sanity_force_remove -V 25G
Wiping vdo signature on /dev/vdo_sanity_force_remove/vpool0.
The VDO volume can address 22 GB in 11 data slabs, each 2 GB.
It can grow to address at most 16 TB of physical storage in 8192 slabs.
If a larger maximum size might be needed, use bigger slabs.
Logical volume "vdo_lv" created.
deactivating LV vdo_lv on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvchange --yes -an vdo_sanity_force_remove/vdo_lv
(virt-495.cluster-qe.lab.eng.brq.redhat.com): systemctl stop sanlock
(virt-495.cluster-qe.lab.eng.brq.redhat.com): systemctl stop lvmlockd
WARNING: lvmlockd process is not running.
Reading without shared global lock.
Reading VG vdo_sanity_force_remove without a lock.
Reading VG vdo_sanity_global without a lock.
vgremove vdo_sanity_force_remove
vgremove -ff vdo_sanity_force_remove
vgremove --nolocking --yes vdo_sanity_force_remove
vgremove --lockopt force vdo_sanity_force_remove
Check for new override flag when no global lock exists (RHEL-117154|RHEL-108163)
vgremove --nolocking --lockopt force --yes vdo_sanity_force_remove
Volume group "vdo_sanity_force_remove" successfully removed
vgremove --nolocking --lockopt force --yes vdo_sanity_global
Volume group "vdo_sanity_global" successfully removed
Setting use_lvmlockd to disable
Disabling lvmlocal.conf use of host_id
removing pv /dev/sda on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sda" successfully wiped.
removing entry from the devices file for /dev/sda on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sda
removing pv /dev/sdb on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sdb" successfully wiped.
removing entry from the devices file for /dev/sdb on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sdb
removing pv /dev/sdc on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sdc" successfully wiped.
removing entry from the devices file for /dev/sdc on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sdc
removing pv /dev/sdd on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sdd" successfully wiped.
removing entry from the devices file for /dev/sdd on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sdd
removing pv /dev/sde on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sde" successfully wiped.
removing entry from the devices file for /dev/sde on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sde
removing pv /dev/sdf on virt-495.cluster-qe.lab.eng.brq.redhat.com
Labels on physical volume "/dev/sdf" successfully wiped.
removing entry from the devices file for /dev/sdf on virt-495.cluster-qe.lab.eng.brq.redhat.com
lvmdevices -y --config devices/scan_lvs=1 --deldev /dev/sdf
Searching for alignment inconsistency warnings in /var/log/messages
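Context, not part of the scenario output: at the "systemctl stop sanlock" / "systemctl stop lvmlockd" step above, the sanlock lockspaces for vdo_sanity_global and vdo_sanity_force_remove were presumably still joined, which is likely why the sanlock stop later times out. The test skips the normal lockspace teardown on purpose in order to exercise the force-remove path; for comparison, the usual shutdown order would look roughly like the sketch below (standard lvmlockd/sanlock commands, not taken from the test run):
# stop the VG lockspaces first, then the daemons
vgchange --lockstop vdo_sanity_force_remove
vgchange --lockstop vdo_sanity_global
# or drop every lockspace known to lvmlockd in one step
lvmlockctl --stop-lockspaces
systemctl stop lvmlockd
systemctl stop sanlock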
# POST Scenario state
[root@virt-495 ~]# systemctl status sanlock
× sanlock.service - Shared Storage Lease Manager
Loaded: loaded (/usr/lib/systemd/system/sanlock.service; disabled; preset: disabled)
Active: failed (Result: timeout) since Tue 2025-11-04 18:12:45 CET; 7min ago
Duration: 1min 38.258s
Docs: man:sanlock(8)
Process: 1824 ExecStart=/usr/sbin/sanlock daemon (code=exited, status=0/SUCCESS)
Main PID: 1828
Tasks: 5 (limit: 24974)
Memory: 27.0M (peak: 31.3M)
CPU: 887ms
CGroup: /system.slice/sanlock.service
└─1828 /usr/sbin/sanlock daemon
Nov 04 18:08:07 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Starting Shared Storage Lease Manager...
Nov 04 18:08:07 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Started Shared Storage Lease Manager.
Nov 04 18:08:07 virt-495.cluster-qe.lab.eng.brq.redhat.com sanlock[1828]: sanlock daemon started 4.1.0 host 7401488b-d1f8-4a72-bc64-7e5a54730b9a.virt-495.cl (virt-495.cluster-qe.lab.eng.brq.redhat.com)
Nov 04 18:09:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Stopping Shared Storage Lease Manager...
Nov 04 18:09:45 virt-495.cluster-qe.lab.eng.brq.redhat.com sanlock[1828]: 2025-11-04 18:09:45 780 [1828]: helper pid 1829 term signal 15
Nov 04 18:11:15 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: State 'stop-sigterm' timed out. Skipping SIGKILL.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: State 'final-sigterm' timed out. Skipping SIGKILL. Entering failed mode.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed with result 'timeout'.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Unit process 1828 (sanlock) remains running after unit stopped.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Stopped Shared Storage Lease Manager.
[root@virt-495 ~]# systemctl stop sanlock
[root@virt-495 ~]# echo $?
0
[root@virt-495 ~]# systemctl start sanlock
Job for sanlock.service failed because of unavailable resources or another system error.
See "systemctl status sanlock.service" and "journalctl -xeu sanlock.service" for details.
[root@virt-495 ~]# echo $?
1
[root@virt-495 ~]# sanlock gets -h 1
gets error -111
[root@virt-495 ~]# dmsetup ls
rhel_virt--495-root (253:0)
rhel_virt--495-swap (253:1)
[root@virt-495 ~]# sanlock status
[root@virt-495 ~]# systemctl status sanlock.service
× sanlock.service - Shared Storage Lease Manager
Loaded: loaded (/usr/lib/systemd/system/sanlock.service; disabled; preset: disabled)
Active: failed (Result: resources) since Tue 2025-11-04 18:21:00 CET; 2min 21s ago
Duration: 1min 38.258s
Docs: man:sanlock(8)
CPU: 897ms
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed to run 'start' task: Device or resource busy
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Starting Shared Storage Lease Manager...
Nov 04 18:21:00 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed with result 'resources'.
Nov 04 18:21:00 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Failed to start Shared Storage Lease Manager.
[root@virt-495 ~]# journalctl -xeu sanlock.service
░░ A start job for unit sanlock.service has finished successfully.
░░
░░ The job identifier is 1588.
Nov 04 18:08:07 virt-495.cluster-qe.lab.eng.brq.redhat.com sanlock[1828]: sanlock daemon started 4.1.0 host 7401488b-d1f8-4a72-bc64-7e5a54730b9a.virt-495.cl (virt-495.cluster-qe.lab.eng.brq.redhat.com)
Nov 04 18:09:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Stopping Shared Storage Lease Manager...
░░ Subject: A stop job for unit sanlock.service has begun execution
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ A stop job for unit sanlock.service has begun execution.
░░
░░ The job identifier is 4692.
Nov 04 18:09:45 virt-495.cluster-qe.lab.eng.brq.redhat.com sanlock[1828]: 2025-11-04 18:09:45 780 [1828]: helper pid 1829 term signal 15
Nov 04 18:11:15 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: State 'stop-sigterm' timed out. Skipping SIGKILL.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: State 'final-sigterm' timed out. Skipping SIGKILL. Entering failed mode.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed with result 'timeout'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ The unit sanlock.service has entered the 'failed' state with result 'timeout'.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Unit process 1828 (sanlock) remains running after unit stopped.
Nov 04 18:12:45 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Stopped Shared Storage Lease Manager.
░░ Subject: A stop job for unit sanlock.service has finished
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ A stop job for unit sanlock.service has finished.
░░
░░ The job identifier is 4692 and the job result is done.
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed to run 'start' task: Device or resource busy
Nov 04 18:20:59 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Starting Shared Storage Lease Manager...
░░ Subject: A start job for unit sanlock.service has begun execution
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ A start job for unit sanlock.service has begun execution.
░░
░░ The job identifier is 7169.
Nov 04 18:21:00 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: sanlock.service: Failed with result 'resources'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ The unit sanlock.service has entered the 'failed' state with result 'resources'.
Nov 04 18:21:00 virt-495.cluster-qe.lab.eng.brq.redhat.com systemd[1]: Failed to start Shared Storage Lease Manager.
░░ Subject: A start job for unit sanlock.service has failed
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ A start job for unit sanlock.service has finished with a failure.
░░
░░ The job identifier is 7169 and the job result is failed.
[root@virt-495 ~]# ps -elf | grep 7169
0 S root 3847 1607 0 80 0 - 1604 pipe_r 18:25 pts/0 00:00:00 grep --color=auto 7169
[root@virt-495 ~]# ps -elf | grep 4692
0 S root 3875 1607 0 80 0 - 1604 pipe_r 18:28 pts/0 00:00:00 grep --color=auto 4692
[root@virt-495 ~]# ps -elf | grep sanlock
0 S root 3879 1607 0 80 0 - 1604 pipe_r 18:28 pts/0 00:00:00 grep --color=auto sanlock
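A possible non-reboot cleanup sketch, offered only as a guess: it assumes no real sanlock process remains (matching the ps output above) and that systemd refuses the start because it still tracks a stale entry for the unit's cgroup. These are standard systemd/sanlock commands, not a verified fix for this state:
# see which PIDs, if any, systemd still attributes to the unit
systemd-cgls --unit sanlock.service
cat /sys/fs/cgroup/system.slice/sanlock.service/cgroup.procs
# clear the failed state and retry the start
systemctl reset-failed sanlock.service
systemctl start sanlock
# if a stale sanlock daemon were still alive, it could be told to exit directly
sanlock client shutdown -f 1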