RHEL-38632

Sanlock considers a powered off host as alive, breaking DR

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • rhel-9.6
    • rhel-8.6.0
    • Component: sanlock
    • Severity: Important
    • sst_logical_storage
    • ssg_platform_storage
    • Product: Red Hat Enterprise Linux
    • Architecture: x86_64

      What were you trying to do that didn't work?

      The scenario is:

      • 2 RHV Data-Centers, A (active) and B (passive), one on each site.
      • Storage is replicated from A to B.
      • Upon failure of the first site, Storage Domain replication is stopped, and the RHV setup on Site B imports the Storage Domains to initiate the DR procedure.

      That import ultimately fails.
       
      To initiate the import on Site B, the RHV hypervisor first acquires host id 250 in the ids lockspace; this part works:
       

      2024-05-25 11:57:52 52 [53295]: sanlock daemon started 3.8.4 host 4c4c4544-0051-5210-8038-c8c04f585032 (c2prdrev51is.infra.e-mip.eu)
      2024-05-25 11:58:12 72 [53295]: cmd_status 2,12
      2024-05-25 12:05:37 517 [53295]: cmd_status 2,12
      2024-05-25 12:11:52 892 [53345]: cmd_add_lockspace 2,12 4484db8c-4e14-4f86-852b-b2cffe079c02:250:/dev/4484db8c-4e14-4f86-852b-b2cffe079c02/ids:0 flags 0 timeout 10
      2024-05-25 12:11:52 892 [53345]: s1 lockspace 4484db8c-4e14-4f86-852b-b2cffe079c02:250:/dev/4484db8c-4e14-4f86-852b-b2cffe079c02/ids:0
      2024-05-25 12:11:52 893 [92233]: s1 set_lockspace_max_sectors_kb hw_kb 32767 setting 1024
      2024-05-25 12:11:52 893 [92233]: write_sysfs_size open error 13 /sys/dev/block/253:195/queue/max_sectors_kb
      2024-05-25 12:11:52 893 [92233]: set_max_sectors_kb write 1024 error -1 /dev/4484db8c-4e14-4f86-852b-b2cffe079c02/ids
      2024-05-25 12:11:52 893 [92233]: s1 set_lockspace_max_sectors_kb hw_kb 32767 set 1024 error -1
      2024-05-25 12:11:52 893 [92233]: s1 delta_acquire begin 4484db8c-4e14-4f86-852b-b2cffe079c02:250
      2024-05-25 12:11:52 893 [92233]: s1 delta_acquire write 250 93 893 4c4c4544-0051-5210-8038-c8c04f585032
      2024-05-25 12:11:52 893 [92233]: s1 delta_acquire delta_short_delay 20
      2024-05-25 12:12:12 913 [92233]: s1 delta_acquire done 250 93 893
      2024-05-25 12:12:13 914 [53345]: s1 add_lockspace done
      2024-05-25 12:12:13 914 [53345]: cmd_add_lockspace 2,12 done 0
      2024-05-25 12:12:14 914 [53295]: s1 host 1 67 676851 4c4c4544-004e-4210-804a-b5c04f465632
      2024-05-25 12:12:14 914 [53295]: s1 host 2 74 673519 4c4c4544-004e-3710-804b-b5c04f465632
      2024-05-25 12:12:14 914 [53295]: s1 host 3 70 0 4c4c4544-004e-3910-8043-b5c04f465632
      2024-05-25 12:12:14 914 [53295]: s1 host 4 66 677373 4c4c4544-004e-4310-8043-b5c04f465632
      2024-05-25 12:12:14 914 [53295]: s1 host 5 22 676370 4c4c4544-0054-4a10-804c-b7c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 6 19 676429 4c4c4544-0057-4a10-804c-b7c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 7 21 674418 4c4c4544-0054-4a10-804c-b6c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 8 19 674396 4c4c4544-0057-4a10-804c-b6c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 9 15 674430 4c4c4544-0057-4a10-804c-b8c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 10 17 674403 4c4c4544-0054-4a10-804c-b9c04f354233
      2024-05-25 12:12:14 914 [53295]: s1 host 250 93 893 4c4c4544-0051-5210-8038-c8c04f585032
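      For reference, the add_lockspace above corresponds to a request like the following (a sketch of the command-line equivalent, using the lockspace string from the log; in reality vdsm drives this through the sanlock API):

      sanlock client add_lockspace -s 4484db8c-4e14-4f86-852b-b2cffe079c02:250:/dev/4484db8c-4e14-4f86-852b-b2cffe079c02/ids:0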
      

      Now here comes the problem: the next step is acquiring the SDM resource lease (on the leases volume).

       

      2024-05-25 12:12:16 917 [53346]: cmd_acquire 2,12,56848 ci_in 2 fd 12 count 1 flags 0
      2024-05-25 12:12:16 917 [53346]: s1:r2 resource 4484db8c-4e14-4f86-852b-b2cffe079c02:SDM:/dev/4484db8c-4e14-4f86-852b-b2cffe079c02/leases:1048576
      2024-05-25 12:12:16 917 [53346]: s1:r2 paxos_acquire begin offset 1048576 0x2 512 1048576
      2024-05-25 12:12:16 917 [53346]: s1:r2 leader 562 owner 5 22 348655 dblocks 4:1122005:1122005:5:22:348655:562:0,
      2024-05-25 12:12:16 917 [53346]: s1:r2 paxos_acquire leader 562 owner 5 22 348655 max mbal[4] 1122005 our_dblock 1102250 1102250 250 92 4471 552
      2024-05-25 12:12:16 917 [53346]: s1:r2 paxos_acquire owner 5 22 348655 host_status 5 22 676370 wait_start 914
      2024-05-25 12:12:16 917 [53346]: s1:r2 paxos_acquire owner 5 delta 5 22 676370 alive
      2024-05-25 12:12:16 917 [53346]: s1:r2 acquire_disk rv -243 lver 562 at 348655
      2024-05-25 12:12:16 917 [53346]: s1:r2 acquire_token held error -243
      2024-05-25 12:12:16 917 [53346]: s1:r2 release_token r_flags 0 lver 0
      2024-05-25 12:12:16 917 [53346]: s1:r2 release_token done r_flags 0
      2024-05-25 12:12:16 917 [53346]: s1:r2 cmd_acquire 2,12,56848 acquire_token -243 lease owned by other host
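      Error -243 is SANLK_ACQUIRE_IDLIVE: sanlock believes the current lease owner (host id 5, generation 22) is still alive, so it refuses to take the lease over. For completeness, the on-disk paxos leader record for SDM can be inspected with a direct dump of the leases volume at that offset (a sketch, using the path and offset from the log above); it should show owner 5, generation 22, timestamp 348655, matching the leader line:

      sanlock direct dump /dev/4484db8c-4e14-4f86-852b-b2cffe079c02/leases:1048576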
      

      More than half an hour later, we are still failing to grab SDM from the dead host, which has been powered off for quite some time by now. Sanlock keeps hitting the same IDLIVE branch:
       

      2024-05-25 12:45:34 2914 [53345]: s10:r11 resource 4484db8c-4e14-4f86-852b-b2cffe079c02:SDM:/dev/4484db8c-4e14-4f86-852b-b2cffe079c02/leases:1048576
      2024-05-25 12:45:34 2914 [53345]: s10:r11 paxos_acquire begin offset 1048576 0x2 512 1048576
      2024-05-25 12:45:34 2914 [53345]: s10:r11 leader 562 owner 5 22 348655 dblocks 4:1122005:1122005:5:22:348655:562:0,
      2024-05-25 12:45:34 2914 [53345]: s10:r11 paxos_acquire leader 562 owner 5 22 348655 max mbal[4] 1122005 our_dblock 1102250 1102250 250 92 4471 552
      2024-05-25 12:45:34 2914 [53345]: s10:r11 paxos_acquire owner 5 22 348655 host_status 5 22 676370 wait_start 2912
      2024-05-25 12:45:34 2914 [53345]: s10:r11 paxos_acquire owner 5 delta 5 22 676370 alive
      2024-05-25 12:45:34 2914 [53345]: s10:r11 acquire_disk rv -243 lver 562 at 348655
      2024-05-25 12:45:34 2914 [53345]: s10:r11 acquire_token held error -243 
      2024-05-25 12:45:34 2914 [53345]: s10:r11 release_token r_flags 0 lver 0 
      2024-05-25 12:45:34 2914 [53345]: s10:r11 release_token done r_flags 0
      2024-05-25 12:45:34 2914 [53345]: s10:r11 cmd_acquire 2,12,56848 acquire_token -243 lease owned by other host

      Note that the owner's host timestamp is still exactly 676370; this owner is definitely not alive.

      So sanlock kept returning IDLIVE. The lease is owned by the host with id 5 on the primary site, and its timestamp never moves from 676370. That host is powered off with no access to storage; the whole site lost power.

      My understanding is that, for some reason, we entered this if clause, which is what gets us to IDLIVE:

      if ((host_id_leader.timestamp != last_timestamp) ||
          (hs.last_live && (hs.last_check == hs.last_live))) {
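      My reading of the two conditions (the comments are my interpretation, not from the source):

      /* 1) the owner's delta lease timestamp on disk changed since we
            started waiting, i.e. the owner renewed its host id lease */
      host_id_leader.timestamp != last_timestamp

      /* 2) our own lockspace monitor's in-memory host_status says the
            owner was seen live at its most recent check */
      hs.last_live && (hs.last_check == hs.last_live)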

      IMHO it shouldn't be the first condition, as that would evaluate to 676370 != 0000676370, which is false. A direct dump at the time:

        offset                            lockspace                                         resource  timestamp  own  gen lver
      00002048 4484db8c-4e14-4f86-852b-b2cffe079c02             4c4c4544-0054-4a10-804c-b7c04f354233 0000676370 0005 0022
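      (The dump above is of the owner's delta lease on the ids volume; presumably it was produced with something like the following, the path being an assumption based on the logs:)

      sanlock direct dump /dev/4484db8c-4e14-4f86-852b-b2cffe079c02/ids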
      

      The sanlock logs seem to suggest it simply jumps to the conclusion that the host is live when it isn't. AFAIK, to be considered live, sanlock on this host needs to see a renewal from the host currently owning the lock, but it never sees one, nor does it wait for one.
      I could be way off, but my impression is that we are taking the shortcut introduced in the commit below, though I don't really understand why.

       

      commit f1d109a5b5017c431db16b02f056a806f584c077
      Date:   Fri Jan 30 16:08:37 2015 -0600
       
          sanlock: fail acquire with IDLIVE more quickly 
      

      Maybe we are going into this part of the if clause instead of waiting to see if the timestamp changes:

      (hs.last_live && (hs.last_check == hs.last_live))

      If the in-memory host_status for host 5 was only populated when this host added the lockspace (so last_check and last_live were both set by the very first read of its non-zero delta lease), that condition would be true even though the owner never actually renews.

       

      Please provide the package NVR for which the bug is seen:

      sanlock-3.8.4-3.el8.x86_64

      How reproducible:

      Unable to reproduce so far; in my attempts it works and does what I expect at that stage:

      2024-05-27 12:26:55 463 [815]: s1:r2 resource dacc82b4-abb0-4905-957c-4afcc5e586db:SDM:/dev/dacc82b4-abb0-4905-957c-4afcc5e586db/leases:1048576
      2024-05-27 12:26:55 463 [815]: s1:r2 paxos_acquire begin offset 1048576 0x2 512 1048576
      2024-05-27 12:26:55 463 [815]: s1:r2 leader 15 owner 1 14 1338 dblocks 0:28001:28001:1:14:1338:15:0,
      2024-05-27 12:26:55 463 [815]: s1:r2 paxos_acquire leader 15 owner 1 14 1338 max mbal[0] 28001 our_dblock 26250 26250 250 2 1172 14
      2024-05-27 12:26:55 463 [815]: s1 host_info 1 use own io_timeout 10
      2024-05-27 12:26:55 463 [815]: s1:r2 paxos_acquire owner 1 14 1338 host_status 0 0 0 wait_start 463
      2024-05-27 12:26:56 464 [808]: s1 host 1 14 3123 5ec17b69-be22-5d56-bfcc-44580c0a49dc
      2024-05-27 12:26:56 464 [808]: s1 host 2 9 182972 75b0ce70-0018-5a47-9457-a286253002ca
      2024-05-27 12:26:56 464 [808]: s1 host 250 3 442 75b0ce70-0018-5a47-9457-a286253002ca
      2024-05-27 12:26:56 464 [815]: s1:r2 paxos_acquire leader 15 owner 1 14 1338
      ....
      2024-05-27 12:29:16 604 [815]: s1:r2 paxos_acquire leader 15 owner 1 14 1338
      2024-05-27 12:29:16 604 [815]: s1:r2 paxos_acquire owner 1 14 1338 delta 1 14 3123 dead 604-463>140
      2024-05-27 12:29:16 604 [815]: s1:r2 paxos_acquire leader 15 owner 1 14 1338
      2024-05-27 12:29:16 604 [815]: s1:r2 ballot 16 phase1 write mbal 30250
      2024-05-27 12:29:16 604 [815]: s1:r2 ballot 16 phase1 read 249:30250:0:0:0:0:16:0,
      2024-05-27 12:29:16 604 [815]: s1:r2 ballot 16 phase2 write bal 30250 inp 250 3 604 q_max -1
      2024-05-27 12:29:16 604 [815]: s1:r2 ballot 16 phase2 read 249:30250:30250:250:3:604:16:0,
      2024-05-27 12:29:16 604 [815]: s1:r2 ballot 16 commit self owner 250 3 604
      2024-05-27 12:29:16 604 [815]: s1:r2 acquire_disk rv 1 lver 16 at 604
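      The 140 in "dead 604-463>140" matches the usual sanlock timeout arithmetic, assuming host_dead_seconds = 8 * io_timeout + watchdog fire timeout (60s):

      8 * 10 + 60 = 140   (io_timeout 10, per the "use own io_timeout 10" line above)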
      

      Expected results

      The SDM lease is acquired by the host on the DR site, and DR can proceed.

      Actual results

      The host on the DR site cannot get the SDM lease, so DR failover does not work.

            Assignee: David Teigland
            Reporter: Germano Veit Michel
            QA Contact: Cluster QE