RHEL-60001

DLM: kernel warning in dlm_add_requestqueue on s390x

    • Bug
    • Resolution: Unresolved
    • Normal
    • None
    • rhel-9.5
    • dlm
    • None
    • No
    • Low
    • sst_logical_storage
    • ssg_filesystems_storage_and_HA
    • None
    • False
    • s390x
    • None

      While running GFS2 regression tests on kernel-5.14.0-503.6.1.el9_5.s390x, I'm seeing the following warning in dlm_add_requestqueue. Shortly afterwards, the node is fenced off.

      How reproducible:
      Often, but not consistently.

      DISTRO=RHEL-9.5.0-20240922.2
      Testcase: brawl-gfs2-jdata-reg/d_io-flock

      Sep 24 16:20:14 m04-rh3 kernel: -----------[ cut here ]-----------
      Sep 24 16:20:14 m04-rh3 kernel: memcpy: detected field-spanning write (size 112) of single field "&e->request" at fs/dlm/requestqueue.c:45 (size 88)
      Sep 24 16:20:14 m04-rh3 kernel: WARNING: CPU: 0 PID: 56 at fs/dlm/requestqueue.c:45 dlm_add_requestqueue+0x102/0x128 [dlm]
      Sep 24 16:20:14 m04-rh3 kernel: Modules linked in: gfs2 dlm tls rfkill sunrpc vmur vfio_ccw mdev vfio_iommu_type1 vfio iommufd drm fuse i2c_core drm_panel_orientation_quirks lcs ctcm fsm zfcp scsi_transport_fc xfs libcrc32c ghash_s390 prng aes_s390 qeth_l2 bridge stp llc des_s390 libdes sha3_512_s390 sha3_256_s390 qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod dasd_fba_mod dasd_eckd_mod dasd_mod pkey zcrypt
      Sep 24 16:20:14 m04-rh3 kernel: CPU: 0 PID: 56 Comm: kworker/u256:2 Kdump: loaded Not tainted 5.14.0-503.6.1.el9_5.s390x #1
      Sep 24 16:20:14 m04-rh3 kernel: Hardware name: IBM 3932 A02 Z06 (z/VM 7.3.0)
      Sep 24 16:20:14 m04-rh3 kernel: Workqueue: dlm_recv process_recv_sockets [dlm]
      Sep 24 16:20:14 m04-rh3 kernel: Krnl PSW : 0704c00180000000 000003ff802b4e7e (dlm_add_requestqueue+0x106/0x128 [dlm])
      Sep 24 16:20:14 m04-rh3 kernel:           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
      Sep 24 16:20:14 m04-rh3 kernel: Krnl GPRS: c0000000ffffbfff 0000000000000027 0000000000000074 00000000c22d6a88
      Sep 24 16:20:14 m04-rh3 kernel:           0000037fffef38f0 0000037fffef38e8 0000000000000000 0000000082f47458
      Sep 24 16:20:14 m04-rh3 kernel:           0000000000000070 0000000084226018 000000008435f000 0000000082f47440
      Sep 24 16:20:14 m04-rh3 kernel:           0000000082c12e00 0000000000000000 000003ff802b4e7a 0000037fffef3aa0
      Sep 24 16:20:14 m04-rh3 kernel: Krnl Code: 000003ff802b4e6e: c020000323dd  larl   %r2,000003ff80319628
                                                 000003ff802b4e74: c0e5ffff10e0  brasl  %r14,000003ff80297034
                                                #000003ff802b4e7a: af000000      mc     0,0
                                                >000003ff802b4e7e: a7f4ffb3      brc    15,000003ff802b4de4
                                                 000003ff802b4e82: ec37ffa800d8  ahik   %r3,%r7,-88
                                                 000003ff802b4e88: c0200003239e  larl   %r2,000003ff803195c4
                                                 000003ff802b4e8e: b9140033      lgfr   %r3,%r3
                                                 000003ff802b4e92: eb7ff0a00004  lmg    %r7,%r15,160(%r15)
      Sep 24 16:20:14 m04-rh3 kernel: Call Trace:
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802b4e7e>] dlm_add_requestqueue+0x106/0x128 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: ([<000003ff802b4e7a>] dlm_add_requestqueue+0x102/0x128 [dlm])
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802a486e>] dlm_receive_buffer+0x1be/0x200 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802abc9c>] dlm_midcomms_receive_buffer_3_2+0x2b4/0x3e8 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802abeda>] dlm_process_incoming_buffer+0x10a/0x1d8 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802ae062>] receive_from_sock+0xca/0x240 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802ae20e>] process_recv_sockets+0x36/0x48 [dlm] 
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1028892>] process_one_work+0x1c2/0x458 
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c102977e>] worker_thread+0x3ce/0x528 
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1032c38>] kthread+0x108/0x110 
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c0fb2ebc>] __ret_from_fork+0x3c/0x58 
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1984e12>] ret_from_fork+0xa/0x30 
      Sep 24 16:20:14 m04-rh3 kernel: Last Breaking-Event-Address:
      Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1003aec>] __warn_printk+0xd4/0xe0
      Sep 24 16:20:14 m04-rh3 kernel: --[ end trace 0000000000000000 ]--
      Sep 24 16:20:14 m04-rh3 kernel: dlm: brawl0: dlm_recover 3 generation 3 done: 120 ms
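
The sizes in the memcpy warning can be extracted mechanically when comparing occurrences across runs. The following is a minimal sketch, not an existing tool: `parse_field_spanning_write` and the regular expression are hypothetical and were written only against the single message quoted above. It shows that the copy was 112 bytes into a field the compiler sizes at 88 bytes, i.e. 24 bytes beyond the declared field. The log alone does not say whether those 24 bytes were within the memory allocated for the entry.

```python
import re

# Warning text copied verbatim from the kernel log in this report.
WARNING = ('memcpy: detected field-spanning write (size 112) of single field '
           '"&e->request" at fs/dlm/requestqueue.c:45 (size 88)')

# Hypothetical pattern, derived from the one message above; other kernel
# versions may word the message differently.
PATTERN = re.compile(
    r'(?P<func>\w+): detected field-spanning write \(size (?P<write>\d+)\) '
    r'of single field "(?P<field>[^"]+)" at (?P<file>[^:\s]+):(?P<line>\d+) '
    r'\(size (?P<dest>\d+)\)'
)


def parse_field_spanning_write(text):
    """Return the details of a field-spanning write warning, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    write_size = int(match.group('write'))
    dest_size = int(match.group('dest'))
    return {
        'func': match.group('func'),
        'field': match.group('field'),
        'file': match.group('file'),
        'line': int(match.group('line')),
        'write_size': write_size,
        'dest_size': dest_size,
        # Bytes written past the end of the declared destination field.
        'overrun': write_size - dest_size,
    }


info = parse_field_spanning_write(WARNING)
print(info)
```

For the message above, `info['overrun']` is 24 and `info['field']` is `&e->request`.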

      Node m04-rh3 is marked failed and fenced off.
      Sep 24 16:22:22 m04-rh4 corosync[6425]:  [TOTEM ] A processor failed, forming new configuration: token timed out (10650ms), waiting 12780ms for consensus.
      Sep 24 16:22:35 m04-rh4 pacemaker-fenced[6441]: notice: Node m04-rh3 state is now lost
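
To put a number on "shortly after", the gap between the warning and the cluster events can be computed from the timestamps above. This is a rough sketch under two assumptions: the year is taken as 2024 (syslog stamps omit it; the DISTRO build is dated 20240922), and the clocks on m04-rh3 and m04-rh4 are in sync. `parse_stamp` is a hypothetical helper.

```python
from datetime import datetime

# Assumption: syslog timestamps carry no year; 2024 is inferred from the
# DISTRO build date (20240922) given in this report.
ASSUMED_YEAR = 2024


def parse_stamp(stamp):
    """Parse a syslog-style 'Mon DD HH:MM:SS' stamp using the assumed year."""
    return datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %b %d %H:%M:%S")


warning = parse_stamp("Sep 24 16:20:14")     # WARNING on m04-rh3
token_loss = parse_stamp("Sep 24 16:22:22")  # corosync TOTEM message on m04-rh4
node_lost = parse_stamp("Sep 24 16:22:35")   # pacemaker-fenced notice on m04-rh4

gap_to_token_loss = (token_loss - warning).total_seconds()
gap_to_node_lost = (node_lost - warning).total_seconds()

print(gap_to_token_loss, gap_to_node_lost)
```

Under those assumptions, the token timeout is logged 128 seconds after the warning and the node is reported lost 141 seconds after it.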

            aahringo Alexander Aring
            rhn-support-cmackows Chris Mackowski
            David Teigland David Teigland
            Cluster QE Cluster QE