- Bug
- Resolution: Unresolved
- Normal
- rhel-9.5
- No
- Low
- sst_logical_storage
- ssg_filesystems_storage_and_HA
- False
- s390x
While running GFS2 regression tests on kernel-5.14.0-503.6.1.el9_5.s390x, I'm seeing the following warning from dlm_add_requestqueue. Shortly afterward, the node is fenced off.
How reproducible:
Often, but not consistently.
DISTRO=RHEL-9.5.0-20240922.2
Testcase: brawl-gfs2-jdata-reg/d_io-flock
Sep 24 16:20:14 m04-rh3 kernel: -----------[ cut here ]-----------
Sep 24 16:20:14 m04-rh3 kernel: memcpy: detected field-spanning write (size 112) of single field "&e->request" at fs/dlm/requestqueue.c:45 (size 88)
Sep 24 16:20:14 m04-rh3 kernel: WARNING: CPU: 0 PID: 56 at fs/dlm/requestqueue.c:45 dlm_add_requestqueue+0x102/0x128 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: Modules linked in: gfs2 dlm tls rfkill sunrpc vmur vfio_ccw mdev vfio_iommu_type1 vfio iommufd drm fuse i2c_core drm_panel_orientation_quirks lcs ctcm fsm zfcp scsi_transport_fc xfs libcrc32c ghash_s390 prng aes_s390 qeth_l2 bridge stp llc des_s390 libdes sha3_512_s390 sha3_256_s390 qeth qdio ccwgroup dm_mirror dm_region_hash dm_log dm_mod dasd_fba_mod dasd_eckd_mod dasd_mod pkey zcrypt
Sep 24 16:20:14 m04-rh3 kernel: CPU: 0 PID: 56 Comm: kworker/u256:2 Kdump: loaded Not tainted 5.14.0-503.6.1.el9_5.s390x #1
Sep 24 16:20:14 m04-rh3 kernel: Hardware name: IBM 3932 A02 Z06 (z/VM 7.3.0)
Sep 24 16:20:14 m04-rh3 kernel: Workqueue: dlm_recv process_recv_sockets [dlm]
Sep 24 16:20:14 m04-rh3 kernel: Krnl PSW : 0704c00180000000 000003ff802b4e7e (dlm_add_requestqueue+0x106/0x128 [dlm])
Sep 24 16:20:14 m04-rh3 kernel: R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Sep 24 16:20:14 m04-rh3 kernel: Krnl GPRS: c0000000ffffbfff 0000000000000027 0000000000000074 00000000c22d6a88
Sep 24 16:20:14 m04-rh3 kernel: 0000037fffef38f0 0000037fffef38e8 0000000000000000 0000000082f47458
Sep 24 16:20:14 m04-rh3 kernel: 0000000000000070 0000000084226018 000000008435f000 0000000082f47440
Sep 24 16:20:14 m04-rh3 kernel: 0000000082c12e00 0000000000000000 000003ff802b4e7a 0000037fffef3aa0
Sep 24 16:20:14 m04-rh3 kernel: Krnl Code: 000003ff802b4e6e: c020000323dd    larl    %r2,000003ff80319628
                                           000003ff802b4e74: c0e5ffff10e0    brasl   %r14,000003ff80297034
                                          #000003ff802b4e7a: af000000        mc      0,0
                                          >000003ff802b4e7e: a7f4ffb3        brc     15,000003ff802b4de4
                                           000003ff802b4e82: ec37ffa800d8    ahik    %r3,%r7,-88
                                           000003ff802b4e88: c0200003239e    larl    %r2,000003ff803195c4
                                           000003ff802b4e8e: b9140033        lgfr    %r3,%r3
                                           000003ff802b4e92: eb7ff0a00004    lmg     %r7,%r15,160(%r15)
Sep 24 16:20:14 m04-rh3 kernel: Call Trace:
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802b4e7e>] dlm_add_requestqueue+0x106/0x128 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: ([<000003ff802b4e7a>] dlm_add_requestqueue+0x102/0x128 [dlm])
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802a486e>] dlm_receive_buffer+0x1be/0x200 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802abc9c>] dlm_midcomms_receive_buffer_3_2+0x2b4/0x3e8 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802abeda>] dlm_process_incoming_buffer+0x10a/0x1d8 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802ae062>] receive_from_sock+0xca/0x240 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: [<000003ff802ae20e>] process_recv_sockets+0x36/0x48 [dlm]
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1028892>] process_one_work+0x1c2/0x458
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c102977e>] worker_thread+0x3ce/0x528
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1032c38>] kthread+0x108/0x110
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c0fb2ebc>] __ret_from_fork+0x3c/0x58
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1984e12>] ret_from_fork+0xa/0x30
Sep 24 16:20:14 m04-rh3 kernel: Last Breaking-Event-Address:
Sep 24 16:20:14 m04-rh3 kernel: [<00000000c1003aec>] __warn_printk+0xd4/0xe0
Sep 24 16:20:14 m04-rh3 kernel: --[ end trace 0000000000000000 ]--
Sep 24 16:20:14 m04-rh3 kernel: dlm: brawl0: dlm_recover 3 generation 3 done: 120 ms
Node m04-rh3 is marked as failed and fenced off.
Sep 24 16:22:22 m04-rh4 corosync[6425]: [TOTEM ] A processor failed, forming new configuration: token timed out (10650ms), waiting 12780ms for consensus.
Sep 24 16:22:35 m04-rh4 pacemaker-fenced[6441]: notice: Node m04-rh3 state is now lost
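Note on the warning text: the message reports a 112-byte memcpy into the 88-byte field "&e->request", which is 24 bytes past the end of that field. The user-space C sketch below illustrates that pattern and one way to keep each copy inside its destination. The struct layouts and the names demo_msg, demo_entry, field_overrun and save_msg are hypothetical stand-ins, not the DLM code, and the split copy is only an illustration, not necessarily how this would be addressed in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size message header, sized to 88 bytes as in the log. */
struct demo_msg {
    uint16_t length;     /* total length: header plus trailing payload */
    uint8_t  fixed[86];  /* stand-in for the remaining fixed fields */
};
static_assert(sizeof(struct demo_msg) == 88, "header must be 88 bytes");

/* Hypothetical queue entry; payload bytes are stored right after 'request'. */
struct demo_entry {
    uint32_t seq;
    int      nodeid;
    struct demo_msg request;
};

/* Bytes by which a copy of msg_len would run past the 'request' field. */
static size_t field_overrun(size_t msg_len)
{
    return msg_len > sizeof(struct demo_msg)
               ? msg_len - sizeof(struct demo_msg)
               : 0;
}

/* Save a message, copying header and payload as two bounded writes. */
static struct demo_entry *save_msg(const struct demo_msg *ms, int nodeid)
{
    size_t len = ms->length;
    if (len < sizeof(*ms))
        return NULL;  /* shorter than the fixed header: reject */

    size_t extra = len - sizeof(*ms);
    struct demo_entry *e = malloc(sizeof(*e) + extra);
    if (!e)
        return NULL;

    e->seq = 0;
    e->nodeid = nodeid;

    /*
     * A single memcpy(&e->request, ms, len) would write 'len' bytes into
     * an 88-byte field, which is the pattern the warning describes.
     * Here the header goes into the field and the payload goes into the
     * space allocated after the entry.
     */
    memcpy(&e->request, ms, sizeof(*ms));
    memcpy((unsigned char *)e + sizeof(*e),
           (const unsigned char *)ms + sizeof(*ms), extra);
    return e;
}
```

For a 112-byte message, field_overrun(112) returns 24, which is the difference between the two sizes in the warning.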