Bug
Resolution: Duplicate
rhel-9.4.z
Critical
rhel-security-special-projects
ssg_security
We observed fapolicyd hang, deadlocking all of userspace, while the RPM GPG keys installed on the system were being updated.
2025-06-06T15:56:26.975195+00:00 upgrade[2683]: Removing all existing RPM GPG public keys
2025-06-06T15:56:27.079588+00:00 fapolicyd[1215]: It looks like there was an update of the system... Syncing DB.
2025-06-06T15:56:27.080033+00:00 fapolicyd[1215]: Loading rpmdb backend
2025-06-06T15:56:27.142909+00:00 upgrade[2683]: Importing RPM GPG public keys
...
2025-06-06T15:59:03.587075+00:00 kernel: [ 243.979736] INFO: task fapolicyd:1218 blocked for more than 120 seconds.
2025-06-06T15:59:03.587096+00:00 kernel: [ 243.980135] Tainted: G S O 6.12.26-11.0s15c63r5.el9.x86_64 #1
2025-06-06T15:59:03.587097+00:00 kernel: [ 243.980466] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2025-06-06T15:59:03.587098+00:00 kernel: [ 243.980806] task:fapolicyd state:D stack:0 pid:1218 tgid:1215 ppid:1 flags:0x00004002
2025-06-06T15:59:03.587103+00:00 kernel: [ 243.980811] Call Trace:
2025-06-06T15:59:03.587104+00:00 kernel: [ 243.980813] <TASK>
2025-06-06T15:59:03.587105+00:00 kernel: [ 243.980817] __schedule+0x2ef/0x770
2025-06-06T15:59:03.587106+00:00 kernel: [ 243.980826] schedule+0x1c/0x90
2025-06-06T15:59:03.587106+00:00 kernel: [ 243.980828] schedule_timeout+0x12a/0x140
2025-06-06T15:59:03.587108+00:00 kernel: [ 243.980832] __wait_for_common+0x91/0x1d0
2025-06-06T15:59:03.587109+00:00 kernel: [ 243.980834] ? usleep_range_state+0x90/0x90
2025-06-06T15:59:03.587124+00:00 kernel: [ 243.980837] wait_for_completion_state+0x1d/0x40
2025-06-06T15:59:03.587125+00:00 kernel: [ 243.980839] call_usermodehelper_exec+0x171/0x1a0
2025-06-06T15:59:03.587126+00:00 kernel: [ 243.980855] do_coredump+0x574/0xdb0
2025-06-06T15:59:03.587127+00:00 kernel: [ 243.980861] ? __mod_memcg_lruvec_state+0x95/0x150
2025-06-06T15:59:03.587127+00:00 kernel: [ 243.980866] ? free_debug_processing+0xc4/0x340
2025-06-06T15:59:03.587128+00:00 kernel: [ 243.980870] ? get_signal+0x3b8/0x790
2025-06-06T15:59:03.587129+00:00 kernel: [ 243.980876] ? kmem_cache_free+0x2a1/0x3b0
2025-06-06T15:59:03.587129+00:00 kernel: [ 243.980878] ? get_signal+0x307/0x790
2025-06-06T15:59:03.587130+00:00 kernel: [ 243.980881] get_signal+0x307/0x790
2025-06-06T15:59:03.587131+00:00 kernel: [ 243.980884] arch_do_signal_or_restart+0x2a/0x1b0
2025-06-06T15:59:03.587131+00:00 kernel: [ 243.980891] ? _raw_spin_unlock_irqrestore+0xa/0x20
2025-06-06T15:59:03.587132+00:00 kernel: [ 243.980894] ? force_sig_info_to_task+0xec/0x110
2025-06-06T15:59:03.587132+00:00 kernel: [ 243.980897] irqentry_exit_to_user_mode+0x10b/0x1d0
2025-06-06T15:59:03.587133+00:00 kernel: [ 243.980903] asm_exc_page_fault+0x22/0x30
2025-06-06T15:59:03.587134+00:00 kernel: [ 243.980910] RIP: 0033:0x7f11349c1bc9
2025-06-06T15:59:03.587134+00:00 kernel: [ 243.980913] RSP: 002b:00007f11311faec0 EFLAGS: 00010246
2025-06-06T15:59:03.587135+00:00 kernel: [ 243.980915] RAX: 00007f11345c2000 RBX: 000000000000023b RCX: 0000000000001993
2025-06-06T15:59:03.587135+00:00 kernel: [ 243.980917] RDX: 0000000000001993 RSI: 0000000000002001 RDI: 0000000000001993
2025-06-06T15:59:03.587136+00:00 kernel: [ 243.980919] RBP: 00007f112026ac08 R08: 0000000000000000 R09: 00007f11200629c8
2025-06-06T15:59:03.587137+00:00 kernel: [ 243.980920] R10: 0000000000000000 R11: 168ab1e31ccb8a7e R12: 00007f11311faed8
2025-06-06T15:59:03.587137+00:00 kernel: [ 243.980922] R13: 00000000000009ed R14: 0000000000000000 R15: 0000000000000000
2025-06-06T15:59:03.587138+00:00 kernel: [ 243.980924] </TASK>
This continued until the host was forcibly rebooted via sysrq.
We suspect this is a race on RPMDB locking, as the order of operations was:
- rpm -e --allmatches gpg-pubkey
- <fapolicyd notices and begins refreshing its view of rpmdb>
- rpmkeys --import /etc/pki/rpm-gpg/*
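For anyone trying to reproduce this, the sequence above can be sketched as a small script. This is only a sketch under assumptions: the `step` and `repro` helpers and the `FAPOLICYD_REPRO_RUN` and `KEY_GLOB` variables are hypothetical names introduced here, not part of our upgrade tooling, and because the suspected cause is a race, a single run may well not trigger the hang. It is a dry run by default, since the first command removes every imported RPM public key.

```shell
#!/bin/sh
# Hypothetical reproduction helper for the sequence described above.
# Dry-run by default: commands are only printed.
# Set FAPOLICYD_REPRO_RUN=1 to really execute them (destructive: removes
# all imported RPM GPG public keys; run as root on a disposable host).

# Key files to re-import; the default mirrors the path used in this report.
KEY_GLOB="${KEY_GLOB:-/etc/pki/rpm-gpg/*}"

step() {
    # Print the command, and execute it only when explicitly enabled.
    echo "+ $*"
    if [ "${FAPOLICYD_REPRO_RUN:-0}" = "1" ]; then
        "$@"
    fi
}

repro() {
    # 1. Remove all existing RPM GPG public keys.
    step rpm -e --allmatches gpg-pubkey
    # 2. fapolicyd is expected to notice the rpmdb change here and start
    #    refreshing its view of rpmdb (no command needed on our side).
    # 3. Immediately re-import the keys while that refresh may be in flight.
    #    KEY_GLOB is left unquoted on purpose so the shell expands it.
    step rpmkeys --import $KEY_GLOB
}

repro
```

Running it with `FAPOLICYD_REPRO_RUN=1` in a loop on a throwaway VM with fapolicyd active is probably the most practical way to hit the window, assuming the race theory holds.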
It's possible this is fixed by the recent patches to fapolicyd master that have not yet made it into RHEL 9, but those are so fresh it's difficult to tell: