RHEL-95781

fapolicyd hangs userspace due to rpmdb race


    • Critical
    • rhel-security-special-projects
    • ssg_security

      We observed fapolicyd hang and thereby deadlock all of userspace during an update of the RPM GPG public keys installed on the system.

      2025-06-06T15:56:26.975195+00:00 upgrade[2683]: Removing all existing RPM GPG public keys
      2025-06-06T15:56:27.079588+00:00 fapolicyd[1215]: It looks like there was an update of the system... Syncing DB.
      2025-06-06T15:56:27.080033+00:00 fapolicyd[1215]: Loading rpmdb backend
      2025-06-06T15:56:27.142909+00:00 upgrade[2683]: Importing RPM GPG public keys
      ...
      2025-06-06T15:59:03.587075+00:00 kernel: [  243.979736] INFO: task fapolicyd:1218 blocked for more than 120 seconds.
      2025-06-06T15:59:03.587096+00:00 kernel: [  243.980135]       Tainted: G S         O       6.12.26-11.0s15c63r5.el9.x86_64 #1
      2025-06-06T15:59:03.587097+00:00 kernel: [  243.980466] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      2025-06-06T15:59:03.587098+00:00 kernel: [  243.980806] task:fapolicyd       state:D stack:0     pid:1218  tgid:1215  ppid:1      flags:0x00004002
      2025-06-06T15:59:03.587103+00:00 kernel: [  243.980811] Call Trace:
      2025-06-06T15:59:03.587104+00:00 kernel: [  243.980813]  <TASK>
      2025-06-06T15:59:03.587105+00:00 kernel: [  243.980817]  __schedule+0x2ef/0x770
      2025-06-06T15:59:03.587106+00:00 kernel: [  243.980826]  schedule+0x1c/0x90
      2025-06-06T15:59:03.587106+00:00 kernel: [  243.980828]  schedule_timeout+0x12a/0x140
      2025-06-06T15:59:03.587108+00:00 kernel: [  243.980832]  __wait_for_common+0x91/0x1d0
      2025-06-06T15:59:03.587109+00:00 kernel: [  243.980834]  ? usleep_range_state+0x90/0x90
      2025-06-06T15:59:03.587124+00:00 kernel: [  243.980837]  wait_for_completion_state+0x1d/0x40
      2025-06-06T15:59:03.587125+00:00 kernel: [  243.980839]  call_usermodehelper_exec+0x171/0x1a0
      2025-06-06T15:59:03.587126+00:00 kernel: [  243.980855]  do_coredump+0x574/0xdb0
      2025-06-06T15:59:03.587127+00:00 kernel: [  243.980861]  ? __mod_memcg_lruvec_state+0x95/0x150
      2025-06-06T15:59:03.587127+00:00 kernel: [  243.980866]  ? free_debug_processing+0xc4/0x340
      2025-06-06T15:59:03.587128+00:00 kernel: [  243.980870]  ? get_signal+0x3b8/0x790
      2025-06-06T15:59:03.587129+00:00 kernel: [  243.980876]  ? kmem_cache_free+0x2a1/0x3b0
      2025-06-06T15:59:03.587129+00:00 kernel: [  243.980878]  ? get_signal+0x307/0x790
      2025-06-06T15:59:03.587130+00:00 kernel: [  243.980881]  get_signal+0x307/0x790
      2025-06-06T15:59:03.587131+00:00 kernel: [  243.980884]  arch_do_signal_or_restart+0x2a/0x1b0
      2025-06-06T15:59:03.587131+00:00 kernel: [  243.980891]  ? _raw_spin_unlock_irqrestore+0xa/0x20
      2025-06-06T15:59:03.587132+00:00 kernel: [  243.980894]  ? force_sig_info_to_task+0xec/0x110
      2025-06-06T15:59:03.587132+00:00 kernel: [  243.980897]  irqentry_exit_to_user_mode+0x10b/0x1d0
      2025-06-06T15:59:03.587133+00:00 kernel: [  243.980903]  asm_exc_page_fault+0x22/0x30
      2025-06-06T15:59:03.587134+00:00 kernel: [  243.980910] RIP: 0033:0x7f11349c1bc9
      2025-06-06T15:59:03.587134+00:00 kernel: [  243.980913] RSP: 002b:00007f11311faec0 EFLAGS: 00010246
      2025-06-06T15:59:03.587135+00:00 kernel: [  243.980915] RAX: 00007f11345c2000 RBX: 000000000000023b RCX: 0000000000001993
      2025-06-06T15:59:03.587135+00:00 kernel: [  243.980917] RDX: 0000000000001993 RSI: 0000000000002001 RDI: 0000000000001993
      2025-06-06T15:59:03.587136+00:00 kernel: [  243.980919] RBP: 00007f112026ac08 R08: 0000000000000000 R09: 00007f11200629c8
      2025-06-06T15:59:03.587137+00:00 kernel: [  243.980920] R10: 0000000000000000 R11: 168ab1e31ccb8a7e R12: 00007f11311faed8
      2025-06-06T15:59:03.587137+00:00 kernel: [  243.980922] R13: 00000000000009ed R14: 0000000000000000 R15: 0000000000000000
      2025-06-06T15:59:03.587138+00:00 kernel: [  243.980924]  </TASK>
      

      This continued until the host was forcibly rebooted via sysrq.
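
When checking a saved kernel log for the same symptom, the hung-task reports can be pulled out with a small filter. This is only a sketch: the function name `hung_tasks` is hypothetical, and it assumes the message wording matches the "blocked for more than N seconds" line quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print
# "<task> <pid> <seconds>" for every hung-task report found.
# Assumes the message format shown in the log excerpt above.
hung_tasks() {
    sed -n 's/.*INFO: task \([^:]*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}
```

For example, piping the first hung-task line from the excerpt above through `hung_tasks` prints `fapolicyd 1218 120`. On a host that is still responsive, the input could come from `journalctl -k`.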

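      The trace shows fapolicyd thread 1218 took a page fault in userspace and is blocked in do_coredump, waiting for the usermode core-dump helper to complete, so the daemon appears to have crashed while reloading its view of rpmdb. We suspect the underlying cause is a race on RPMDB locking, as the order of operations was: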

      1. rpm -e --allmatches gpg-pubkey
      2. <fapolicyd notices and begins refreshing its view of rpmdb>
      3. rpmkeys --import /etc/pki/rpm-gpg/*
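
The sequence above can be sketched as a script for reproduction attempts on a disposable test host. This is an assumption-laden sketch, not the original upgrade tooling: the names `run`, `rotate_rpm_keys`, and `DRY_RUN` are hypothetical, and the script only prints the commands unless `DRY_RUN=0` is set. Step 2 is not scripted because fapolicyd reacts to the rpmdb change on its own.

```shell
#!/bin/sh
# Hypothetical reproduction sketch of the key-rotation sequence.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: directory holding the key files (defaults to /etc/pki/rpm-gpg).
rotate_rpm_keys() {
    key_dir="${1:-/etc/pki/rpm-gpg}"
    # Step 1: remove every installed gpg-pubkey entry from rpmdb.
    run rpm -e --allmatches gpg-pubkey
    # Step 2 (not scripted): fapolicyd notices the change and reloads rpmdb.
    # Step 3: import the keys again right away, while the reload may be running.
    run rpmkeys --import "$key_dir"/*
}
```

Calling `rotate_rpm_keys` with no argument prints the two commands from the list; running it with `DRY_RUN=0` modifies rpmdb and, if the suspected race exists, may hang the host.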

      The recent patches on fapolicyd master, which have not yet made it into RHEL 9, may fix this, but they are too new for us to tell:

      1. Fix for rpmdb with SQLite3 backend
      2. Introduction of rpmdb transaction locking
      3. Revert "Fix for rpmdb with SQLite3 backend"
      4. Fix rpmdb locking issues by loading via separate process

              rsroka@redhat.com Radovan Sroka (Inactive)
              chris-riches-redhat Chris Riches (Inactive)
              Nutanix Confidential Group
              Radovan Sroka Radovan Sroka (Inactive)
              SSG Security QE SSG Security QE