-
Bug
-
Resolution: Won't Do
-
Major
-
None
-
2023Q4
-
None
-
False
-
-
False
-
?
-
?
-
?
-
?
-
No
-
-
-
2023Q4
-
Moderate
A number of Zuul baremetal jobs are failing during dataplane deployment while running edpm ansible jobs. They fail at different steps of the deployment, as seen in 1 and 2.
The console logs show a kernel panic that appears to be FIPS-related. We enabled FIPS recently.
[ 541.408389] Kernel panic - not syncing: Jitter RNG permanent health test failure
[ 541.408389] CPU: 0 PID: 30478 Comm: systemctl Not tainted 5.14.0-383.el9.x86_64 #1
[ 541.408389] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20230524-3.el9 05/24/2023
[ 541.408389] Call Trace:
[ 541.408389] <TASK>
[ 541.408389] dump_stack_lvl+0x34/0x48
[ 541.408826] panic+0xfd/0x2f7
[ 541.408829] jent_kcapi_random.cold+0x15/0x43
[ 541.408831] drbg_seed+0x10a/0x440
[ 541.408835] drbg_generate+0xb9/0x320
[ 541.408837] ? drbg_hmac_generate+0x284/0x2f0
[ 541.408839] drbg_kcapi_random+0xd3/0x120
[ 541.408841] ? release_pages+0x16a/0x4c0
[ 541.408844] crypto_devrandom_read_iter+0xb6/0x220
[ 541.408859] ? _copy_to_iter+0x1d3/0x620
[ 541.408859] ? crypto_devrandom_read_iter+0x149/0x220
[ 541.408859] __do_sys_getrandom+0x9d/0x140
[ 541.408859] do_syscall_64+0x5c/0x90
[ 541.408859] ? __do_sys_getrandom+0xa9/0x140
[ 541.408859] ? switch_fpu_return+0x4c/0xd0
[ 541.408859] ? exit_to_user_mode_prepare+0xec/0x100
[ 541.408859] ? syscall_exit_to_user_mode+0x12/0x30
[ 541.408859] ? do_syscall_64+0x69/0x90
[ 541.408859] ? exc_page_fault+0x62/0x150
[ 541.408859] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 541.408859] RIP: 0033:0x7f3af8a5ac87
[ 541.408859] Code: be 0b 00 00 00 eb b2 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 3e 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 541.408859] RSP: 002b:00007ffff38fb758 EFLAGS: 00000246 ORIG_RAX: 000000000000013e
[ 541.408859] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f3af8a5ac87
[ 541.408859] RDX: 0000000000000002 RSI: 0000000000000010 RDI: 00007ffff38fb770
[ 541.408859] RBP: 00007f3af8dfb180 R08: 00005646c9deee60 R09: 0000000000000000
[ 541.408859] R10: 0000000000020000 R11: 0000000000000246 R12: 0000000000000010
[ 541.408859] R13: 0000000000000010 R14: 00007ffff38fb770 R15: 0000000000000020
[ 541.408859] </TASK>
[ 601.477891] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 601.477897] rcu: 0-...0: (20 ticks this GP) idle=d12c/1/0x4000000000000000 softirq=56206/56211 fqs=11863
[ 601.477904] rcu: (detected by 2, t=60091 jiffies, g=163213, q=3619 ncpus=6)
[ 601.477908] Sending NMI from CPU 2 to CPUs 0:
[ 541.408859] NMI backtrace for cpu 0
[ 541.408859] CPU: 0 PID: 30478 Comm: systemctl Not tainted 5.14.0-383.el9.x86_64 #1
[ 541.408859] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20230524-3.el9 05/24/2023
[ 541.408859] RIP: 0010:io_serial_in+0x15/0x20
[ 541.408859] Code: 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 0f b6 8f b9 00 00 00 0f b7 57 08 d3 e6 01 f2 ec <0f> b6 c0 e9 f3 2c 4d 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90
[ 541.408859] RSP: 0018:ffffade688bcb870 EFLAGS: 00000006
[ 541.408859] RAX: ffffffff8c5a5f05 RBX: 0000000000000046 RCX: 0000000000000000
[ 541.408859] RDX: 00000000000003f9 RSI: 0000000000000001 RDI: ffffffff8eb72a20
[ 541.408859] RBP: 0000000000000100 R08: 0000000000000001 R09: 6372203a4f464e49
[ 541.408859] R10: 705f756372203a4f R11: 464e49203a756372 R12: 0000000000000045
[ 541.408859] R13: ffffffff8e88e5c0 R14: ffffffff8e88e5c0 R15: ffffffff8eb72a20
[ 541.408859] FS: 00007f3af883db40(0000) GS:ffff9d37b7c00000(0000) knlGS:0000000000000000
[ 541.408859] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 541.408859] CR2: 00007f3af8dff04c CR3: 0000000174778000 CR4: 0000000000350ef0
[ 541.408859] Call Trace:
[ 541.408859] <NMI>
[ 541.408859] ? show_trace_log_lvl+0x1c4/0x2df
[ 541.408859] ? show_trace_log_lvl+0x1c4/0x2df
[ 541.408859] ? serial8250_console_write+0x389/0x4e0
[ 541.408859] ? nmi_cpu_backtrace.cold+0x1b/0x70
[ 541.408859] ? nmi_cpu_backtrace_handler+0xd/0x20
[ 541.408859] ? nmi_handle+0x5e/0x120
[ 541.408859] ? default_do_nmi+0x40/0x130
[ 541.408859] ? exc_nmi+0x111/0x140
[ 541.408859] ? end_repeat_nmi+0x16/0x67
[ 541.408859] ? mem32_serial_in+0x5/0x20
[ 541.408859] ? io_serial_in+0x15/0x20
[ 541.408859] ? io_serial_in+0x15/0x20
[ 541.408859] ? io_serial_in+0x15/0x20
[ 541.408859] </NMI>
[ 541.408859] <TASK>
[ 541.408859] serial8250_console_write+0x389/0x4e0
[ 541.408859] __console_emit_next_record+0x215/0x3e0
[ 541.408859] console_unlock+0x213/0x320
[ 541.408859] panic+0x12f/0x2f7
[ 541.408859] jent_kcapi_random.cold+0x15/0x43
[ 541.408859] drbg_seed+0x10a/0x440
[ 541.408859] drbg_generate+0xb9/0x320
[ 541.408859] ? drbg_hmac_generate+0x284/0x2f0
[ 541.408859] drbg_kcapi_random+0xd3/0x120
[ 541.408859] ? release_pages+0x16a/0x4c0
[ 541.408859] crypto_devrandom_read_iter+0xb6/0x220
[ 541.408859] ? _copy_to_iter+0x1d3/0x620
[ 541.408859] ? crypto_devrandom_read_iter+0x149/0x220
[ 541.408859] __do_sys_getrandom+0x9d/0x140
[ 541.408859] do_syscall_64+0x5c/0x90
[ 541.408859] ? __do_sys_getrandom+0xa9/0x140
[ 541.408859] ? switch_fpu_return+0x4c/0xd0
[ 541.408859] ? exit_to_user_mode_prepare+0xec/0x100
[ 541.408859] ? syscall_exit_to_user_mode+0x12/0x30
[ 541.408859] ? do_syscall_64+0x69/0x90
[ 541.408859] ? exc_page_fault+0x62/0x150
[ 541.408859] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 541.408859] RIP: 0033:0x7f3af8a5ac87
[ 541.408859] Code: be 0b 00 00 00 eb b2 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 3e 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 541.408859] RSP: 002b:00007ffff38fb758 EFLAGS: 00000246 ORIG_RAX: 000000000000013e
[ 541.408859] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007f3af8a5ac87
[ 541.408859] RDX: 0000000000000002 RSI: 0000000000000010 RDI: 00007ffff38fb770
[ 541.408859] RBP: 00007f3af8dfb180 R08: 00005646c9deee60 R09: 0000000000000000
[ 541.408859] R10: 0000000000020000 R11: 0000000000000246 R12: 0000000000000010
[ 541.408859] R13: 0000000000000010 R14: 00007ffff38fb770 R15: 0000000000000020
[ 541.408859] </TASK>
[ 541.408859] Kernel Offset: 0xae00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 541.408859] ---[ end Kernel panic - not syncing: Jitter RNG permanent health test failure ]---
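
Since the jobs fail at different deployment steps, it may help to check each failing job's console log for the same panic before attributing the failure to this issue. The following is a minimal sketch, assuming the console log is available as plain text in the format shown above; the function names `find_kernel_panics` and `looks_like_jitter_rng_failure` are hypothetical helpers, not part of Zuul or edpm-ansible.

```python
import re

# Matches kernel log lines of the form seen in this report:
#   [ 541.408389] Kernel panic - not syncing: <reason>
# The trailing "end Kernel panic" marker is intentionally not matched,
# because it does not start with "Kernel panic" after the timestamp.
PANIC_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s*Kernel panic - not syncing: (?P<reason>.+?)\s*$"
)


def find_kernel_panics(log_text):
    """Return a list of (timestamp_seconds, reason) for each panic line."""
    panics = []
    for line in log_text.splitlines():
        match = PANIC_RE.match(line)
        if match:
            panics.append((float(match.group("ts")), match.group("reason")))
    return panics


def looks_like_jitter_rng_failure(log_text):
    """True if any panic reason mentions the Jitter RNG health test."""
    return any("Jitter RNG" in reason for _, reason in find_kernel_panics(log_text))


# Sample taken from the first lines of the trace in this report.
sample = """\
[ 541.408389] Kernel panic - not syncing: Jitter RNG permanent health test failure
[ 541.408389] CPU: 0 PID: 30478 Comm: systemctl Not tainted 5.14.0-383.el9.x86_64 #1
"""

if __name__ == "__main__":
    for ts, reason in find_kernel_panics(sample):
        print(f"panic at {ts}s: {reason}")
```

Running this over the console logs of the failing jobs would show whether they all hit the same Jitter RNG panic or whether some fail for unrelated reasons.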