Bug
Resolution: Unresolved
Normal
rhel-9.2.0
Low
rhel-sst-virtualization-cloud
ssg_virtualization
Automated
The issue is present on the latest ARM RHEL 9.2 (5.14.0-268.el9.aarch64, RHEL-9.2.0-20230219.31).
+++ This bug was initially created as a clone of Bug #2165169 +++
Description of problem:
On an Azure ARM RHEL 8.8 VM, use the command below to trigger a crash:
# echo 1 > /proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger
The serial console shows the following logs before the restart happens:
[ 36.517830] sysrq: SysRq : Trigger a crash
[ 36.520721] Kernel panic - not syncing: sysrq triggered crash
[ 36.520721]
[ 36.525900] CPU: 1 PID: 1766 Comm: bash Kdump: loaded Tainted: G W --------- - - 4.18.0-447.el8.aarch64 #1
[ 36.533966] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/16/2022
[ 36.542059] Call trace:
[ 36.543898] dump_backtrace+0x0/0x178
[ 36.546702] show_stack+0x28/0x38
[ 36.549147] dump_stack+0x5c/0x74
[ 36.551568] panic+0x140/0x30c
[ 36.553861] sysrq_reset_seq_param_set+0x0/0xa8
[ 36.557226] __handle_sysrq+0x9c/0x190
[ 36.559981] write_sysrq_trigger+0x7c/0x98
[ 36.562968] proc_reg_write+0x84/0xd8
[ 36.565670] __vfs_write+0x4c/0x90
[ 36.568121] vfs_write+0xb0/0x1b8
[ 36.570497] ksys_write+0x70/0xd8
[ 36.572884] __arm64_sys_write+0x28/0x38
[ 36.575752] do_el0_svc+0xb4/0x188
[ 36.578249] el0_sync_handler+0x88/0xac
[ 36.581059] el0_sync+0x140/0x180
[ 36.583462] SMP: stopping secondary CPUs
[ 36.586264] Kernel Offset: 0x404bc6d00000 from 0xffff800010000000
[ 36.590470] PHYS_OFFSET: 0xffffa6dc00000000
[ 36.593308] CPU features: 0x0000044a,84814210
[ 36.596941] Memory Limit: none
[ 41.601628] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 46.620415] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 51.636353] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 56.659883] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 61.676422] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 66.695090] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 71.714063] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 76.744787] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 81.762957] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 86.780668] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 91.799017] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 96.823126] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 101.836149] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 106.860822] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 111.875220] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 116.891644] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 121.911215] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 126.924527] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 131.945844] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 136.960236] hv_vmbus: Waiting for VMBus UNLOAD to complete
[ 136.974631] hv_vmbus: Continuing even though VMBus UNLOAD did not complete
[ 136.980798] Starting crashdump kernel...
[ 136.983798] Bye!
[ 0.000000] Booting Linux on physical CPU 0x0000000001 [0x413fd0c1]
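The length of the stall in the log above can be measured from a saved copy of the serial console output. The following is only a sketch: the function name `vmbus_unload_stall` and the idea of saving the console output to a file are assumptions for illustration, and it expects dmesg-style `[ seconds.micros]` timestamps like the ones shown here.

```shell
# Hypothetical helper (not part of the original report): summarize the
# "Waiting for VMBus UNLOAD to complete" messages in a saved console log.
# Usage: vmbus_unload_stall <logfile>
vmbus_unload_stall() {
    awk '
        /hv_vmbus: Waiting for VMBus UNLOAD to complete/ {
            ts = $0
            sub(/^\[ */, "", ts)   # drop the leading "[" and any padding
            sub(/\].*/, "", ts)    # drop everything after the timestamp
            if (n == 0) first = ts # remember the first wait message
            last = ts              # keep updating the last wait message
            n++
        }
        END {
            if (n == 0) { print "no VMBus UNLOAD waits found"; exit }
            printf "waits=%d first=%.1fs last=%.1fs span=%.1fs\n", n, first, last, last - first
        }
    ' "$1"
}
```

Run against the log above, this should report 20 wait messages spanning roughly 95 seconds (41.6s to 137.0s), which matches the 5-second polling interval visible in the timestamps.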
Version-Release number of selected component (if applicable):
4.18.0-447.el8.aarch64
How reproducible:
100%, once the system has fully booted.
Steps to Reproduce:
1. Start an Azure ARM RHEL 8.8 VM. Wait for a couple of minutes.
2. # echo 1 > /proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger
Actual results:
VMBus unload takes a long time (roughly 100 seconds between the panic and the crashdump kernel starting in the log above) before the system can reboot.
Expected results:
No such stall.
Additional info:
1. No such issue on RHEL 9.2.
2. No such issue on x86_64 RHEL 8.8.
--- Additional comment from Mohammed Gamal on 2023-02-18 01:03:51 CST ---
Hi Li,
Are you using the same instance type with both RHEL 8.8 and 9.2?
Normally, VMBus unload routines shouldn't be invoked when kexec is in use. ARM64 doesn't have the same kexec handlers as x86, so that might be why it's happening. However, the ARM64 code is identical on both 9.2 and 8.8, so that is unlikely to explain a difference between the two releases.
Are you sure that it's not reproducible on 9.2 for ARM64?
--- Additional comment from Li Tian on 2023-02-20 09:21:10 CST ---
(In reply to Mohammed Gamal from comment #1)
You are right. It is reproducible on the latest ARM RHEL 9.2 (RHEL-9.2.0-20230219.31).
I'm also noticing a bunch of repeated error logs at first boot:
...
[ 47.571274] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571281] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571288] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571295] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571300] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571306] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571312] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571318] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571324] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571341] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.571998] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
[ 47.579515] hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] ERROR Unable to send packet via vmbus
...
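To see how many of these errors a boot produces, the messages can be counted per device from a saved log. This is a minimal sketch: the function name `count_hyperv_drm_errors` and the saved log file are assumptions for illustration, and it matches only the exact message text shown above.

```shell
# Hypothetical helper (not part of the original report): count
# "Unable to send packet via vmbus" errors per hyperv_drm device ID.
# Usage: count_hyperv_drm_errors <logfile>
count_hyperv_drm_errors() {
    grep 'hyperv_drm .*ERROR Unable to send packet via vmbus' "$1" |
        sed -E 's/^.*hyperv_drm ([0-9a-f-]+):.*$/\1/' |  # keep only the device ID
        sort | uniq -c |                                  # count per device
        awk '{ print $2, $1 }'                            # print "<device-id> <count>"
}
```

Each output line has the device ID followed by the number of matching errors. A log with no matching lines produces no output.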
Is blocked by: RHEL-7226 [Azure][ARM64][RHEL8] sysrq reboot waits a long time for VMBus unloading