-
Bug
-
Resolution: Won't Do
-
Critical
-
None
-
rhelai-1.5, rhelai-1.5.1
-
None
-
False
-
-
False
-
-
With status:
staged: null
booted:
  image:
    image:
      image: registry.stage.redhat.io/rhelai1/bootc-azure-amd-rhel9:1.5
      transport: registry
    version: 9.20250429.0
    timestamp: null
    imageDigest: sha256:2be4aeff6aaa4df6d8af66ac01e618c300f8551b290b2eb292fce709d58024d6
  cachedUpdate: null
  incompatible: false
  pinned: false
  store: ostreeContainer
  ostree:
    checksum: 5dcdaa5924428c390ba317d0ff44743876afd653a2f199cf3a5f2093a57ca783
    deploySerial: 0
[cloud-user@lab-mi300x ~]$ sudo podman images
REPOSITORY                                        TAG    IMAGE ID      CREATED     SIZE     R/O
registry.redhat.io/rhelai1/instructlab-amd-rhel9  1.5.0  a6bc07dafea8  8 days ago  38.3 GB  true
[cloud-user@lab-mi300x ~]$ podman run -it --rm --entrypoint "" registry.redhat.io/rhelai1/instructlab-amd-rhel9:1.5.0 /bin/bash
(app-root) /$ rocm-smi
WARNING: No AMD GPUs specified
===================================== ROCm System Management Interface =====================================
=============================================== Concise Info ===============================================
Device  Node  IDs          Temp    Power  Partitions          SCLK  MCLK  Fan  Perf  PwrCap  VRAM%  GPU%
              (DID, GUID)  (Edge)  (Avg)  (Mem, Compute, ID)
============================================================================================================
============================================================================================================
=========================================== End of ROCm SMI Log ============================================
(app-root) /$
Output from dmesg:
...
[ 1700.518583] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1700.526980] watchdog: BUG: soft lockup - CPU#48 stuck for 104s! [kworker/48:1:787]
[ 1700.530326] Modules linked in: nft_counter xt_owner xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink rfkill ext4 mbcache jbd2 mlx5_ib intel_rapl_msr intel_rapl_common intel_uncore_frequency_common ib_uverbs nfit vfat fat ib_core libnvdimm amdgpu(OE+) mlx5_core mlxfw rapl psample hyperv_drm hv_netvsc pcspkr tls drm_shmem_helper hv_utils hv_balloon joydev amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amd_sched(OE) amdkcl(OE) drm_exec video wmi i2c_algo_bit drm_suballoc_helper drm_display_helper cec drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm zram overlay erofs loop xfs libcrc32c nvme nvme_core nvme_common sd_mod t10_pi sg hv_storvsc serio_raw pci_hyperv scsi_transport_fc hid_hyperv hyperv_keyboard pci_hyperv_intf crct10dif_pclmul crc32_pclmul crc32c_intel hv_vmbus ghash_clmulni_intel dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1700.530373] CPU: 48 PID: 787 Comm: kworker/48:1 Tainted: G W OEL X ------- --- 5.14.0-427.65.1.el9_4.x86_64 #1
[ 1700.530376] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 03/08/2024
[ 1700.530378] Workqueue: events work_for_cpu_fn
[ 1700.530384] RIP: 0010:delay_halt_tpause+0x16/0x20
[ 1700.530389] Code: cc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8d 04 37 31 c9 48 89 c2 48 c1 ea 20 66 0f ae f1 <c3> cc cc cc cc 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90
[ 1700.530391] RSP: 0018:ff7ac631db683c28 EFLAGS: 00000202
[ 1700.530392] RAX: 00000318d09beee4 RBX: 00000318d09be713 RCX: 0000000000000000
[ 1700.530393] RDX: 0000000000000318 RSI: 00000000000007d1 RDI: 00000318d09be713
[ 1700.530394] RBP: 00000000000007d1 R08: 0000000000000005 R09: 0000000000000005
[ 1700.530395] R10: ff7ac631db683ad0 R11: ffffffffbe3505c0 R12: 00000000000528a6
[ 1700.530396] R13: 0000000000000001 R14: 0000000000000005 R15: ff2a076a64800000
[ 1700.530396] FS: 0000000000000000(0000) GS:ff2a084c62000000(0000) knlGS:0000000000000000
[ 1700.530397] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1700.530399] CR2: 0000559f142e3180 CR3: 000004b512210005 CR4: 0000000000371ee0
[ 1700.530399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1700.530400] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1700.530401] Call Trace:
[ 1700.530402] <IRQ>
[ 1700.530405] ? show_trace_log_lvl+0x1c4/0x2df
[ 1700.530409] ? show_trace_log_lvl+0x1c4/0x2df
[ 1700.530411] ? delay_halt.part.0+0x36/0x60
[ 1700.530412] ? watchdog_timer_fn+0x1b2/0x210
[ 1700.530416] ? __pfx_watchdog_timer_fn+0x10/0x10
[ 1700.530418] ? __hrtimer_run_queues+0x12a/0x2c0
[ 1700.530422] ? hrtimer_interrupt+0xfc/0x210
[ 1700.530424] ? __sysvec_hyperv_stimer0+0x2e/0x60
[ 1700.530428] ? sysvec_hyperv_stimer0+0x6d/0x90
[ 1700.530432] </IRQ>
[ 1700.530433] <TASK>
[ 1700.530433] ? asm_sysvec_hyperv_stimer0+0x16/0x20
[ 1700.530439] ? delay_halt_tpause+0x16/0x20
[ 1700.530440] ? __pfx_delay_halt_tpause+0x10/0x10
[ 1700.530441] delay_halt.part.0+0x36/0x60
[ 1700.530443] gmc_v9_0_flush_gpu_tlb+0x395/0x5e0 [amdgpu]
[ 1700.530662] amdgpu_gmc_flush_gpu_tlb+0xd1/0x270 [amdgpu]
[ 1700.530845] ? _raw_spin_unlock+0xa/0x30
[ 1700.530848] ? free_unref_page+0x170/0x1c0
[ 1700.530852] amdgpu_gart_invalidate_tlb.part.0+0x51/0xb0 [amdgpu]
[ 1700.531020] amdgpu_gart_unbind+0x90/0xd0 [amdgpu]
[ 1700.531181] amdgpu_ttm_backend_unbind+0x64/0xb0 [amdgpu]
[ 1700.531341] amdgpu_ttm_tt_unpopulate+0x12/0xc0 [amdgpu]
[ 1700.531500] amdttm_tt_unpopulate+0x25/0x70 [amdttm]
[ 1700.531506] ttm_device_clear_lru_dma_mappings+0x9e/0xe0 [amdttm]
[ 1700.531510] amdttm_device_clear_dma_mappings+0x2a/0x80 [amdttm]
[ 1700.531513] amdgpu_device_fini_hw+0x122/0x184 [amdgpu]
[ 1700.531787] amdgpu_driver_load_kms.cold+0x18/0x2e [amdgpu]
[ 1700.532041] amdgpu_pci_probe+0x1f1/0x410 [amdgpu]
[ 1700.532219] local_pci_probe+0x4f/0xa0
[ 1700.532222] work_for_cpu_fn+0x16/0x20
[ 1700.532226] process_one_work+0x1e5/0x3b0
[ 1700.532229] ? __pfx_worker_thread+0x10/0x10
[ 1700.532229] worker_thread+0x1c4/0x3a0
[ 1700.532231] ? __pfx_worker_thread+0x10/0x10
[ 1700.532231] kthread+0xe0/0x100
[ 1700.532235] ? __pfx_kthread+0x10/0x10
[ 1700.532236] ret_from_fork+0x2c/0x50
[ 1700.532241] </TASK>
[ 1700.628612] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1700.733352] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1700.838089] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1700.944749] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.051383] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.158065] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.264707] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.369432] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.474142] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.578884] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.683575] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.788275] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.893002] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1701.997690] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.102386] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.208985] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.315637] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.422274] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.528929] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.633635] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.738390] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.843103] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1702.947810] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.052971] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.157665] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.262384] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.367117] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.473760] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.580399] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.687051] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.793677] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1703.898391] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.003109] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.107812] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.212546] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.317296] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.422093] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.526859] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.631632] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.738312] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.845032] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1704.951713] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.058362] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.163141] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.267893] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.372668] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.477417] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.582190] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.686939] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.791719] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1705.896464] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.003124] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.109805] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.216528] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.323244] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.427983] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.532748] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.637495] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.742263] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.847020] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1706.951800] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.056571] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.161337] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.268003] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.374674] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.481368] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.588016] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.692769] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.797516] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1707.902280] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.007051] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.111842] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.216611] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.321376] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.426112] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.532769] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.639478] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.746134] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.852822] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1708.957572] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.062320] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.167038] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.271778] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.376546] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.481272] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.586028] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.690791] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.797519] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1709.904183] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.010863] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.117562] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.222363] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.327108] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.431889] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.536663] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.641404] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.746160] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.850916] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1710.955689] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.062351] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.169031] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.275714] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.382403] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.487154] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.591890] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.696631] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.801370] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1711.906135] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.010881] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.115620] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.220353] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.327037] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.433668] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.540390] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.647036] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.751762] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.856549] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1712.961246] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.065969] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.170662] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.275400] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.380090] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.484791] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.591445] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.698096] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.804736] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1713.911371] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.016084] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.120782] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.225486] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.330168] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.434850] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.539526] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.644225] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.748936] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.855537] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1714.962152] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.068758] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.175422] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.280143] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.384887] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.489579] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.594308] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.699063] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.803941] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1715.908674] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.013382] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.120036] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.226674] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.333331] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.439990] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.544781] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.649550] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.754336] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.859091] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1716.963856] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.068614] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.173358] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.278122] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.384830] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.491531] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.598163] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.704862] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.809600] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1717.914353] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.019106] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.123859] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.228594] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.333323] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.438071] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.542787] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.649459] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.756109] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.862775] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1718.969460] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.074214] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.178950] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.283674] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.388450] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.493179] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.597913] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.702630] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.807419] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1719.914066] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.020772] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.127463] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.234143] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.338884] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.443602] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.548401] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.653129] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.757882] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.862587] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1720.967339] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.072078] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.178770] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.285477] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.392118] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.498794] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.603520] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.708289] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.797985] INFO: task systemd-modules:1485 blocked for more than 122 seconds.
[ 1721.797993] Tainted: G W OEL X ------- --- 5.14.0-427.65.1.el9_4.x86_64 #1
[ 1721.797996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1721.797998] task:systemd-modules state:D stack:0 pid:1485 ppid:1 flags:0x00004006
[ 1721.798003] Call Trace:
[ 1721.798006] <TASK>
[ 1721.798009] __schedule+0x21b/0x550
[ 1721.798018] schedule+0x2d/0x70
[ 1721.798020] schedule_timeout+0x11f/0x160
[ 1721.798027] ? idr_get_free+0x233/0x2f0
[ 1721.798034] ? __send_ipi_one+0xfa/0x170
[ 1721.798041] __wait_for_common+0x93/0x1d0
[ 1721.798044] ? __pfx_schedule_timeout+0x10/0x10
[ 1721.798048] __flush_work.isra.0+0x160/0x230
[ 1721.798053] ? __pfx_wq_barrier_func+0x10/0x10
[ 1721.798058] work_on_cpu+0x69/0x90
[ 1721.798061] ? __pfx_work_for_cpu_fn+0x10/0x10
[ 1721.798063] ? __pfx_local_pci_probe+0x10/0x10
[ 1721.798068] pci_call_probe+0x12b/0x160
[ 1721.798074] pci_device_probe+0x7c/0x100
[ 1721.798078] ? driver_sysfs_add+0x59/0xc0
[ 1721.798083] really_probe+0xe1/0x390
[ 1721.798087] ? pm_runtime_barrier+0x50/0x90
[ 1721.798093] __driver_probe_device+0xd6/0x130
[ 1721.798096] driver_probe_device+0x1e/0x90
[ 1721.798099] __driver_attach+0xd2/0x1c0
[ 1721.798102] ? __pfx___driver_attach+0x10/0x10
[ 1721.798105] bus_for_each_dev+0x78/0xd0
[ 1721.798109] bus_add_driver+0xc2/0x1f0
[ 1721.798113] driver_register+0x70/0xd0
[ 1721.798117] ? __pfx_init_module+0x10/0x10 [amdgpu]
[ 1721.798623] do_one_initcall+0x44/0x210
[ 1721.798630] ? kmalloc_trace+0x25/0xa0
[ 1721.798634] do_init_module+0x5c/0x270
[ 1721.798639] __do_sys_init_module+0x12e/0x1b0
[ 1721.798644] do_syscall_64+0x5c/0x90
[ 1721.798651] ? do_user_addr_fault+0x1d6/0x6a0
[ 1721.798654] ? syscall_exit_to_user_mode+0x19/0x40
[ 1721.798660] ? exc_page_fault+0x62/0x150
[ 1721.798663] entry_SYSCALL_64_after_hwframe+0x77/0xe1
[ 1721.798670] RIP: 0033:0x7f2544688f3e
[ 1721.798711] RSP: 002b:00007ffc0c010948 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 1721.798715] RAX: ffffffffffffffda RBX: 000055956b763460 RCX: 00007f2544688f3e
[ 1721.798717] RDX: 00007f25447b232c RSI: 0000000001b8d8b5 RDI: 00007f253f885010
[ 1721.798719] RBP: 00007f253f885010 R08: 000055956b769710 R09: 0000000001b8d000
[ 1721.798720] R10: 0000000000000005 R11: 0000000000000246 R12: 00007f25447b232c
[ 1721.798722] R13: 000055956b7630f0 R14: 0000000000000006 R15: 000055956b76a000
[ 1721.798725] </TASK>
[ 1721.813010] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1721.917735] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.022468] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.127226] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.231937] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.336683] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.443381] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.550047] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.656729] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.763460] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.868251] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1722.972974] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.077760] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.182501] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.287280] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.392433] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.497177] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.601950] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.708607] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.815327] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1723.921987] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.028686] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.133460] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.238248] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.343043] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.447836] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.552637] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.657422] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.762221] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.866981] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1724.973673] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1725.080360] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1725.187062] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1725.293751] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* Timeout waiting for VM flush ACK!
[ 1725.294376] amdgpu: probe of 0009:00:00.0 failed with error -22
[ 1725.347607] [drm] amdgpu: ttm finalized
[ 3587.403934] SELinux: Context unconfined_u:object_r:invalid_bootcinstall_testlabel_t:s0 is not valid (left unmapped).
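For triage, the repeated entries in the dmesg output can be condensed into one row per distinct message. The following is a minimal sketch, not part of any tool named in this report: `summarize`, `STAMP`, and `sample` are hypothetical names, and it assumes the default dmesg timestamp format (seconds since boot in square brackets). It splits on the timestamp prefix, so it works whether entries are one per line or run together.

```python
import re

# Splits on the default dmesg timestamp prefix, e.g. "[ 1700.518583] ".
# Assumption: timestamps are seconds since boot inside square brackets.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\] ")


def summarize(text):
    """Map each distinct message to (count, first_timestamp, last_timestamp)."""
    parts = STAMP.split(text)
    # With one capture group, split() yields [prefix, ts1, msg1, ts2, msg2, ...].
    stats = {}
    for ts, msg in zip(parts[1::2], parts[2::2]):
        ts, msg = float(ts), msg.strip()
        count, first, _ = stats.get(msg, (0, ts, ts))
        stats[msg] = (count + 1, first, ts)
    return stats


# Three entries copied from the dmesg output in this report.
sample = (
    "[ 1700.518583] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* "
    "Timeout waiting for VM flush ACK! "
    "[ 1700.628612] [drm:amdgpu_gmc_flush_gpu_tlb [amdgpu]] *ERROR* "
    "Timeout waiting for VM flush ACK!\n"
    "[ 1725.294376] amdgpu: probe of 0009:00:00.0 failed with error -22\n"
)

for msg, (count, first, last) in summarize(sample).items():
    print(f"{count:4d}x  {first:.6f}..{last:.6f}  {msg}")
```

Feeding it the full log (for example the saved output of `dmesg`) would show how long the flush timeouts persisted before the probe failed.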
- is related to
-
AIPCC-1524 AMD GPUs stopped working entirely on IBM Cloud
-
- Closed
-