- Bug
- Resolution: Duplicate
- Normal
- rhel-10.2
- Yes
- Important
- 1
- rhel-virt-hwe-arm-1
- 23
- 27
- None
- QE ack, Dev ack
- False
- False
- No
- Split items
- None
- None
- Unspecified Release Note Type - Unknown
- Unspecified
- Unspecified
- Unspecified
- aarch64
- None
- Merge Request passes all submitter checks, Merge Request finished CI testing, Merge Request passed CI testing, Merge Request approved by peer review
What were you trying to do that didn't work?
The qemu-kvm process on the host is terminated by a SIGBUS signal during non-fatal memory error injection, causing the guest VM to reboot.
What is the impact of this issue to you?
The memory RAS feature is broken: a non-fatal memory error injected on the host reboots the guest instead of being reported to it.
Please provide the package NVR for which the bug is seen:
Using RHEL-10.2-20251118.1 BaseOS aarch64
qemu-kvm-10.1.0-8.el10.aarch64
libvirt-11.10.0-1.el10.aarch64
kernel 6.12.0-171.el10.aarch64 (both 4k and 64k page-size kernels)
How reproducible is this bug?
100%
Steps to reproduce
- The procedure has many steps; follow the linked document. The failing case is error type 0x10 (memory uncorrectable, non-fatal). https://docs.google.com/document/d/1vboOkC7I8WlTItgKpSDwuegY3HniCkh7G6YY5NA2m7Q/edit?tab=t.0#heading=h.rrbpzx9u90i5
[root@ampere-mtsnow-altramax-37 /]# ./einj.sh 0x801540f7000 0x10
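For readers checking the arguments passed to `einj.sh`, here is a minimal Python sketch that decodes them. The contents of `einj.sh` are not shown in this report, so treating the first argument as a physical address and the second as an ACPI EINJ error type is an assumption. The helper names `describe_error_type` and `page_frame_number` are hypothetical. The error-type table follows the Linux kernel's APEI EINJ documentation, and the default `page_shift=12` assumes a 4k-page kernel (a 64k kernel would use 16).

```python
# EINJ error-type bits as listed in the Linux kernel APEI EINJ documentation.
EINJ_ERROR_TYPES = {
    0x01: "Processor Correctable",
    0x02: "Processor Uncorrectable non-fatal",
    0x04: "Processor Uncorrectable fatal",
    0x08: "Memory Correctable",
    0x10: "Memory Uncorrectable non-fatal",
    0x20: "Memory Uncorrectable fatal",
}


def describe_error_type(value: int) -> str:
    """Return the EINJ name for a single error-type value (hypothetical helper)."""
    return EINJ_ERROR_TYPES.get(value, f"unknown error type {value:#x}")


def page_frame_number(phys_addr: int, page_shift: int = 12) -> int:
    """Convert a physical address to a page frame number.

    page_shift=12 assumes 4k pages; this is an assumption about the host kernel.
    """
    return phys_addr >> page_shift


if __name__ == "__main__":
    # Arguments taken from the einj.sh invocation in this report.
    addr, err = 0x801540F7000, 0x10
    print(describe_error_type(err))
    print(hex(page_frame_number(addr)))
```

With 4k pages, the computed page frame number can be compared against the `page:` and `Memory failure:` values in the host `dmesg` output to confirm that the reported error is the injected one.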
Expected results
The injected memory error is detected by QEMU and reported to the guest kernel without causing the guest to reboot.
Actual results
The injected memory error caused the guest to reboot.
Key lines from running `dmesg -w` on the host machine:
[ 874.423343] EDAC MC0: 1 UE multi-bit ECC on unknown memory (node:0 card:2 page:0x801540f7 offset:0x0 grain:1 - APEI location: node:0 card:2 status(0x0000000000000400): Storage error in DRAM memory)
[ 874.446545] Memory failure: 0x801540f7: Sending SIGBUS to qemu-kvm:4470 due to hardware memory corruption
[ 874.456118] Memory failure: 0x801540f7: recovery action for dirty LRU page: Recovered
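When triaging repeated runs, the SIGBUS line can be picked out of a captured host log automatically. The following is a rough Python sketch, not part of the original reproduction procedure. The function name `find_sigbus_events` is hypothetical, and the regular expression is derived only from the message format shown above, so it may need adjusting for other kernel versions.

```python
import re

# Pattern built from the "Memory failure" line quoted in this report.
SIGBUS_RE = re.compile(
    r"Memory failure: (?P<pfn>0x[0-9a-f]+): Sending SIGBUS to "
    r"(?P<comm>[^:\s]+):(?P<pid>\d+) due to hardware memory corruption"
)


def find_sigbus_events(log_text: str):
    """Return (pfn, process name, pid) for each SIGBUS line found in log_text."""
    events = []
    for line in log_text.splitlines():
        match = SIGBUS_RE.search(line)
        if match:
            events.append(
                (int(match["pfn"], 16), match["comm"], int(match["pid"]))
            )
    return events


# Sample input copied from the dmesg output in this report.
SAMPLE = """\
[ 874.446545] Memory failure: 0x801540f7: Sending SIGBUS to qemu-kvm:4470 due to hardware memory corruption
[ 874.456118] Memory failure: 0x801540f7: recovery action for dirty LRU page: Recovered
"""

if __name__ == "__main__":
    for pfn, comm, pid in find_sigbus_events(SAMPLE):
        print(f"pfn={pfn:#x} process={comm} pid={pid}")
```

Only the first sample line matches; the "recovery action" line is skipped because it does not report a signal being sent.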
- relates to RHEL-135143 [RHEL10.2] Guest kernel crashes due to memory error injection