-
Bug
-
Resolution: Unresolved
-
Normal
-
rhel-10.2
-
Yes
-
Important
-
1
-
rhel-virt-hwe-arm-1
-
23
-
27
-
None
-
QE ack, Dev ack
-
False
-
False
-
-
No
-
Split items
-
Pass
-
-
None
-
Unspecified Release Note Type - Unknown
-
Unspecified
-
Unspecified
-
Unspecified
-
-
aarch64
-
None
-
Merge Request passes all submitter checks, Merge Request finished CI testing, Merge Request passed CI testing, Merge Request approved by peer review
What were you trying to do that didn't work?
The latest RHEL10.2 guest kernel crashes due to memory error injection
What is the impact of this issue to you?
The memory RAS feature is broken
Please provide the package NVR for which the bug is seen:
host: 6.12.0-170.el10.aarch64
guest: 6.12.0-170.el10.aarch64
qemu: qemu-kvm-10.1.0-5.el10
How reproducible is this bug?:
Steps to reproduce
- Provision the host and the guest, both running kernel 6.12.0-170.el10.aarch64
- On the guest, build the 'victim' binary
guest$ git clone git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
guest$ cd mce-test/tools; make; make install
guest$ cp ../bin/victim ~/victim
- On the host, build the 'test' binary (source code attached)
host$ gcc test.c -o test; cp test ~/test
- Start the guest with 4GB of memory and one NUMA node
/home/gavin/sandbox/qemu.rhel/build/qemu-system-aarch64 \
-accel kvm -machine virt-rhel10.2.0,gic-version=host,nvdimm=on,ras=on \
-cpu host -smp maxcpus=8,cpus=8,sockets=2,clusters=2,cores=2,threads=1 \
-m 4096M,slots=16,maxmem=128G \
-object memory-backend-ram,id=mem0,size=4096M \
-numa node,nodeid=0,cpus=0-7,memdev=mem0 \
-L /home/gavin/sandbox/qemu.rhel/build/pc-bios \
-monitor none -serial mon:stdio -nographic -gdb tcp::6666 \
-qmp tcp:localhost:5555,server,wait=off \
-bios /home/gavin/sandbox/qemu.rhel/build/pc-bios/edk2-aarch64-code.fd \
-boot c \
-device pcie-root-port,bus=pcie.0,chassis=1,id=pcie.1 \
-device pcie-root-port,bus=pcie.0,chassis=2,id=pcie.2 \
-drive file=/home/gavin/sandbox/images/disk.qcow2,if=none,id=drive0 \
-device virtio-blk-pci,id=virtblk0,bus=pcie.1,drive=drive0,num-queues=4 \
-netdev tap,id=tap1,vhost=true,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown \
-device virtio-net-pci,bus=pcie.2,netdev=tap1,mac=52:54:00:f1:26:b0
- On the guest, execute '~/victim -d'
guest$ ~/victim -d
physical address of (0xffff96a1e000) = 0x126002000
Hit any key to trigger error:
- On the host, execute '~/test 0x126002000'
host$ ~/test 0x126002000
- On the guest, press the Enter key to continue the execution of 'victim'. The guest kernel then crashes, and the following kernel log is found in '/var/crash/xxxx'.
[ 209.148986] Unable to handle kernel write to read-only memory at virtual address ffff800080065008
[ 209.148991] Mem abort info:
[ 209.148992] ESR = 0x000000009600004f
[ 209.148993] EC = 0x25: DABT (current EL), IL = 32 bits
[ 209.148995] SET = 0, FnV = 0
[ 209.148996] EA = 0, S1PTW = 0
[ 209.148996] FSC = 0x0f: level 3 permission fault
[ 209.148997] Data abort info:
[ 209.148998] ISV = 0, ISS = 0x0000004f, ISS2 = 0x00000000
[ 209.148999] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 209.149000] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 209.149001] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000dbcde000
[ 209.149003] [ffff800080065008] pgd=10000001001d9403, p4d=10000001001d9403, pud=10000001001da403, pmd=10000001001db403, pte=006000013c750f83
[ 209.149007] Internal error: Oops: 000000009600004f [#1] SMP
[ 209.149010] Modules linked in: rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables vfat fat nfit libnvdimm fuse loop vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs nvme_tcp nvme_fabrics nvme_core nvme_keyring nvme_auth crct10dif_ce virtio_net ghash_ce sha2_ce sha256_arm64 net_failover sha1_ce failover virtio_blk dm_mirror dm_region_hash dm_log dm_mod nfnetlink
[ 209.149040] CPU: 3 UID: 0 PID: 1857 Comm: victim Kdump: loaded Not tainted 6.12.0-170.el10.aarch64 #1 PREEMPT(voluntary)
[ 209.149043] Hardware name: Red Hat KVM, BIOS edk2-stable202408-prebuilt.qemu.org 08/13/2024
[ 209.149044] pstate: 604001c5 (nZCv dAIF +PANUAO -TCO -DIT -SSBS BTYPE=-)
[ 209.149046] pc : acpi_os_write_memory+0x130/0x1a0
[ 209.149052] lr : acpi_os_write_memory+0x2c/0x1a0
[ 209.149054] sp : ffff80008866bc50
[ 209.149055] x29: ffff80008866bc50 x28: ffff0000c8cac440 x27: 00000000000000c4
[ 209.149057] x26: ffffc0d6d64d9298 x25: ffffc0d6d48a7688 x24: ffff800080695018
[ 209.149059] x23: ffff80008866bd14 x22: 0000000000000008 x21: 0000000000000040
[ 209.149061] x20: 0000000000000001 x19: 000000013c750008 x18: 0000000000000000
[ 209.149063] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 209.149064] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 209.149067] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc0d6d3f94cdc
[ 209.149068] x8 : 0000000000000020 x7 : 000000013c750008 x6 : ffffc0d6d6138fc0
[ 209.149070] x5 : 000000013c751000 x4 : 0000000000000008 x3 : ffff0000c0c50960
[ 209.149072] x2 : 0000000000000040 x1 : ffff0000c8cac440 x0 : ffff800080065008
[ 209.149074] Call trace:
[ 209.149075] acpi_os_write_memory+0x130/0x1a0 (P)
[ 209.149078] apei_write+0xcc/0xe8
[ 209.149082] ghes_clear_estatus.part.0+0xc8/0xe0
[ 209.149084] ghes_in_nmi_queue_one_entry+0x1e4/0x330
[ 209.149086] ghes_notify_sea+0x60/0x110
[ 209.149088] apei_claim_sea+0xa4/0x310
[ 209.149090] do_sea+0xa8/0xd0
[ 209.149093] do_mem_abort+0x48/0xa0
[ 209.149095] el0_da+0x48/0x160
[ 209.149099] el0t_64_sync_handler+0xd0/0xf0
[ 209.149101] el0t_64_sync+0x1ac/0x1b0
[ 209.149104] Code: 17ffffeb 710102bf 54000341 d50332bf (f9000014)
[ 209.149107] SMP: stopping secondary CPUs
[ 209.149774] Starting crashdump kernel...
[ 209.149775] Bye!
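The "Mem abort info" lines in the log above can be cross-checked by decoding the ESR value by hand. The following is a minimal sketch, assuming the Armv8-A ESR_EL1 layout for data aborts (EC in bits 31:26, IL in bit 25, ISS in bits 24:0); the helper names `bits` and `decode_esr` are illustrative and not part of any kernel or QEMU tooling.

```python
# Decode the ESR value printed in the oops (ESR = 0x000000009600004f).
# Field positions follow the Armv8-A data abort ISS encoding (assumed here).
ESR = 0x9600004F


def bits(value, hi, lo):
    """Return the bit field value[hi:lo] as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_esr(esr):
    """Split an ESR value into the fields that the kernel log prints."""
    iss = bits(esr, 24, 0)
    return {
        "EC": bits(esr, 31, 26),    # exception class
        "IL": bits(esr, 25, 25),    # 1 means a 32-bit instruction
        "ISS": iss,
        "ISV": bits(iss, 24, 24),
        "SET": bits(iss, 12, 11),
        "FnV": bits(iss, 10, 10),
        "EA": bits(iss, 9, 9),
        "CM": bits(iss, 8, 8),
        "S1PTW": bits(iss, 7, 7),
        "WnR": bits(iss, 6, 6),     # 1 means the abort was caused by a write
        "FSC": bits(iss, 5, 0),     # fault status code
    }


fields = decode_esr(ESR)
for name, value in fields.items():
    print(f"{name:6s} = {value:#x}")
```

With this decoding, EC is 0x25 (data abort from the current EL), WnR is 1, and FSC is 0x0f (level 3 permission fault), which agrees with the log: the fault is a kernel write that was denied by the page permissions.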
Expected results
The injected memory error is detected by QEMU and reported to the guest kernel without causing a guest kernel crash
Actual results
The injected memory error is detected by QEMU, but it causes the guest kernel to crash
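The "write to read-only memory" message can also be checked against the pte value that the log prints for the faulting address (pte=006000013c750f83). The following is a rough sketch, assuming the standard VMSAv8-64 stage 1 page descriptor layout with 4k pages and 48-bit addresses, as reported in the log; it ignores optional extensions, and the name `decode_pte` is illustrative only.

```python
# Decode the level 3 page descriptor printed for VA ffff800080065008.
# Bit positions assume the VMSAv8-64 stage 1 format with a 4k granule.
PTE = 0x006000013C750F83


def decode_pte(pte):
    """Extract the permission and address fields from a page descriptor."""
    return {
        "valid": pte & 0x1,
        "attr_index": (pte >> 2) & 0x7,
        "ap1_el0_access": (pte >> 6) & 0x1,  # AP[1]: 1 allows EL0 access
        "ap2_read_only": (pte >> 7) & 0x1,   # AP[2]: 1 means read-only
        "access_flag": (pte >> 10) & 0x1,
        "pxn": (pte >> 53) & 0x1,
        "uxn": (pte >> 54) & 0x1,
        "output_address": pte & 0x0000FFFFFFFFF000,  # bits 47:12
    }


fields = decode_pte(PTE)
for name, value in fields.items():
    print(f"{name:15s} = {value:#x}")
```

Under these assumptions, AP[2] is set and AP[1] is clear, so the page is mapped read-only for the kernel, and the output address is 0x13c750000. Adding the page offset 0x008 of the faulting virtual address gives 0x13c750008, which matches the value held in x19 and x7 in the register dump.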
- is related to RHEL-135457 [RHEL10.2] Guest gets rebooted due to memory error injection (Closed)