-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
rhel-9.6
-
Yes
-
Moderate
-
rhel-sst-arch-hw
-
ssg_platform_enablement
-
2
-
False
-
-
No
-
Red Hat Enterprise Linux
-
None
-
None
-
None
-
Unspecified Release Note Type - Unknown
-
-
aarch64
-
None
When running a guest on A64FX, we hit a corruption after several guest reboots. We have at least two different reproducers: one where the corruption occurs after more than a day (RHEL-22598), and another (RHEL-67106) where we generally hit it within tens of minutes. After further debugging at the QEMU level, we identified a code section in flatview_insert() that may be the cause of the corruption. If we comment out the memmove call and replace it with individual cell copies, we no longer hit the issue.
static void flatview_insert(FlatView *view, unsigned pos, FlatRange *range)
{
    int i = view->nr;

    if (view->nr == view->nr_allocated) {
        view->nr_allocated = MAX(2 * view->nr, 10);
        view->ranges = g_realloc(view->ranges,
                                 view->nr_allocated * sizeof(*view->ranges));
    }
#if 0
    memmove(view->ranges + pos + 1, view->ranges + pos,
            (view->nr - pos) * sizeof(FlatRange));
#else
    while (i > pos) {
        view->ranges[i] = view->ranges[i - 1];
        i--;
    }
#endif
    view->ranges[pos] = *range;
    memory_region_ref(range->mr);
    ++view->nr;
}
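The two variants of flatview_insert() above can also be compared outside QEMU. The following is a minimal sketch of such a check, not something taken from the QEMU or glibc test suites: Cell is a hypothetical stand-in for FlatRange (only its size matters here), and the function names and the MEMMOVE_CHECK_MAIN macro are made up for illustration. It opens a gap at every position of arrays of increasing length, once with memmove and once with element-wise copies, and counts the cases where the results differ. A clean run would not rule out the memmove routines, since the corruption in this ticket only shows up after guest reboots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's FlatRange; only the size matters. */
typedef struct {
    uint64_t words[8];
} Cell;

/* Fill every cell with a pattern that depends on its index and a seed. */
static void fill_cells(Cell *cells, size_t n, uint64_t seed)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t w = 0; w < 8; w++) {
            cells[i].words[w] = seed ^ ((i * 8 + w) * 0x9e3779b97f4a7c15ULL);
        }
    }
}

/* Open a gap at pos with memmove, like the disabled branch above. */
static void shift_with_memmove(Cell *cells, size_t nr, size_t pos)
{
    assert(pos <= nr);
    memmove(cells + pos + 1, cells + pos, (nr - pos) * sizeof(Cell));
}

/* Open the same gap with individual cell copies, like the workaround. */
static void shift_with_loop(Cell *cells, size_t nr, size_t pos)
{
    assert(pos <= nr);
    for (size_t i = nr; i > pos; i--) {
        cells[i] = cells[i - 1];
    }
}

/* Compare word by word so the check does not rely on memcmp. */
static int cells_equal(const Cell *a, const Cell *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t w = 0; w < 8; w++) {
            if (a[i].words[w] != b[i].words[w]) {
                return 0;
            }
        }
    }
    return 1;
}

/*
 * Returns the number of (nr, pos) combinations where the two shift
 * methods disagree, or -1 if allocation fails.
 */
static long count_memmove_mismatches(size_t max_nr)
{
    long mismatches = 0;
    Cell *a = malloc((max_nr + 1) * sizeof(Cell));
    Cell *b = malloc((max_nr + 1) * sizeof(Cell));

    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }
    for (size_t nr = 1; nr <= max_nr; nr++) {
        for (size_t pos = 0; pos <= nr; pos++) {
            fill_cells(a, max_nr + 1, nr);
            fill_cells(b, max_nr + 1, nr);
            shift_with_memmove(a, nr, pos);
            shift_with_loop(b, nr, pos);
            if (!cells_equal(a, b, max_nr + 1)) {
                mismatches++;
            }
        }
    }
    free(a);
    free(b);
    return mismatches;
}

#ifdef MEMMOVE_CHECK_MAIN
int main(void)
{
    printf("mismatches: %ld\n", count_memmove_mismatches(256));
    return 0;
}
#endif
```

Building with -DMEMMOVE_CHECK_MAIN gives a standalone program. With GCC it is probably worth building at -O0 or with -fno-tree-loop-distribute-patterns, because the optimizer may otherwise turn the copy loop back into a memmove call and the comparison would then test memmove against itself.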
So we wonder whether there could be something wrong with the memmove implementation on this A64FX hardware. After a discussion with fweimer@redhat.com, it appears that the selectors for memset/memcpy/memmove in the RHEL 9 glibc check MIDR for A64FX.
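For reference, the kind of MIDR test such a selector performs can be sketched as follows. This is an illustration and not the glibc source: the function names are hypothetical, and the A64FX identification values (implementer 0x46 for Fujitsu, part number 0x001) are stated from memory and should be checked against the glibc and kernel headers. On a Linux aarch64 host the raw register value is normally readable as a hex string from /sys/devices/system/cpu/cpu0/regs/identification/midr_el1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* MIDR_EL1 layout: implementer in bits [31:24], part number in [15:4]. */
static uint32_t midr_implementer(uint64_t midr)
{
    return (uint32_t)((midr >> 24) & 0xff);
}

static uint32_t midr_partnum(uint64_t midr)
{
    return (uint32_t)((midr >> 4) & 0xfff);
}

/* Assumed A64FX identification: implementer 0x46 ('F'), part 0x001. */
static bool midr_is_a64fx(uint64_t midr)
{
    return midr_implementer(midr) == 0x46 && midr_partnum(midr) == 0x001;
}

/* Parse the hex string exposed by sysfs, e.g. "0x00000000461f0010". */
static uint64_t parse_midr(const char *text)
{
    assert(text != NULL);
    return strtoull(text, NULL, 16);
}
```

For example, midr_is_a64fx(parse_midr("0x00000000461f0010")) is true for this illustrative value, while a value with implementer 0x41 is rejected. A test build that bypasses the A64FX routines would effectively make this check always fail.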
So this Jira ticket is a request to produce a test build with the A64FX string routines removed, so that glibc uses the generic implementation, just to see whether that makes the issue go away.
- blocks
-
RHEL-67106 Guest crash with single host cpu pinned
- In Progress
- is related to
-
RHEL-22598 qemu-kvm: /builddir/build/BUILD/qemu-8.2.0/include/qemu/int128.h:33: uint64_t int128_get64(Int128): Assertion `r == a' failed
- Planning