Bug
Resolution: Unresolved
rhel-8.10, rhel-10.1, rhel-9.7
Important
rhel-systemd
What were you trying to do that didn't work?
A customer hit a kernel panic during a system update because PID 1 crashed in the dynamic linking phase, before main() executed.
The panic caused file system corruption: most of the updated files had not yet reached persistent storage and existed only in the file system cache, leaving the system unusable and hard to recover.
After much digging, the crash was traced to a rogue symlink, /usr/lib64/libcrypto.so.10, pointing to an older libcrypto library. The updated systemd could not resolve its symbols against that library and crashed in the dynamic linker.
Since a crash of PID 1 panics the kernel, PID 1 needs to be hardened to survive such situations as far as possible.
To that end, I propose that before reexecuting, systemd spawns the new systemd binary as a child process and verifies that the child exits with status 0.
If the child fails to execute, systemd should cancel the reexec to avoid crashing the system.
I'm attaching a prototype, which for now has several caveats (see the comments in the patch).
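For illustration only, the pre-flight check described above could look roughly like the sketch below. It is not the attached prototype and not existing systemd code: the helper name `binary_can_start` is hypothetical, and which arguments to pass to the new binary (for example `--version`) is an assumption that would need to be settled in the real patch. The point is that a dynamic linker failure makes the child exit with a non-zero status before main() runs, which the parent can observe safely.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the attached patch): run `path` with `argv`
 * in a child process and report whether it exited normally with status 0.
 * A binary that fails in the dynamic linker never reaches main() and exits
 * with a non-zero status, so this returns false for it. */
static bool binary_can_start(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return false;           /* cannot run the test: treat as failure */

    if (pid == 0) {
        execv(path, argv);      /* only returns if the exec itself failed */
        _exit(127);
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);  /* retry if interrupted by a signal */

    if (r < 0)
        return false;

    /* Success only on a normal exit with status 0; a signal or any
     * non-zero exit status means the reexec should be cancelled. */
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A caller would run something like `binary_can_start("/usr/lib/systemd/systemd", argv)` with `argv` set to `{"systemd", "--version", NULL}` right before the reexec, and skip the reexec when it returns false. Treating "could not fork" as a failure is a conservative choice made for this sketch.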
What is the impact of this issue to you?
Customer system breaks when updating many packages
Please provide the package NVR for which the bug is seen:
All systemd releases, including upstream.
How reproducible is this bug?
Always, using the minimal reproducer below (it does not mimic the customer's actual case and is for demonstration purposes only)
Steps to reproduce
- Rename /usr/lib64/libcrypto.so.3 to /usr/lib64/libcrypto.so.3.orig
  # mv /usr/lib64/libcrypto.so.3 /usr/lib64/libcrypto.so.3.orig
- Send TERM to PID 1
  # kill -TERM 1
Expected results
No panic
Actual results
Panic:
[ 20.090735] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
[ 20.091281] CPU: 3 PID: 1 Comm: systemd Kdump: loaded Not tainted 5.14.0-570.64.1.el9_6.x86_64 #1
[ 20.091874] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-7.fc42 06/10/2025
...
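As a side note on reading the panic line: the `exitcode` value is a raw wait status, so it can be decoded with the standard `<sys/wait.h>` macros. The sketch below uses a hypothetical helper name, `describe_wait_status`, and is only meant to help interpret the log. Applied to 0x00007f00, it reports a normal exit with status 127, which matches the status the glibc dynamic loader uses when it fails to load a library or resolve symbols at startup.

```c
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: turn a raw wait status (such as the exitcode value
 * printed in the kernel panic line) into a readable description.
 * Returns the result of snprintf(). */
static int describe_wait_status(int status, char *buf, size_t len) {
    if (WIFEXITED(status))
        return snprintf(buf, len, "exited with status %d", WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    return snprintf(buf, len, "unrecognized wait status 0x%08x", (unsigned) status);
}
```

Calling `describe_wait_status(0x7f00, buf, sizeof buf)` fills `buf` with "exited with status 127".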
Additional info
To recover the crashed system (if the "mv" made it to persistent storage, which is usually not the case), boot with init=/bin/sh on the kernel command line and restore the symlink:
# mount -o rw,remount /
# mv /usr/lib64/libcrypto.so.3.orig /usr/lib64/libcrypto.so.3
# exec /usr/lib/systemd/systemd