- Bug
- Resolution: Done-Errata
- Major
- rhel-8.4.0
- systemd-239-82.el8
- None
- Moderate
- rhel-sst-cs-plumbers
- ssg_core_services
- 22
- 28
- 5
- False
- None
- Red Hat Enterprise Linux
- None
- Approved Exception
- Pass
- RegressionOnly
- None
What were you trying to do that didn't work?
This is the second time I have seen this in less than a week. Some customers still use an old release of systemd affected by memory corruption (typically KCS https://access.redhat.com/solutions/6369201).
I don't know if it's a coincidence, but in the last 2 cases I handled, systemd was not able to enter the freeze loop: freeze() first closes all file descriptors, and that step itself hangs because opendir_tail() allocates memory, and the allocation blocks forever on a malloc lock (frames #0 and #1 in the backtrace):
Core was generated by `/usr/lib/systemd/systemd'.
#0  __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
63      2: movl %edx, %eax
(gdb) bt
#0  __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1  0x00007f93ccd83c60 in __GI___libc_malloc (bytes=32816) at malloc.c:3071
#2  0x00007f93ccdc24aa in __alloc_dir (fd=3, close_fd=<optimized out>, flags=<optimized out>, statp=0x7ffe6e9665d0) at ../sysdeps/posix/opendir.c:118
#3  0x00007f93ccdc25ad in opendir_tail (fd=3) at ../sysdeps/posix/opendir.c:69
#4  0x00007f93ccdc2664 in __opendir (name=name@entry=0x7f93ce58b951 "/proc/self/fd") at ../sysdeps/posix/opendir.c:92
#5  0x00007f93ce4bcaf2 in close_all_fds (except=except@entry=0x0, n_except=n_except@entry=0) at ../src/basic/fd-util.c:198
#6  0x00007f93ce4e1a16 in freeze () at ../src/basic/process-util.c:986
#7  0x0000563db0d62e21 in freeze_or_reboot () at ../src/core/main.c:154
#8  0x0000563db0d63ec9 in crash (sig=<optimized out>) at ../src/core/main.c:250
#9  <signal handler called>
#10 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#11 0x00007f93ccd1fdb5 in __GI_abort () at abort.c:79
#12 0x00007f93ccd784e7 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f93cce87a0e "%s\n") at ../sysdeps/posix/libc_fatal.c:181
#13 0x00007f93ccd7f5ec in malloc_printerr (str=str@entry=0x7f93cce85b20 "corrupted double-linked list") at malloc.c:5374
#14 0x00007f93ccd7fe2c in unlink_chunk (p=p@entry=0x563db1413e80, av=<optimized out>) at malloc.c:1474
#15 0x00007f93ccd829ab in _int_malloc (av=av@entry=0x7f93cd0bdbc0 <main_arena>, bytes=bytes@entry=97) at malloc.c:4050
#16 0x00007f93ccd833af in _int_realloc (av=av@entry=0x7f93cd0bdbc0 <main_arena>, oldp=oldp@entry=0x563db13e52c0, oldsize=oldsize@entry=64, nb=nb@entry=112) at malloc.c:4612
[...]
IMHO we need to harden this piece of code to avoid the hang. I don't know whether this needs to be done on the glibc side, the systemd side, or both.
Even though current systemd is not subject to any known memory corruption issues, this hardening would still be welcome.
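To illustrate the kind of hardening meant here, below is a minimal sketch of closing descriptors from the crash path without allocating memory. It is not taken from the systemd or glibc sources: the helper name close_fd_range_no_malloc() and its first_fd/last_fd parameters are hypothetical, and the caller is assumed to pick an upper bound itself (for example a fixed constant) instead of reading /proc/self/fd through opendir(), which is where the malloc() call in the backtrace comes from.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, not an existing systemd or glibc function.
 * Closes every descriptor in [first_fd, last_fd] using only close(),
 * so it never enters malloc() and cannot block on a heap lock that the
 * interrupted code may still hold. Returns how many descriptors were
 * actually open and got closed. */
static int close_fd_range_no_malloc(int first_fd, int last_fd) {
    int saved_errno = errno;   /* keep errno intact for the caller */
    int closed = 0;

    for (int fd = first_fd; fd <= last_fd; fd++) {
        /* close() fails with EBADF for unused slots; that is expected
         * and simply ignored here. */
        if (close(fd) == 0)
            closed++;
    }

    errno = saved_errno;
    return closed;
}
```

A crash handler could call, say, close_fd_range_no_malloc(3, 1023) before entering its pause loop. The trade-off is that descriptors above the chosen bound stay open, whereas the directory-based approach finds them all but needs heap memory.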
Please provide the package NVR for which bug is seen:
systemd & glibc
How reproducible:
At least twice during the last week