-
Bug
-
Resolution: Done-Errata
-
Major
-
rhel-9.4
-
389-ds-base-2.6.1-1.el9
-
No
-
Important
-
ZStream
-
rhel-idm-ds
-
ssg_idm
-
26
-
0
-
False
-
False
-
-
Yes
-
None
-
Approved Blocker
-
Pass
-
RegressionOnly
-
Bug Fix
-
-
Done
-
-
x86_64
-
None
What were you trying to do that didn't work?
This issue typically occurs right after an IPA replica is deleted (ipa server-del).
The deletion triggers purging of the replica ID(s) used by the removed replica.
The purging thread consumes a high amount of CPU, and the LDAP server stops responding to requests.
Killing the LDAP server is sometimes the only way to recover.
A few customers have reported this behaviour; the common factor is RHEL 9.4.
Stack traces from two different servers:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1033655 dirsrv 20 0 1805076 546380 327680 R 99.9 9.4 2:25.51 ns-slapd
Thread 43 (Thread 0x7f0ba44af640 (LWP 1033655) "ns-slapd"):
#0 0x00007f0bc6015c75 in __db_tas_mutex_lock_int () at target:/lib64/libdb-5.3.so
#1 0x00007f0bc60d07fc in __db_cursor_int () at target:/lib64/libdb-5.3.so
#2 0x00007f0bc60d3481 in __dbc_idup () at target:/lib64/libdb-5.3.so
#3 0x00007f0bc60d3cb6 in __dbc_iget () at target:/lib64/libdb-5.3.so
#4 0x00007f0bc60e25e1 in __dbc_get_pp () at target:/lib64/libdb-5.3.so
#5 0x00007f0bc623e304 in bdb_dblayer_cursor_iterate () at target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
#6 0x00007f0bc815995c in _cl5Iterate () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#7 0x00007f0bc8159e3c in _cl5PurgeRID () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#8 0x00007f0bc815c98e in trigger_cl_purging_thread () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#9 0x00007f0bca76ebd4 in _pt_root () at target:/lib64/libnspr4.so
#10 0x00007f0bca089c02 in start_thread () at target:/lib64/libc.so.6
#11 0x00007f0bca10ec40 in clone3 () at target:/lib64/libc.so.6

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
994148 dirsrv 20 0 1668720 251560 137984 R 99.9 4.3 1:36.64 ns-slapd
Thread 42 (Thread 0x7fe57ba61640 (LWP 994148) "ns-slapd"):
#0 0x00007fe59e01db74 in __bamc_next () at target:/lib64/libdb-5.3.so
#1 0x00007fe59e0220dc in __bamc_get () at target:/lib64/libdb-5.3.so
#2 0x00007fe59e0d3d4d in __dbc_iget () at target:/lib64/libdb-5.3.so
#3 0x00007fe59e0e25e1 in __dbc_get_pp () at target:/lib64/libdb-5.3.so
#4 0x00007fe59e23e304 in bdb_dblayer_cursor_iterate () at target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
#5 0x00007fe59df5195c in _cl5Iterate () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#6 0x00007fe59df51e3c in _cl5PurgeRID () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#7 0x00007fe59df5498e in trigger_cl_purging_thread () at target:/usr/lib64/dirsrv/plugins/libreplication-plugin.so
#8 0x00007fe5a2761bd4 in _pt_root () at target:/lib64/libnspr4.so
#9 0x00007fe5a2089c02 in start_thread () at target:/lib64/libc.so.6
#10 0x00007fe5a210ec40 in clone3 () at target:/lib64/libc.so.6
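When triaging a full gdb thread dump, it can help to pick out only the threads that are inside the purge code path seen in the traces above. The following is a minimal sketch, assuming the dump uses the gdb header and frame layout shown in this report; the helper name `threads_in_function` is hypothetical and not part of any 389-ds tooling.

```python
import re
from typing import Dict, List

# gdb thread header, e.g.: Thread 43 (Thread 0x7f0ba44af640 (LWP 1033655) "ns-slapd"):
THREAD_HEADER = re.compile(r"Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)")
# gdb frame, e.g.: #7 0x00007f0bc8159e3c in _cl5PurgeRID () at target:/usr/lib64/...
FRAME = re.compile(r"#\d+\s+0x[0-9a-f]+ in (\S+) \(")


def threads_in_function(backtrace: str, function: str) -> Dict[int, List[str]]:
    """Map LWP id -> list of frame names for each thread whose stack contains `function`."""
    result: Dict[int, List[str]] = {}
    headers = list(THREAD_HEADER.finditer(backtrace))
    for i, header in enumerate(headers):
        # A thread's frames run from its header up to the next header (or end of text),
        # so this works on both multi-line and flattened dumps.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(backtrace)
        frames = FRAME.findall(backtrace[header.end():end])
        if function in frames:
            result[int(header.group(2))] = frames
    return result
```

Calling `threads_in_function(dump_text, "_cl5PurgeRID")` returns the LWP ids that can then be matched against the PID column of the per-thread `top` output.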
What is the impact of this issue to you?
The LDAP server becomes unresponsive.
Please provide the package NVR for which the bug is seen:
cat etc/redhat-release
Red Hat Enterprise Linux release 9.4 (Plow)
grep ^ipa installed-rpms
ipa-client-4.11.0-15.el9_4.x86_64 Mon Jun 24 12:26:48 2024
ipa-client-common-4.11.0-15.el9_4.noarch Mon Jun 24 12:26:32 2024
ipa-common-4.11.0-15.el9_4.noarch Mon Jun 24 12:26:48 2024
ipa-healthcheck-0.16-3.el9.noarch Mon May 6 11:54:23 2024
ipa-healthcheck-core-0.16-3.el9.noarch Mon May 6 11:53:01 2024
ipa-selinux-4.11.0-15.el9_4.noarch Mon Jun 24 12:26:40 2024
ipa-server-4.11.0-15.el9_4.x86_64 Mon Jun 24 12:27:10 2024
ipa-server-common-4.11.0-15.el9_4.noarch Mon Jun 24 12:26:32 2024
ipa-server-dns-4.11.0-15.el9_4.noarch Mon Jun 24 12:27:12 2024
How reproducible is this bug?:
Quite often on RHEL 9.4.
Steps to reproduce
- Delete an IPA server
- Check the CPU usage of the LDAP server on other replicas in the topology
- Check for the pattern "CleanAllRUV" in the LDAP errors log
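The log check in the last step can be scripted. This is a minimal sketch, assuming the errors log is plain text that can be read line by line; the helper name `find_cleanallruv_lines` is hypothetical, and the match is made case-insensitive as an assumption since the exact capitalisation in the log may vary.

```python
import re
from typing import Iterable, List

# Case-insensitive match for the pattern named in the reproduction steps.
CLEANALLRUV = re.compile(r"CleanAllRUV", re.IGNORECASE)


def find_cleanallruv_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention CleanAllRUV, with trailing newlines removed."""
    return [line.rstrip("\n") for line in lines if CLEANALLRUV.search(line)]
```

Pass it an open file handle for the instance's errors log. On a typical 389-ds installation that is /var/log/dirsrv/slapd-INSTANCE/errors, but the location is an assumption here and should be confirmed on the affected replica.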
Expected results
Working LDAP server.
Actual results
Unresponsive LDAP server.
- relates to: RHEL-64854 cleanallruv consums CPU and is slow (Closed)
- links to: RHBA-2024:144130 389-ds-base bug fix and enhancement update