Bug | Resolution: Won't Do | Minor | rhel-8.6.0 | Low | rhel-idm-sssd | ssg_idm
Description of problem:
The setup consists of an IPA server with 10K users:
$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=accounts,dc=example,dc=com" uid=* 1.1 | grep -c "^dn: "
Enter LDAP Password:
10006
$
Each user has its own private group, e.g.:
$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=groups,cn=accounts,dc=example,dc=com" cn=user_1234 1.1 description
Enter LDAP Password:
dn: cn=user_1234,cn=groups,cn=accounts,dc=example,dc=com
description: User private group for user_1234
$
All users belong to the ipausers group:
$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "cn=ipausers,cn=groups,cn=accounts,dc=example,dc=com" member | grep -c "^member: "
Enter LDAP Password:
10005
$
Almost every user has a home directory:
$ ls -1 /home | grep -c user_
9903
$
With enumeration enabled, SSSD sends paged searches to the LDAP server continuously:
$ grep ^enum /etc/sssd/sssd.conf
enumerate = true
$
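The grep above only shows that the option is set somewhere in the file. A minimal sketch, assuming the usual INI layout of sssd.conf, that reports which sections enable enumeration (the function name `list_enumerating_domains` is hypothetical, not part of SSSD):

```shell
# Print the name of every sssd.conf section that sets "enumerate = true".
# Usage: list_enumerating_domains /etc/sssd/sssd.conf
list_enumerating_domains() {
    awk '
        # Remember the current section name, e.g. "domain/example.com"
        /^[ \t]*\[/ {
            section = $0
            gsub(/^[ \t]*\[|\][ \t]*$/, "", section)
            next
        }
        # Report the section when enumerate is set to true (case-insensitive)
        /^[ \t]*enumerate[ \t]*=/ {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)
            sub(/[ \t]+$/, "", value)
            if (tolower(value) == "true") print section
        }
    ' "$1"
}
```

On the system in this report, the domain section for example.com would be expected in the output.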
LDAP access log excerpt:
================================================
[16/Oct/2022:16:48:32.117068689 +0100] conn=4504 op=12 SRCH base="cn=accounts,dc=example,dc=com" scope=2 filter="(&(objectClass=posixAccount)(uid=*)(uidNumber=*)(gidNumber=*))" attrs="objectClass uid userPassword uidNumber gidNumber gecos homeDirectory loginShell krbPrincipalName cn memberOf ipaUniqueID ipaNTSecurityIdentifier modifyTimestamp entryusn shadowLastChange shadowMin shadowMax shadowWarning shadowInactive shadowExpire shadowFlag krbLastPwdChange krbPasswordExpiration pwdattribute authorizedService accountexpires useraccountcontrol nsAccountLock host logindisabled loginexpirationtime loginallowedtimemap ipaSshPubKey ipaUserAuthType usercertificate;binary mail"
[16/Oct/2022:16:48:32.733454360 +0100] conn=4504 op=12 RESULT err=0 tag=101 nentries=1000 wtime=0.000415409 optime=0.616401060 etime=0.616812792 notes=U,P details="Partially Unindexed Filter,Paged Search" pr_idx=0 pr_cookie=0
[16/Oct/2022:16:48:32.868938487 +0100] conn=4504 op=13 SRCH base="cn=accounts,dc=example,dc=com" scope=2 filter="(&(objectClass=posixAccount)(uid=*)(uidNumber=*)(gidNumber=*))" attrs="objectClass uid userPassword uidNumber gidNumber gecos homeDirectory loginShell krbPrincipalName cn memberOf ipaUniqueID ipaNTSecurityIdentifier modifyTimestamp entryusn shadowLastChange shadowMin shadowMax shadowWarning shadowInactive shadowExpire shadowFlag krbLastPwdChange krbPasswordExpiration pwdattribute authorizedService accountexpires useraccountcontrol nsAccountLock host logindisabled loginexpirationtime loginallowedtimemap ipaSshPubKey ipaUserAuthType usercertificate;binary mail"
[16/Oct/2022:16:48:33.445037639 +0100] conn=4504 op=13 RESULT err=0 tag=101 nentries=1000 wtime=0.000337243 optime=0.576117464 etime=0.576449937 notes=U,P details="Partially Unindexed Filter,Paged Search" pr_idx=0 pr_cookie=0
================================================
These searches never stop. For instance, more than 30 minutes later:
$ grep "16/Oct/2022:17:27:" access | grep -c "notes=U,P"
24
$
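The single-minute count above can be extended to the whole log to show whether the search rate ever drops. A minimal sketch, assuming the access log timestamp format shown in the excerpt (the function name `count_paged_unindexed_per_minute` is hypothetical):

```shell
# Count RESULT lines flagged notes=U,P per minute.
# Reads a 389-ds access log on stdin and prints "<dd/Mon/yyyy:HH:MM> <count>".
count_paged_unindexed_per_minute() {
    awk '/ RESULT / && /notes=U,P/ {
        # $1 looks like "[16/Oct/2022:16:48:32.733454360"; keep it up to the minute
        minute = substr($1, 2, 17)
        count[minute]++
    }
    END { for (m in count) print m, count[m] }' | sort
    # The plain sort is chronological only within a single day.
}
```

Usage: `count_paged_unindexed_per_minute < access`. A rate that stays roughly constant long after SSSD started would match the behavior described here.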
The SSSD domain log keeps growing with the following messages:
================================================
...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): Error: 17 (File exists)
- ... skipping repetitive backtrace ...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): ldb_add failed: [Entry already exists](68)[Entry name=user_<XXX>@example.com,cn=users,cn=example.com,cn=sysdb already exists]
- ... skipping repetitive backtrace ...
(2022-10-16 17:32:25): [be[example.com]] [sysdb_create_ts_entry] (0x0040): Error: 17 (File exists)
(2022-10-16 17:32:25): [be[example.com]] [server_setup] (0x1f7c0): Starting with debug level = 0x0070
...
================================================
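To see which messages account for the log growth, the timestamps can be stripped so that identical messages collapse into one counted line. A minimal sketch, assuming the "(YYYY-MM-DD HH:MM:SS): " prefix seen in the excerpt above (the function name `summarize_sssd_log` is hypothetical):

```shell
# Rank SSSD log messages by frequency.
# Reads a domain log on stdin and prints "<count><TAB><message>", most frequent first.
summarize_sssd_log() {
    # Drop the leading "(date time): " prefix so repeated messages become identical
    sed -E 's/^\([0-9-]+ [0-9:]+\): //' |
        awk '{ count[$0]++ } END { for (m in count) printf "%d\t%s\n", count[m], m }' |
        sort -rn
}
```

Usage: `summarize_sssd_log < /var/log/sssd/sssd_example.com.log | head`, where the log file name is an assumption based on the domain name in this report.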
This will eventually fill all available disk space:
$ df -lk /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/rhel_unused-root 22529284 22508108 21176 100% /
$
$ du -sh /var/log/sssd/
11G /var/log/sssd/
$
$ date; service sssd stop ; rm -f /var/lib/sss/db/* /var/log/sssd/* ; service sssd start
Sun Oct 16 16:39:38 IST 2022
Redirecting to /bin/systemctl stop sssd.service
Redirecting to /bin/systemctl start sssd.service
$
$ df -lk /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/rhel_unused-root 22529284 11422644 11106640 51% /
$
CPU usage is high in both the SSSD backend and the LDAP server processes:
================================================
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
257035 root 20 0 667688 130424 21576 R 98.3 1.6 0:09.08 sssd_be
197719 dirsrv 20 0 1492560 295052 62544 S 1.3 3.7 461:33.43 ns-slapd
...
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
197719 dirsrv 20 0 1492560 295064 62544 S 84.4 3.7 461:42.90 ns-slapd
257124 root 20 0 596752 60432 19620 S 20.9 0.8 0:00.94 sssd_be
================================================
The SSSD cache is mounted on tmpfs:
$ grep sss /etc/fstab
tmpfs /var/lib/sss/db/ tmpfs size=300M,mode=0700,uid=sssd,gid=sssd,rootcontext=system_u:object_r:sssd_var_lib_t:s0 0 0
$
$ df -lk /var/lib/sss/db
Filesystem 1K-blocks Used Available Use% Mounted on
tmpfs 307200 30608 276592 10% /var/lib/sss/db
$
Version-Release number of selected component (if applicable):
$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.6 (Ootpa)
$
$ rpm -qa | grep sssd
sssd-2.6.2-4.el8_6.1.x86_64
sssd-client-debuginfo-2.6.2-4.el8_6.1.x86_64
sssd-common-2.6.2-4.el8_6.1.x86_64
sssd-ipa-2.6.2-4.el8_6.1.x86_64
sssd-krb5-2.6.2-4.el8_6.1.x86_64
sssd-debugsource-2.6.2-4.el8_6.1.x86_64
sssd-client-2.6.2-4.el8_6.1.x86_64
sssd-dbus-2.6.2-4.el8_6.1.x86_64
sssd-krb5-common-2.6.2-4.el8_6.1.x86_64
python3-sssdconfig-2.6.2-4.el8_6.1.noarch
sssd-nfs-idmap-2.6.2-4.el8_6.1.x86_64
sssd-tools-2.6.2-4.el8_6.1.x86_64
sssd-kcm-2.6.2-4.el8_6.1.x86_64
sssd-common-pac-2.6.2-4.el8_6.1.x86_64
sssd-ad-2.6.2-4.el8_6.1.x86_64
sssd-ldap-2.6.2-4.el8_6.1.x86_64
sssd-proxy-2.6.2-4.el8_6.1.x86_64
sssd-debuginfo-2.6.2-4.el8_6.1.x86_64
$
How reproducible:
Always.
Steps to Reproduce:
1. Create 10K IPA users
2. Enable SSSD enumeration
3. Restart SSSD
4. Check SSSD and LDAP logs
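Step 1 can be scripted. A minimal sketch that only prints the `ipa user-add` commands so they can be reviewed before being run; the user_<N> naming follows the user_1234 example in this report, while the first and last name values and the function name `print_user_add_commands` are hypothetical:

```shell
# Print one "ipa user-add" command per user, numbered from 1 to the given total.
# Usage: print_user_add_commands 10000
print_user_add_commands() {
    total=$1
    i=1
    while [ "$i" -le "$total" ]; do
        # --first and --last are required by ipa user-add; the values are placeholders
        printf 'ipa user-add user_%d --first=User --last=%d\n' "$i" "$i"
        i=$((i + 1))
    done
}
```

Piping the output to `sh` on a host with a valid admin Kerberos ticket would create the users; this was not how the original setup is stated to have been built.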
Actual results:
Constant high CPU usage and LDAP requests.
Expected results:
After collecting the initial data from LDAP and warming its caches, SSSD should perform fewer LDAP requests.
Additional info:
The LDAP IDL scan limit is set to 100K. Because nsslapd-pagedidlistscanlimit is 0, paged searches use the same limit:
$ grep idlistscanlimit /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif
nsslapd-idlistscanlimit: 100000
nsslapd-pagedidlistscanlimit: 0
$
$ ldapsearch -xLLL -D"cn=Directory Manager" -W -b "fqdn=XXX,cn=computers,cn=accounts,dc=example,dc=com" nsPagedIDListScanLimit
Enter LDAP Password:
dn: fqdn=XXX,cn=computers,cn=accounts,dc=example,dc=com
$
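The fallback described above can be written out explicitly. A minimal sketch, assuming that a paged limit of 0 (or an absent one) means the regular nsslapd-idlistscanlimit applies, and ignoring any per-entry nsPagedIDListScanLimit override (none is set on the host entry shown); the function name `effective_paged_idl_limit` is hypothetical:

```shell
# Print the IDL scan limit that paged searches effectively use.
# Reads "attribute: value" lines (e.g. grep output from dse.ldif) on stdin.
effective_paged_idl_limit() {
    awk -F': *' '
        tolower($1) == "nsslapd-idlistscanlimit"      { normal = $2 }
        tolower($1) == "nsslapd-pagedidlistscanlimit" { paged = $2 }
        END {
            # A paged limit of 0 or no paged limit falls back to the regular limit
            if (paged == "" || paged + 0 == 0) print normal
            else print paged
        }'
}
```

Usage: `grep idlistscanlimit /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif | effective_paged_idl_limit`, which for the values above prints 100000.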