Type: Bug
Resolution: Done-Errata
Priority: Major
Fix Version/s: rhel-10.0.z
Affects Version/s: rhel-9.4
Component/s: sssd
Labels:
- 0day
- SSSD-POST

Fixed in Build: sssd-2.10.2-3.el10_0.2
Regression: No
Severity: Important
Keywords: 0day
AssignedTeam: rhel-idm-sssd
Sub-System Group: ssg_idm
Story Points: 9
Blocked: False
Ready: False
Blocked Reason: None
Product Documentation Required: None
Sprint: None

Git Pull Request: https://github.com/SSSD/sssd/pull/7841
Preliminary Testing: Pass
Errata Link: https://errata.engineering.redhat.com/advisory/148258
Test Coverage: RegressionOnly
ProdDocsReview-CCS: Unspecified
ProdDocsReview-Dev: Unspecified
ProdDocsReview-QE: Unspecified

Experience:
Architecture: All
PX Impact Score:
SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:
Planning: None

What were you trying to do that didn't work?

We run a relatively large AD deployment with provider = ldap and are severely affected by this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1886492

What is the impact of this issue to you?

We tried to mitigate the issue with:

1. ignore_group_members = true
2. lower values of entry_cache_timeout, entry_cache_user_timeout, entry_cache_group_timeout
3. a lower value of ldap_purge_cache_timeout in conjunction with 2.
4. ldap_group_search_base filtering when possible

Despite these optimizations, hosts that are accessed by a larger number of users (NFS servers) still grow the database too quickly. This is the current performance of a user lookup once memcache expires; it gets progressively worse until sssd can no longer answer queries:

id user, db 22M -> 7.0s
id user, db 43M -> 14s
id user, db 100M -> 30s
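
The three measurements above look roughly linear in cache size. A minimal sketch of a least-squares fit over them follows, assuming "M" means megabytes and that the numbers are single samples from one host; the helper names fit_line and predicted_seconds are made up for illustration and are not part of sssd. Any extrapolation beyond 100M is a guess, since the report says lookups eventually stop returning at all.

```python
# Timings copied from the report: (cache size in MB, seconds for `id user`).
# Assumption: "M" in the report means megabytes.
REPORTED = [(22, 7.0), (43, 14.0), (100, 30.0)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predicted_seconds(size_mb, points=REPORTED):
    """Lookup time implied by the fitted line for a given cache size."""
    slope, intercept = fit_line(points)
    return slope * size_mb + intercept


if __name__ == "__main__":
    slope, intercept = fit_line(REPORTED)
    print(f"about {slope:.2f} s of lookup time per MB of cache")
    print(f"fitted intercept: {intercept:.2f} s")
    # Extrapolation only: assumes the trend stays linear past 100M.
    print(f"200M cache: about {predicted_seconds(200):.0f} s")
```

A fit like this only summarizes the reported samples; it says nothing about the cause, which the report attributes to the missing indexes discussed in the linked Bugzilla entry.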

Please provide the package NVR for which the bug is seen:

yum list installed | grep sssd
python3-sssdconfig.noarch 2.9.4-6.el9_4.1 @BaseOS
sssd.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-ad.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-client.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-common.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-common-pac.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-dbus.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-ipa.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-kcm.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-krb5.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-krb5-common.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-ldap.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-nfs-idmap.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-proxy.x86_64 2.9.4-6.el9_4.1 @BaseOS
sssd-tools.x86_64 2.9.4-6.el9_4.1 @BaseOS

How reproducible is this bug?:

Steps to reproduce

We were counting on purging the disk cache frequently enough with ldap_purge_cache_timeout, but we found that once the db reaches a certain size, the ldap purge process is unable to complete. It behaves as if it times out, and there is no detailed information even with the highest debug level on the ldap backend. As a result the db does not shrink. The purge is also a blocking operation that hangs queries while it runs, so running it frequently is less than ideal.

Ultimately, as the db grows further, sssd becomes unresponsive and the only way to recover is to delete the disk cache manually and restart the service.

We understand that the disk cache performance might be related to missing indexes, as described in https://bugzilla.redhat.com/show_bug.cgi?id=1886492, but it is not clear why that bug was marked as CLOSED WONTFIX or whether there is a plan to resolve it.

Expected results:

Would it be acceptable to have an option to disable the disk cache completely and rely exclusively on memcache, if that is not currently supported? Resolving the cache purge timeout issue would also help.

links to

Original RHBZ

RHBA-2025:148258 sssd update

Upstream ticket

Assignee: Alexey Tikhonov
Reporter: Shajith Arul Simon
Developer: Alexey Tikhonov
QA Contact: Shridhar Gadekar
Doc Contact: Louise McGarry
Votes: 0
Watchers: 17
Created: 2025/02/13 8:15 AM
Updated: 2025/09/13 3:34 PM
Resolved: 2025/05/13 4:03 PM
Next Planned Release Date: 2025/05/13
Release Date: 2025/05/13
