RHEL-153094

[RFE] Thread pool state mmap file with dsctl and cn=monitor exposure


    • Type: Story
    • Resolution: Unresolved
    • 389-ds-base
    • rhel-idm-ds

      Goal

      As an administrator whose server is completely unresponsive to LDAP queries due to thread pool saturation, I need a way to see real-time thread pool state without an LDAP connection, so I can diagnose the situation even when cn=monitor is unreachable.

      cn=monitor metrics require a worker thread to serve the LDAP query. During full thread pool saturation, a monitoring tool's query sits in the work queue behind everything else, so the metrics are unavailable exactly when they are needed.

      Create a MAP_SHARED mmap file, slapd-<instance>.threadpool, alongside the existing SNMP slapd.stats file. Define a thread_pool_mmap_t struct with a version field, max_workers, a heartbeat timestamp (CLOCK_MONOTONIC, updated by a slapi_eq_repeat_rel callback), the server PID, pool-level gauges (current/max work queue, current/max busy workers, ops initiated/completed, connection count), and per-worker slots (state, conn_id, op_id, start_ns).

      Additionally, reserve space in each per-worker slot for backtrace fields (bt_captured, bt_frame_count, bt_timestamp_ns, bt_frames[64]) – they stay zeroed until a later ticket populates them; this avoids a future format migration.
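
      The layout described above could be mirrored on the reader side roughly as follows. This is only a sketch using Python's ctypes: the class names WorkerSlot and PoolHeader, the field widths, the field order, and the explicit padding are all assumptions, since the ticket names the fields but does not fix their types. Only the field names and the 64-entry bt_frames array come from the ticket.

```python
import ctypes

BT_MAX_FRAMES = 64  # from the ticket: bt_frames[64]


class WorkerSlot(ctypes.Structure):
    """One per-worker slot. Field widths and order are assumptions."""
    _fields_ = [
        ("state", ctypes.c_uint32),
        ("bt_captured", ctypes.c_uint32),      # reserved, stays zero for now
        ("conn_id", ctypes.c_uint64),
        ("op_id", ctypes.c_int32),
        ("bt_frame_count", ctypes.c_uint32),   # reserved, stays zero for now
        ("start_ns", ctypes.c_uint64),
        ("bt_timestamp_ns", ctypes.c_uint64),  # reserved, stays zero for now
        ("bt_frames", ctypes.c_uint64 * BT_MAX_FRAMES),  # reserved
    ]


class PoolHeader(ctypes.Structure):
    """Pool-level part of the file. Field widths and order are assumptions."""
    _fields_ = [
        ("version", ctypes.c_uint32),
        ("max_workers", ctypes.c_uint32),
        ("heartbeat_ns", ctypes.c_uint64),     # CLOCK_MONOTONIC, nanoseconds
        ("server_pid", ctypes.c_int32),
        ("_pad", ctypes.c_uint32),             # keeps the gauges 8-byte aligned
        ("cur_work_queue", ctypes.c_uint64),
        ("max_work_queue", ctypes.c_uint64),
        ("cur_busy_workers", ctypes.c_uint64),
        ("max_busy_workers", ctypes.c_uint64),
        ("ops_initiated", ctypes.c_uint64),
        ("ops_completed", ctypes.c_uint64),
        ("connection_count", ctypes.c_uint64),
    ]


def file_size(max_workers: int) -> int:
    """Total file size: one header followed by max_workers slots."""
    return ctypes.sizeof(PoolHeader) + max_workers * ctypes.sizeof(WorkerSlot)
```

      Because the backtrace fields are already part of each slot, a later ticket that fills them in would not change sizeof(WorkerSlot), which is the point of reserving the space up front.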

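      Each worker writes only to its own slot using atomic stores – no cross-thread contention, no semaphores. Readers take no locks either: atomic reads of individual fields are sufficient for diagnostics, even if a snapshot is not perfectly consistent across fields.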

      Open the file with O_NOFOLLOW | O_RDWR | O_CREAT, permissions 0640. No bind DNs or client IPs – just numeric IDs. Unlink on clean shutdown.
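
      The open flags and mode above can be illustrated with the equivalent system calls from Python. The server itself would do this in C; the function name create_state_file and the explicit fchmod (so that the umask cannot narrow the mode) are assumptions made for this sketch, not part of the ticket.

```python
import mmap
import os


def create_state_file(path: str, size: int) -> mmap.mmap:
    """Create (or reopen) the state file and map it MAP_SHARED.

    O_NOFOLLOW makes open() fail if the final path component is a symlink.
    """
    flags = os.O_RDWR | os.O_CREAT | os.O_NOFOLLOW
    fd = os.open(path, flags, 0o640)
    try:
        os.fchmod(fd, 0o640)    # assumption: force 0640 regardless of umask
        os.ftruncate(fd, size)  # size the file before mapping it
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)            # the mapping stays valid after close


def remove_state_file(path: str) -> None:
    """Unlink on clean shutdown; a missing file is not an error."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
```

      A reader such as dsctl would open the same path with os.O_RDONLY and map it with PROT_READ only.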

      dsctl thread-pool-status opens the file read-only, reads the struct, and formats output: busy/max workers, queue depth, per-worker activity (state, conn_id, op_id, running duration). It warns if the heartbeat is older than 30 seconds. It checks that /proc/<pid>/comm is ns-slapd to guard against PID recycling after a crash.
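
      The two staleness checks described above might look like this on the dsctl side. This is a minimal sketch: the function names, the nanosecond unit of the heartbeat, and the proc_root parameter (added so the check can be exercised without a live server) are assumptions.

```python
import time
from pathlib import Path

STALE_AFTER_NS = 30 * 1_000_000_000  # 30 seconds, from the ticket


def heartbeat_is_stale(heartbeat_ns, now_ns=None):
    """True if the heartbeat is more than 30 seconds behind CLOCK_MONOTONIC."""
    if now_ns is None:
        now_ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)
    return now_ns - heartbeat_ns > STALE_AFTER_NS


def pid_is_ns_slapd(pid, proc_root="/proc"):
    """True if /proc/<pid>/comm exists and names ns-slapd.

    Guards against the recorded PID having been recycled after a crash.
    """
    try:
        comm = Path(proc_root, str(pid), "comm").read_text().strip()
    except OSError:
        return False  # no such process, or not readable
    return comm == "ns-slapd"
```

      If either check fails, the tool can still print the last recorded state, but should label it as possibly stale rather than live.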

      The same per-worker data should also be exposed on cn=monitor as a multi-valued attribute (like the existing connection attribute) – one source of truth, two access paths. Add the new attribute to the ACI's targetattr != exclusion list alongside connection.

      Acceptance criteria

      • Verify dsctl <instance> thread-pool-status displays pool-level metrics (busy workers, queue depth, ops, connections) and per-worker activity without using any LDAP connection
      • Verify the command works while the server is under full thread pool saturation (all workers busy, cn=monitor queries timing out)
      • Verify the mmap file is created at startup with correct permissions (0640) and unlinked on clean shutdown
      • Verify stale file detection works: dsctl warns when heartbeat is older than 30 seconds or PID does not match a running ns-slapd process
      • Verify the mmap file is opened with O_NOFOLLOW and cannot be redirected via symlink
      • Verify per-worker activity appears on cn=monitor as a multi-valued attribute
      • Verify the new cn=monitor attribute is excluded from anonymous access via ACI
      • Verify no measurable performance regression on the write path

              idm-ds-dev-bugs IdM DS Dev
              spichugi@redhat.com Simon Pichugin
              IdM DS Dev IdM DS Dev
              IdM DS QE IdM DS QE
              Evgenia Martyniuk Evgenia Martyniuk