Story
Resolution: Unresolved
rhel-idm-ds
Goal
As a support engineer who needs to see where threads are stuck in the code, I need a non-intrusive alternative to pstack that captures backtraces without freezing the process, so I can diagnose stuck threads on a production server under load.
Per-worker activity from RHEL-153094 tells you what each thread is doing (e.g., SEARCH on conn 1234 for 8 seconds) but not where in the code it is stuck. Today the only way to find out is pstack or gdb, both of which freeze the entire process via ptrace.
dsctl thread-pool-backtrace uses SIGUSR1 to trigger each worker to capture its own backtrace into its reserved mmap slot. The process does not freeze: each thread is interrupted by the signal, captures its stack, and resumes immediately.
We need to repurpose the SIGUSR1 handler as a backtrace handler. The handler must distinguish between the initial process-level signal (which starts coordination) and the per-worker signals (which capture a backtrace), for example by checking whether the receiving thread is a worker thread.
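The dispatch decision described above can be modelled in a few lines. This is only a Python sketch of the logic, not the ns-slapd handler itself, which would be written in C and restricted to async-signal-safe operations. The names `mark_as_worker`, `on_sigusr1`, and `run_in_worker`, as well as the returned strings, are hypothetical; the sketch assumes each worker marks itself through a thread-local flag at startup.

```python
import threading

# Hypothetical per-thread marker: worker threads set it once at startup.
_tls = threading.local()


def mark_as_worker():
    """Called by each worker thread when it starts (assumption)."""
    _tls.is_worker = True


def on_sigusr1():
    """Model of the handler's decision, based on the receiving thread."""
    if getattr(_tls, "is_worker", False):
        # Per-worker signal: capture this thread's own backtrace.
        return "capture-backtrace"
    # Initial process-level signal: start coordinating the workers.
    return "start-coordination"


def run_in_worker(fn):
    """Helper for the example: run fn on a thread marked as a worker."""
    result = []

    def body():
        mark_as_worker()
        result.append(fn())

    t = threading.Thread(target=body)
    t.start()
    t.join()
    return result[0]
```

Calling `on_sigusr1()` from an unmarked thread selects the coordination path, while `run_in_worker(on_sigusr1)` selects the capture path. A thread-local flag is one possible way to tell the two cases apart; comparing the thread ID against the worker table would be another.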
Acceptance criteria
- Verify dsctl <instance> thread-pool-backtrace produces per-thread backtraces with resolved function names and file locations
- Verify stuck threads (e.g., blocked in a plugin or on a mutex) are captured correctly – the backtrace shows the blocking call
- Verify idle threads show the expected connection_wait_for_new_work call chain
- Verify the process does not freeze during backtrace capture – concurrent operations continue without interruption
- Verify rate limiting: a second call within 5 seconds returns the previous backtrace data instead of re-triggering capture
- Verify PID validation: dsctl refuses to send the signal if /proc/<pid>/comm does not match ns-slapd
- Verify workers that don't respond within the timeout are reported as "unresponsive" rather than causing dsctl to hang
- Verify correct memory ordering: dsctl never reads incomplete frame data (acquire/release semantics on bt_captured)
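The PID validation, rate limiting, and timeout criteria above could look roughly like this on the dsctl side. This is a minimal sketch under stated assumptions, not the actual lib389 code: the function names, the `proc_root` parameter, the poll interval, and the `is_captured` callback (standing in for a read of a worker's `bt_captured` flag) are all hypothetical, and the sketch assumes the rate-limit decision is made from a last-capture timestamp.

```python
import time
from pathlib import Path

EXPECTED_COMM = "ns-slapd"
RATE_LIMIT_SECONDS = 5.0  # window taken from the acceptance criteria
POLL_INTERVAL = 0.01      # assumed polling interval, in seconds


def pid_is_ns_slapd(pid, proc_root="/proc"):
    """Return True only if <proc_root>/<pid>/comm names ns-slapd."""
    try:
        comm = Path(proc_root, str(pid), "comm").read_text().strip()
    except (OSError, ValueError):
        # Process gone or comm unreadable: refuse to send the signal.
        return False
    return comm == EXPECTED_COMM


def should_trigger_capture(last_capture, now):
    """Re-trigger only if there is no previous capture or it is 5 s old."""
    return last_capture is None or (now - last_capture) >= RATE_LIMIT_SECONDS


def wait_for_workers(is_captured, worker_ids, timeout,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until every worker has published its backtrace or time runs out.

    is_captured(worker_id) is a hypothetical callback that reads the
    worker's bt_captured flag. Workers still pending at the deadline are
    reported as "unresponsive" so the caller never hangs.
    """
    deadline = clock() + timeout
    pending = set(worker_ids)
    while True:
        pending -= {w for w in pending if is_captured(w)}
        if not pending or clock() >= deadline:
            break
        sleep(POLL_INTERVAL)
    return {w: ("unresponsive" if w in pending else "captured")
            for w in worker_ids}
```

Injecting `proc_root` and `clock` keeps the checks testable without a running ns-slapd. The acquire/release requirement on `bt_captured` is not modelled here; it has to be enforced where the flag is actually written and read.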