Overview:
See: https://redhat-internal.slack.com/archives/C09LUC1898W/p1761117388509809
for a detailed description of this issue.
TL;DR:
The change https://github.com/stackrox/stackrox/pull/14747 moved the logic that maps each policy to its assigned categories into the cursor query that walks all policies. When many sensors reconnect at once, each of them fetches the policies. In the specific case of the CS incident this led to 134 (secured clusters) * 150 (policies) = 20,100 DB requests issued within cursor transactions, which exhausted the connection pool.
Suggested Solution
- Move the logic that fills in category names back out of the policy walk
This needs to be backported to 4.8 and 4.9
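
The suggested change can be sketched roughly as follows. This is a minimal illustration, not the StackRox implementation: `Policy`, `fakeStore`, `Walk`, `CategoriesFor`, `fillInsideWalk` and `fillAfterWalk` are hypothetical names, and the in-memory store only counts lookups made while a walk is open instead of talking to a real database.

```go
package main

import "fmt"

// Policy is a hypothetical, simplified stand-in for the real policy type.
type Policy struct {
	ID         string
	Categories []string
}

// fakeStore is a hypothetical in-memory store. It counts how many category
// lookups are issued while a walk (the cursor) is still open.
type fakeStore struct {
	policies          []*Policy
	categories        map[string][]string // policy ID -> category names
	walkOpen          bool
	lookupsInsideWalk int
}

// newFakeStore builds a store with n policies, one category each.
func newFakeStore(n int) *fakeStore {
	s := &fakeStore{categories: make(map[string][]string)}
	for i := 0; i < n; i++ {
		id := fmt.Sprintf("policy-%d", i)
		s.policies = append(s.policies, &Policy{ID: id})
		s.categories[id] = []string{"category-" + id}
	}
	return s
}

// Walk simulates a cursor over all policies; the cursor stays open until
// the callback has been invoked for every policy.
func (s *fakeStore) Walk(fn func(p *Policy) error) error {
	s.walkOpen = true
	defer func() { s.walkOpen = false }()
	for _, p := range s.policies {
		if err := fn(p); err != nil {
			return err
		}
	}
	return nil
}

// CategoriesFor simulates a separate DB request for one policy.
func (s *fakeStore) CategoriesFor(id string) []string {
	if s.walkOpen {
		s.lookupsInsideWalk++
	}
	return s.categories[id]
}

// fillInsideWalk shows the problematic shape: one extra request per policy
// while the cursor is still open.
func fillInsideWalk(s *fakeStore) ([]*Policy, error) {
	var out []*Policy
	err := s.Walk(func(p *Policy) error {
		p.Categories = s.CategoriesFor(p.ID)
		out = append(out, p)
		return nil
	})
	return out, err
}

// fillAfterWalk shows the suggested shape: collect policies during the walk,
// then fill in category names once the cursor has been closed.
func fillAfterWalk(s *fakeStore) ([]*Policy, error) {
	var out []*Policy
	if err := s.Walk(func(p *Policy) error {
		out = append(out, p)
		return nil
	}); err != nil {
		return nil, err
	}
	for _, p := range out {
		p.Categories = s.CategoriesFor(p.ID)
	}
	return out, nil
}

func main() {
	inside, after := newFakeStore(150), newFakeStore(150)
	if _, err := fillInsideWalk(inside); err != nil {
		panic(err)
	}
	if _, err := fillAfterWalk(after); err != nil {
		panic(err)
	}
	// Lookups issued while the cursor was open, for each variant.
	fmt.Println(inside.lookupsInsideWalk, after.lookupsInsideWalk) // prints "150 0"
}
```

In this sketch the second variant still issues one lookup per policy, but none of them runs while the cursor is open. Whether the real fix should additionally batch the category lookups into a single query is left open here.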
- is caused by: ROX-31448 central: excessive use of connections and ignores transactions (In Progress)