Bug
Resolution: Unresolved
Major
Rox Sprint 4.11C
USER PROBLEM
What is the user experiencing as a result of the bug? Include steps to reproduce.
- When fake workloads are running, Sensor does not sync the pods and deployments. The informers appear stuck, and Sensor never sends a full sync to Central.
CONDITIONS
What conditions need to exist for a user to be affected? Is it everyone? Is it only those with a specific integration? Is it specific to someone with particular database content? etc.
- Simply: Run fake-workloads.
All commits after PR https://github.com/stackrox/stackrox/pull/18436 are affected, including the 4.10.0 release.
ROOT CAUSE
What is the root cause of the bug?
- Fake workloads are currently broken (sensor informers stuck) due to https://github.com/kubernetes/kubernetes/issues/135895.
- We are affected because we use k8s.io/client-go v0.35.2, in which the WatchListClient feature gate is enabled by default.
- It looks like the "breaking" change was introduced in https://github.com/stackrox/stackrox/pull/18436. We did not catch it because we do not run automated tests for fake workloads.
FIX
How was the bug fixed (this is more important if a workaround was implemented rather than an actual fix)?
- Workaround 1 (verified, works):
  - Disable the feature gate with: `kubectl -n stackrox set env deploy/sensor KUBE_FEATURE_WatchListClient=false`
- Workaround 2 (unverified):
  - Disable WatchListClient in Go code, only when fake workloads are active (small code change)
- Workaround 3: Wait for upstream fix in client-go v0.36.x
Verifications run:
- The newest unaffected nightly build is `4.10.x-nightly-20260113`. I tested it manually and fake workloads run correctly.
- The oldest affected nightly build is `4.10.x-nightly-20260114`. I tested it manually and fake workloads are broken.
- Confirmed that 4.10.0 is affected.