-
Bug
-
Resolution: Done
-
Minor
-
None
-
Quality / Stability / Reliability
-
2
-
False
-
-
False
-
-
-
GH Train-36
-
None
Multiple integration test suites fail intermittently because of race conditions, envtest suites that load every CRD instead of only the ones they need, and missing nil checks in test teardown.
Global Hub 1.7.0
Yes - intermittent failures in CI, especially under load.
Run integration tests repeatedly:
make integration-test
1. agent/status tests fail due to a data race on map access (goroutines write concurrently to map[string]*cloudevents.Event while the tests read it)
2. operator/controllers transporter test fails with 409 Conflict (concurrent Kafka CR modification)
3. envtest suites fail with CRD install timeouts when loading all 35+ CRDs unnecessarily
4. AfterSuite panics on nil testPostgres when BeforeSuite fails
All integration tests pass reliably without flaky failures.
Fix PR: https://github.com/stolostron/multicluster-global-hub/pull/2286
Changes:
- Replace map with sync.Map for concurrent event access in agent/status tests
- Wrap transporter EnsureKafka in an Eventually retry block to tolerate 409 conflicts
- Optimize each test suite to load only required CRDs instead of full directory (35+ CRDs)
- Add nil guards for testPostgres in AfterSuite across 4 operator/manager suites
- Remove StartEnvTestWithRetry utility in favor of CRD optimization approach
Generated with [Claude Code|https://claude.ai/claude-code]