-
Bug
-
Resolution: Unresolved
-
Major
-
4.20
-
Quality / Stability / Reliability
-
False
-
Proposed
Description of problem:
The current sync-wave configuration of the telco-hub reference-CRs contains inconsistencies that cause deployment failures and race conditions during ArgoCD rollouts.
Critical Problems:
- 42 resources have no sync-wave annotation, so they fall into the default wave 0 and their order relative to annotated resources is unintended
- MultiClusterHub deploys before MultiClusterEngine (reversed dependency order)
- Namespaced resources deploy before their namespaces exist
- Storage consumers deploy before storage validation completes
- Policies are scattered across unrelated sync-waves instead of running in one consistent phase
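The missing-annotation problem can be detected mechanically. Below is a minimal sketch, assuming the reference-CRs have already been loaded into Python dicts (for example with a YAML parser). The helper names and the sample manifests are hypothetical; only the annotation key `argocd.argoproj.io/sync-wave` and the default wave of 0 come from ArgoCD itself.

```python
SYNC_WAVE_KEY = "argocd.argoproj.io/sync-wave"


def sync_wave(manifest):
    """Return the effective wave; a missing annotation means wave 0."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return int(annotations.get(SYNC_WAVE_KEY, "0"))


def missing_sync_wave(manifests):
    """Return (kind, name) for every manifest without an explicit sync-wave."""
    missing = []
    for manifest in manifests:
        meta = manifest.get("metadata", {})
        annotations = meta.get("annotations") or {}
        if SYNC_WAVE_KEY not in annotations:
            missing.append((manifest.get("kind"), meta.get("name")))
    return missing


# Hypothetical sample data, not taken from the actual reference-CRs.
manifests = [
    {
        "kind": "Namespace",
        "metadata": {"name": "example-ns", "annotations": {SYNC_WAVE_KEY: "-45"}},
    },
    {
        "kind": "ConfigMap",
        "metadata": {"name": "example-cm", "namespace": "example-ns"},
    },
]

for kind, name in missing_sync_wave(manifests):
    print(f"{kind}/{name} has no sync-wave annotation")
```

Here the unannotated ConfigMap is reported, and `sync_wave` shows that it would be applied in wave 0 like any other unannotated resource.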
Impact:
- Unreliable deployments requiring manual intervention
- Extended deployment times due to retry loops
- Support escalations for "random" deployment failures
- Reduced confidence in GitOps automation
Proposed Solution:
Implement a structured 8-phase sync-wave ordering that follows Kubernetes-native resource dependencies:
- Registry Foundation (-50)
- Namespaces (-45)
- Namespaced Resources (-40)
- ArgoCD Resources (-35)
- Independent Custom Resources (-30)
- Policies and Validation (-25)
- Storage-Dependent Services (-10)
- ZTP Components (100)
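The phase list above could be kept as a single lookup table so that wave numbers are never hand-typed in individual files. The following is a sketch only: the phase keys and the `annotate` helper are hypothetical names, while the wave values are the ones proposed in this ticket. Annotation values are written as strings, since Kubernetes annotations must be strings.

```python
import copy

SYNC_WAVE_KEY = "argocd.argoproj.io/sync-wave"

# Wave numbers as proposed above; the dictionary keys are illustrative names.
PHASES = {
    "registry-foundation": -50,
    "namespaces": -45,
    "namespaced-resources": -40,
    "argocd-resources": -35,
    "independent-custom-resources": -30,
    "policies-and-validation": -25,
    "storage-dependent-services": -10,
    "ztp-components": 100,
}


def annotate(manifest, phase):
    """Return a copy of the manifest with the sync-wave for the given phase."""
    result = copy.deepcopy(manifest)
    meta = result.setdefault("metadata", {})
    annotations = meta.get("annotations") or {}
    annotations[SYNC_WAVE_KEY] = str(PHASES[phase])
    meta["annotations"] = annotations
    return result


# Hypothetical namespace manifest used only to demonstrate the helper.
namespace = {"kind": "Namespace", "metadata": {"name": "example-ns"}}
annotated = annotate(namespace, "namespaces")
print(annotated["metadata"]["annotations"])
```

The gaps between wave values leave room to insert an intermediate phase later without renumbering the existing ones.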
Acceptance Criteria:
- [ ] All 68 reference-CRs files have appropriate sync-wave annotations
- [ ] MultiClusterEngine (MCE) deploys before MultiClusterHub (MCH)
- [ ] All namespaces deploy before namespaced resources
- [ ] Storage validation completes before storage consumers deploy
- [ ] Consistent policy execution phase
- [ ] Documentation updated with sync-wave strategy
This should address long-standing deployment reliability issues and establish a foundation for predictable telco-hub rollouts.