Story
Resolution: Unresolved
As a cluster admin, I want the operator to automatically detect and recreate missing CA bundle configmaps in user namespaces, so that workloads continue to function correctly even if configmaps are accidentally deleted.
Acceptance Criteria
- Test that operator detects missing config-trusted-cabundle configmap even when namespace label indicates reconciliation is complete
- Test that operator detects missing config-service-cabundle configmap even when namespace label indicates reconciliation is complete
- Verify that operator recreates both configmaps when either is missing
- Verify that operator logs a warning message when missing configmaps are detected
- Test that reconciliation behavior matches existing RBAC self-healing (checks RoleBinding existence)
- Verify that namespace label remains accurate after configmap recreation
- Test that self-healing works after upgrade, manual deletion, and other scenarios
Problem Context
Currently, the operator uses the label operator.tekton.dev/namespace-trusted-ca-config: "X.XX.X" to track which namespaces have CA bundles configured. Once this label matches the current operator version, the operator skips CA bundle reconciliation for that namespace for as long as the label and version stay in sync.
If the configmaps are subsequently deleted (manually, or by external processes), the operator never recreates them because it only checks the label, not the actual existence of the configmaps.
This differs from RBAC reconciliation, which includes self-healing checks:
// RBAC has self-healing (rbac.go:312-320)
if ns.Labels[namespaceVersionLabel] == r.version {
    // Even if label matches, verify RoleBinding exists
    _, err := r.kubeClientSet.RbacV1().RoleBindings(ns.Name).Get(...)
    if errors.IsNotFound(err) {
        needsRBAC = true // Re-reconcile if missing!
    }
}
CA bundles lack this verification (rbac.go:332-336):
// NO self-healing check
if ns.Labels[namespaceTrustedConfigLabel] != r.version {
    result.CANamespaces = append(result.CANamespaces, ns)
}
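The gap can be reproduced in isolation. The following is a minimal sketch, not the operator's code: the namespace struct, the labelOnlySelect function, the namespace names, and the version string "1.20.0" are all hypothetical stand-ins chosen for illustration. It models a label-only selection and shows that a namespace whose label is current is skipped even though it contains no configmaps.

```go
package main

import "fmt"

// Label key taken from the issue text; everything else here is a
// hypothetical stand-in, not the operator's real types.
const trustedConfigLabel = "operator.tekton.dev/namespace-trusted-ca-config"

type namespace struct {
	Name       string
	Labels     map[string]string
	ConfigMaps map[string]bool // configmaps that actually exist in the namespace
}

// labelOnlySelect models the current behaviour: a namespace is selected for
// CA bundle reconciliation only when its label differs from the operator
// version. ConfigMaps is never consulted.
func labelOnlySelect(nss []namespace, version string) []string {
	var selected []string
	for _, ns := range nss {
		if ns.Labels[trustedConfigLabel] != version {
			selected = append(selected, ns.Name)
		}
	}
	return selected
}

func main() {
	nss := []namespace{
		// Label is current, but both configmaps are gone: skipped anyway.
		{Name: "team-a", Labels: map[string]string{trustedConfigLabel: "1.20.0"}, ConfigMaps: map[string]bool{}},
		// No label yet: selected as expected.
		{Name: "team-b", Labels: map[string]string{}, ConfigMaps: map[string]bool{}},
	}
	fmt.Println(labelOnlySelect(nss, "1.20.0")) // prints [team-b]
}
```

In this model team-a is never selected again until its label changes, which matches the behaviour described above.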
Proposed Implementation
Add a self-healing check, similar to the RBAC one, in getNamespacesToBeReconciled():
needsCABundle := false
if ns.Labels[namespaceTrustedConfigLabel] != r.version {
    needsCABundle = true
} else {
    // Self-healing: verify configmaps exist even when label matches
    _, err1 := r.kubeClientSet.CoreV1().ConfigMaps(ns.Name).Get(ctx, trustedCAConfigMapName, metav1.GetOptions{})
    _, err2 := r.kubeClientSet.CoreV1().ConfigMaps(ns.Name).Get(ctx, serviceCAConfigMapName, metav1.GetOptions{})
    if errors.IsNotFound(err1) || errors.IsNotFound(err2) {
        // Warn level, per the acceptance criteria
        logger.Warnf("CA bundle configmaps missing in namespace %s, will reconcile", ns.Name)
        needsCABundle = true
    } else if err1 != nil || err2 != nil {
        // Wrap whichever lookup failed
        err := err1
        if err == nil {
            err = err2
        }
        return nil, fmt.Errorf("error checking configmaps in namespace %s: %w", ns.Name, err)
    }
}
if needsCABundle {
    logger.Debugf("Adding namespace for CA bundle reconciliation: %s", ns.GetName())
    result.CANamespaces = append(result.CANamespaces, ns)
}
Customer Impact
A customer reported missing configmaps after upgrading from 1.19 to 1.20. Must-gather analysis shows:
- Operator logs "No namespaces need reconciliation" 4,687 times over 42.5 hours
- ZERO namespace processing activity in 65,778 log lines
- All namespaces have label set but configmaps missing
- No errors or warnings indicating the problem
Current workaround: Remove namespace label to force reconciliation:
oc label namespace [namespace-name] operator.tekton.dev/namespace-trusted-ca-config-
Files to Modify
- pkg/reconciler/openshift/tektonconfig/rbac.go - Add self-healing in getNamespacesToBeReconciled() method (lines 332-336)