Bug
Resolution: Unresolved
Minor
Quality / Stability / Reliability
Files:
- fleetshard/pkg/fleetshardmetrics/metrics.go#L104-L112 — SetPauseReconcileStatus adds a time series per instance ID
- fleetshard/pkg/runtime/runtime.go#L293-L306 — deleteStaleReconcilers removes the reconciler from its map but never deletes the corresponding metric label
pauseReconcileInstances is a prometheus.GaugeVec keyed by Central instance ID. A new time series is created whenever a Central is seen for the first time. When Centrals are deleted, deleteStaleReconcilers cleans up the reconciler registry but has no matching call to remove the metric. As a result, the series for every deleted tenant stays in the vector and in memory until the process restarts, so the label set grows with tenant churn.
The pattern for correct cleanup already exists in the same file: CertificatesExpiry has DeleteCertMetric, DeleteCertNamespaceMetric, and DeleteKeyCertMetric using DeletePartialMatch. pauseReconcileInstances is missing the equivalent.
Network amplification: Not directly affected. The leak rate is determined by tenant churn, not network conditions.
Fix: Add a deletion method (e.g. DeletePauseReconcileMetric(instance string)) that calls pauseReconcileInstances.DeletePartialMatch(...), and call it from deleteStaleReconcilers when removing a key.
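The shape of the fix can be sketched with a small stdlib-only model. This is not the fleetshard code and not the Prometheus client API: gaugeVec, fleetRuntime and the instance IDs are hypothetical stand-ins, and the map-backed Delete only plays the role that DeletePartialMatch would play on the real GaugeVec. The point it illustrates is the pairing: the metric series is dropped in the same step that drops the reconciler key.

```go
package main

import "fmt"

// gaugeVec is a hypothetical stand-in for a labelled gauge vector:
// it stores one value (one "time series") per instance ID.
type gaugeVec struct {
	series map[string]float64
}

func newGaugeVec() *gaugeVec {
	return &gaugeVec{series: map[string]float64{}}
}

// Set creates the series for instance on first use, then updates it.
func (g *gaugeVec) Set(instance string, v float64) {
	g.series[instance] = v
}

// Delete removes the series for instance and reports whether one existed.
func (g *gaugeVec) Delete(instance string) bool {
	_, ok := g.series[instance]
	delete(g.series, instance)
	return ok
}

// fleetRuntime is a hypothetical holder for the reconciler registry
// and the per-instance pause-reconcile metric.
type fleetRuntime struct {
	reconcilers    map[string]struct{}
	pauseReconcile *gaugeVec
}

// deleteStaleReconcilers removes every reconciler whose instance ID is not
// in live and, in the same step, removes that instance's metric series.
func (r *fleetRuntime) deleteStaleReconcilers(live map[string]bool) {
	for id := range r.reconcilers {
		if live[id] {
			continue
		}
		delete(r.reconcilers, id) // deleting during range is safe in Go
		r.pauseReconcile.Delete(id)
	}
}

func main() {
	r := &fleetRuntime{
		reconcilers:    map[string]struct{}{},
		pauseReconcile: newGaugeVec(),
	}
	// Two tenants are seen; each gets a reconciler and a series.
	for _, id := range []string{"central-a", "central-b"} {
		r.reconcilers[id] = struct{}{}
		r.pauseReconcile.Set(id, 0)
	}
	// central-b has been deleted, so only central-a is still live.
	r.deleteStaleReconcilers(map[string]bool{"central-a": true})
	fmt.Printf("reconcilers=%d series=%d\n",
		len(r.reconcilers), len(r.pauseReconcile.series))
	// prints "reconcilers=1 series=1"
}
```

If the r.pauseReconcile.Delete(id) line is removed, the model reproduces the reported behaviour: the reconciler count falls to 1 while the series count stays at 2.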