- Bug
- Resolution: Done
- Major
- ACM 2.15.0
- Quality / Stability / Reliability
- Important

Description of problem:
When multiple workers are used, metrics may be lost. This appears to be caused by a race condition in which different workers federate the same metrics from Prometheus.
Version-Release number of selected component (if applicable):
ACM 2.12 through ACM 2.15
How reproducible:
- Most of the time
 
Steps to Reproduce:
- Set up an ACM hub and spoke
 - Scale up to more than 1 worker in the observabilityAddonSpec in the MCO CR
 - You may need to restart metric-collector a few times to hit the bad state
 - You can view count({cluster="your-cluster-name"}) to confirm the issue. The value should be very stable and should not change when the metric collector is restarted
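
The check in the last step can be scripted. The following is a minimal sketch, assuming the values of the count query have already been collected by some means (for example, one reading after each metric-collector restart). The helper names `count_query` and `is_stable`, the 1% tolerance, and the sample readings are hypothetical and not part of ACM.

```python
def count_query(cluster):
    """Build the PromQL query from the reproduction steps for one cluster."""
    return 'count({cluster="%s"})' % cluster


def is_stable(samples, tolerance=0.01):
    """Return True if all readings stay within `tolerance` (relative) of the largest one.

    `samples` holds one reading of the count query per metric-collector restart.
    The default tolerance of 1% is an assumption, chosen only for illustration.
    """
    if not samples:
        raise ValueError("need at least one reading")
    lowest, highest = min(samples), max(samples)
    if highest == 0:
        return True
    return (highest - lowest) / highest <= tolerance


# Hypothetical readings, not taken from a real cluster.
healthy = [1000, 1002, 999]
affected = [1000, 760, 1001]

print(count_query("your-cluster-name"))
print(is_stable(healthy), is_stable(affected))
```

A large drop between restarts, as in the second list, would be consistent with the issue described here, although other causes are possible.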
 
Actual results:
- Metrics might be missing
 - The easiest way to confirm this issue is to check whether multiple workers are sending the exact same number of timeseries, as in the log below (this also seems to always happen on startup):
 
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.278299233Z shard=3 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.318888053Z shard=0 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.320064215Z shard=1 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.367873111Z shard=2 component=forwarder component=metricsclient timeseriesnumber=15300
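
The symptom above (several shards reporting an identical timeseriesnumber) can be spotted automatically. The sketch below parses the debug lines and lists every count that more than one shard reported. It assumes the input covers a single forwarding cycle and that the `shard=` and `timeseriesnumber=` fields look like those in the log above. The function names are hypothetical, not part of metric-collector.

```python
import re
from collections import defaultdict

# Captures the shard id and the series count from one debug log entry.
LINE_RE = re.compile(r"shard=(\d+)\b.*?\btimeseriesnumber=(\d+)")


def shards_by_count(log_text):
    """Group shard ids by the number of time series each one reported."""
    groups = defaultdict(set)
    for shard, count in LINE_RE.findall(log_text):
        groups[int(count)].add(int(shard))
    return groups


def suspicious_groups(log_text):
    """Return {count: sorted shard ids} for counts shared by more than one shard."""
    return {
        count: sorted(shards)
        for count, shards in shards_by_count(log_text).items()
        if len(shards) > 1
    }


# The four log entries from this report.
sample = """\
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.278299233Z shard=3 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.318888053Z shard=0 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.320064215Z shard=1 component=forwarder component=metricsclient timeseriesnumber=13730
level=debug caller=logger.go:45 ts=2025-10-03T13:45:13.367873111Z shard=2 component=forwarder component=metricsclient timeseriesnumber=15300
"""

print(suspicious_groups(sample))  # {13730: [0, 1, 3]}
```

An empty result does not prove that no metrics were lost. It only means that no two shards reported the same count in the lines that were checked.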
Expected results:
- No metrics are lost. The workers produce a different number of timeseries.
 
Additional info:
- is cloned by
 - ACM-24921 [2.14] Using multiple metric-collector workers may cause loss of metrics due to race condition (Closed)
 - ACM-24922 [2.13] Using multiple metric-collector workers may cause loss of metrics due to race condition (Closed)
 - ACM-24923 [2.12] Using multiple metric-collector workers may cause loss of metrics due to race condition (Closed)