This is a clone of issue
OCPBUGS-1453. The following is the description of the original issue:
Description of problem:
TargetDown alert fired while it shouldn't.
Prometheus endpoints are not always properly unregistered and the alert will therefore think that some Kube service endpoints are down
Version-Release number of selected component (if applicable):
The problem as always been there.
Most of the time Prometheus endpoints are properly unregistered.
Aim here is to get the TargetDown Prometheus expression be more resilient; this can be tested on past metrics data in which the unregistration issue was encountered.
Steps to Reproduce:
TargetDown alert triggered while Kube service endpoints are all up & running.
TargetDown alert should not have been trigerred.