-
Bug
-
Resolution: Done
-
Major
-
None
-
4.16.z
-
Quality / Stability / Reliability
-
False
-
-
None
-
None
-
None
-
None
-
None
-
None
-
Hypershift Sprint 259, Hypershift Sprint 260, Hypershift Sprint 263
-
3
-
None
-
None
-
None
-
None
-
None
-
None
-
None
- Are we aware of a misfiring alert of 'NoRunningOvnControlPlane' due to no ServiceMonitor for ovn on the hosted cluster?
Aka we're missing a ServiceMonitor for OVN
oc get servicemonitor -n clusters-yonah NAME AGE catalog-operator 6d23h cluster-version-operator 6d23h etcd 6d23h kube-apiserver 6d23h kube-controller-manager 6d23h monitor-multus-admission-controller 6d23h monitor-ovn-control-plane-metrics 6d23h node-tuning-operator 6d23h olm-operator 6d23h openshift-apiserver 6d23h openshift-controller-manager 6d23h openshift-route-controller-manager 6d23h
oc get prometheusrule master-rules -o yaml -n clusters-yonah | grep NoRunningOvnControlPlane -A 14 - alert: NoRunningOvnControlPlane annotations: description: | Networking control plane is degraded. Networking configuration updates applied to the cluster will not be implemented while there are no OVN Kubernetes pods. runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoRunningOvnMaster.md summary: There is no running ovn-kubernetes control plane. expr: | absent(up{job="ovnkube-control-plane", namespace="clusters-yonah"} == 1) for: 5m labels: namespace: clusters-yonah severity: critical - alert: NoOvnClusterManagerLeader annotations: iwatson@iwatson-mac ~ % oc get prometheusrule master-rules -o yaml | grep NoRunningOvnControlPlane -A 12 - alert: NoRunningOvnControlPlane annotations: description: | Networking control plane is degraded. Networking configuration updates applied to the cluster will not be implemented while there are no OVN Kubernetes pods. runbook_url: https://github.com/openshift/runbooks/blob/master/alerts/cluster-network-operator/NoRunningOvnMaster.md summary: There is no running ovn-kubernetes control plane. expr: | absent(up{job="ovnkube-control-plane", namespace="clusters-yonah"} == 1) for: 5m labels: namespace: clusters-yonah severity: critical
This is causing the alert NoRunningOvnControlPlane to fire despite the OVN control plane being available.
oc get pods -n clusters-yonah | grep ovn ovnkube-control-plane-58659f5cd9-2sc2m 3/3 Running 1 (16h ago) 6d23h ovnkube-control-plane-58659f5cd9-57q24 3/3 Running 0 6d23h
- duplicates
-
OCPBUGS-27476 NoRunningOvnControlPlane alert fired after user workload monitoring is enabled on hypershift management cluster
-
- ASSIGNED
-