Bug
Resolution: Won't Do
4.13.z, 4.14.0
Quality / Stability / Reliability
Moderate
Rejected
Description of problem:
The InsightsDisabled, SimpleContentAccessNotAvailable, and InsightsRecommendationActive alerts are defined in the insights-prometheus-rules PrometheusRule in the openshift-insights namespace.
NOTE: The InsightsDisabled and SimpleContentAccessNotAvailable alerts have a "namespace: openshift-insights" label in the PrometheusRule; the InsightsRecommendationActive alert does not.
$ oc -n openshift-insights get prometheusrules insights-prometheus-rules -oyaml
...
spec:
  groups:
  - name: insights
    rules:
    - alert: InsightsDisabled
      annotations:
        description: 'Insights operator is disabled. In order to enable Insights and benefit from recommendations specific to your cluster, please follow steps listed in the documentation: https://docs.openshift.com/container-platform/latest/support/remote_health_monitoring/enabling-remote-health-reporting.html'
        summary: Insights operator is disabled.
      expr: max without (job, pod, service, instance) (cluster_operator_conditions{name="insights", condition="Disabled"} == 1)
      for: 5m
      labels:
        namespace: openshift-insights
        severity: info
    - alert: SimpleContentAccessNotAvailable
      annotations:
        description: Simple content access (SCA) is not enabled. Once enabled, Insights Operator can automatically import the SCA certificates from Red Hat OpenShift Cluster Manager making it easier to use the content provided by your Red Hat subscriptions when creating container images. See https://docs.openshift.com/container-platform/latest/cicd/builds/running-entitled-builds.html for more information.
        summary: Simple content access certificates are not available.
      expr: ' max without (job, pod, service, instance) (max_over_time(cluster_operator_conditions{name="insights", condition="SCAAvailable", reason="NotFound"}[5m]) == 0)'
      for: 5m
      labels:
        namespace: openshift-insights
        severity: info
    - alert: InsightsRecommendationActive
      annotations:
        description: Insights recommendation "{{ $labels.description }}" with total risk "{{ $labels.total_risk }}" was detected on the cluster. More information is available at {{ $labels.info_link }}.
        summary: An Insights recommendation is active for this cluster.
      expr: insights_recommendation_active == 1
      for: 5m
      labels:
        severity: info
On a default cluster with no PVs attached for Prometheus, the InsightsRecommendationActive alert is triggered. However, in the developer console under Observe -> Alerts, the InsightsRecommendationActive alert is not listed; only the InsightsDisabled and SimpleContentAccessNotAvailable alerts are shown.
$ token=`oc create token prometheus-k8s -n openshift-monitoring`
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/query?' --data-urlencode 'query=ALERTS{alertname="InsightsRecommendationActive"}' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "InsightsRecommendationActive",
          "alertstate": "firing",
          "container": "insights-operator",
          "description": "Prometheus metrics data will be lost when the Prometheus pod is restarted or recreated",
          "endpoint": "https",
          "info_link": "https://console.redhat.com/openshift/insights/advisor/clusters/fe975ccd-5e51-429b-9d65-718a09495195?first=ccx_rules_ocp.external.rules.empty_prometheus_db_volume|PROMETHEUS_DB_VOLUME_IS_EMPTY",
          "instance": "10.129.0.16:8443",
          "job": "metrics",
          "namespace": "openshift-insights",
          "pod": "insights-operator-766bfb4974-fllpl",
          "prometheus": "openshift-monitoring/k8s",
          "service": "metrics",
          "severity": "info",
          "total_risk": "Low"
        },
        "value": [
          1693554627.507,
          "1"
        ]
      }
    ]
  }
}
The thanos-querier rules API also shows that the InsightsRecommendationActive alert is firing. Note that for the InsightsRecommendationActive alert, the "namespace": "openshift-insights" label appears in the rules.alerts.labels section, not in the rules.labels section.
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -k -H "Authorization: Bearer $token" 'https://thanos-querier.openshift-monitoring.svc:9091/api/v1/rules' | jq
...
{
  "name": "insights",
  "file": "/etc/prometheus/rules/prometheus-k8s-rulefiles-0/openshift-insights-insights-prometheus-rules-64f80bc9-c9e8-49c5-ade2-cb63133d1555.yaml",
  "rules": [
    {
      "state": "inactive",
      "name": "InsightsDisabled",
      "query": "max without (job, pod, service, instance) (cluster_operator_conditions{condition=\"Disabled\",name=\"insights\"} == 1)",
      "duration": 300,
      "labels": {
        "namespace": "openshift-insights",
        "prometheus": "openshift-monitoring/k8s",
        "severity": "info"
      },
      "annotations": {
        "description": "Insights operator is disabled. In order to enable Insights and benefit from recommendations specific to your cluster, please follow steps listed in the documentation: https://docs.openshift.com/container-platform/latest/support/remote_health_monitoring/enabling-remote-health-reporting.html",
        "summary": "Insights operator is disabled."
      },
      "alerts": [],
      "health": "ok",
      "evaluationTime": 0.000247544,
      "lastEvaluation": "2023-09-01T07:56:56.694516549Z",
      "type": "alerting"
    },
    {
      "state": "firing",
      "name": "InsightsRecommendationActive",
      "query": "insights_recommendation_active == 1",
      "duration": 300,
      "labels": {
        "prometheus": "openshift-monitoring/k8s",
        "severity": "info"
      },
      "annotations": {
        "description": "Insights recommendation \"{{ $labels.description }}\" with total risk \"{{ $labels.total_risk }}\" was detected on the cluster. More information is available at {{ $labels.info_link }}.",
        "summary": "An Insights recommendation is active for this cluster."
      },
      "alerts": [
        {
          "labels": {
            "alertname": "InsightsRecommendationActive",
            "container": "insights-operator",
            "description": "Prometheus metrics data will be lost when the Prometheus pod is restarted or recreated",
            "endpoint": "https",
            "info_link": "https://console.redhat.com/openshift/insights/advisor/clusters/fe975ccd-5e51-429b-9d65-718a09495195?first=ccx_rules_ocp.external.rules.empty_prometheus_db_volume|PROMETHEUS_DB_VOLUME_IS_EMPTY",
            "instance": "10.129.0.16:8443",
            "job": "metrics",
            "namespace": "openshift-insights",
            "pod": "insights-operator-766bfb4974-fllpl",
            "service": "metrics",
            "severity": "info",
            "total_risk": "Low"
          },
          "annotations": {
            "description": "Insights recommendation \"Prometheus metrics data will be lost when the Prometheus pod is restarted or recreated\" with total risk \"Low\" was detected on the cluster. More information is available at https://console.redhat.com/openshift/insights/advisor/clusters/fe975ccd-5e51-429b-9d65-718a09495195?first=ccx_rules_ocp.external.rules.empty_prometheus_db_volume|PROMETHEUS_DB_VOLUME_IS_EMPTY.",
            "summary": "An Insights recommendation is active for this cluster."
          },
          "state": "firing",
          "activeAt": "2023-09-01T07:12:56.693852957Z",
          "value": "1e+00",
          "partialResponseStrategy": "WARN"
        }
      ],
      "health": "ok",
      "evaluationTime": 0.000428525,
      "lastEvaluation": "2023-09-01T07:56:56.694899015Z",
      "type": "alerting"
    },
    {
      "state": "inactive",
      "name": "SimpleContentAccessNotAvailable",
      "query": "max without (job, pod, service, instance) (max_over_time(cluster_operator_conditions{condition=\"SCAAvailable\",name=\"insights\",reason=\"NotFound\"}[5m]) == 0)",
      "duration": 300,
      "labels": {
        "namespace": "openshift-insights",
        "prometheus": "openshift-monitoring/k8s",
        "severity": "info"
      },
      "annotations": {
        "description": "Simple content access (SCA) is not enabled. Once enabled, Insights Operator can automatically import the SCA certificates from Red Hat OpenShift Cluster Manager making it easier to use the content provided by your Red Hat subscriptions when creating container images. See https://docs.openshift.com/container-platform/latest/cicd/builds/running-entitled-builds.html for more information.",
        "summary": "Simple content access certificates are not available."
      },
      "alerts": [],
      "health": "ok",
      "evaluationTime": 0.00012941,
      "lastEvaluation": "2023-09-01T07:56:56.694768041Z",
      "type": "alerting"
    }
  ],
  "interval": 30,
  "evaluationTime": 0.000823825,
  "lastEvaluation": "2023-09-01T07:56:56.694507342Z",
  "limit": 0,
  "partialResponseStrategy": "ABORT"
},
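To make the label placement concrete, here is a minimal Python sketch that walks a trimmed copy of the "insights" group from the rules API output above and reports, for each rule, whether the namespace label is set on the rule itself or only on its active alerts. The helper name namespace_label_location and the trimmed group dict are illustrative assumptions, and the sketch does not claim to reproduce how the dev console filters alerts. It only shows that a consumer matching on rule-level labels would find no namespace for InsightsRecommendationActive.

```python
# Trimmed copy of the "insights" rule group from the /api/v1/rules output;
# only the fields needed for the comparison are kept.
group = {
    "name": "insights",
    "rules": [
        {
            "name": "InsightsDisabled",
            "state": "inactive",
            "labels": {"namespace": "openshift-insights", "severity": "info"},
            "alerts": [],
        },
        {
            "name": "InsightsRecommendationActive",
            "state": "firing",
            # No "namespace" key at the rule level.
            "labels": {"prometheus": "openshift-monitoring/k8s", "severity": "info"},
            "alerts": [
                {
                    "labels": {
                        "alertname": "InsightsRecommendationActive",
                        "namespace": "openshift-insights",
                        "severity": "info",
                    },
                    "state": "firing",
                }
            ],
        },
        {
            "name": "SimpleContentAccessNotAvailable",
            "state": "inactive",
            "labels": {"namespace": "openshift-insights", "severity": "info"},
            "alerts": [],
        },
    ],
}


def namespace_label_location(rule):
    """Hypothetical helper: say where the 'namespace' label lives for one rule."""
    if "namespace" in rule.get("labels", {}):
        return "rule"
    if any("namespace" in alert.get("labels", {}) for alert in rule.get("alerts", [])):
        return "alerts-only"
    return "missing"


locations = {rule["name"]: namespace_label_location(rule) for rule in group["rules"]}
for name, where in locations.items():
    print(f"{name}: {where}")
```

With the trimmed data above, the two alerts that appear in the dev console are the ones whose namespace label is reported at the rule level, while InsightsRecommendationActive carries it only on the firing alert instance.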
See the dev console screenshot: https://drive.google.com/file/d/1CH8ihLYfJjxU3jbBsbZQWSGjRF5Wz48i/view?usp=sharing
This issue also exists in 4.13, for example in 4.13.0-0.nightly-2023-08-31-163330.
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-08-28-154013
How reproducible:
Always
Steps to Reproduce:
1. Check the openshift-insights alerts in the dev console under Observe -> Alerts.
Actual results:
The firing alert is not shown in the dev console under Observe -> Alerts; only the 2 non-firing alerts are listed.
Expected results:
The firing alert is shown in the dev console under Observe -> Alerts.
Additional info: