Description of problem:
Loki logging alerts are not working for log type audit. This can be observed by creating an alerting rule with an 'always firing' logging alert with tenantID: audit. This 'always firing' alert evaluates the expression:
sum(rate({ log_type="audit" } |= "authorization.k8s.io/decision" |= "allow" [15s] )) > 0.01
- The alert does not appear in the Administrator console under Observe > Alerting > Alerts
- The alerting rule does not appear in the Administrator console under Observe > Alerting > Alerting Rules
- The alert cannot be retrieved from the Loki ruler API via the CLI, for example:
~~~
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
{
  "status": "success",
  "data": {
    "alerts": []
  },
  "errorType": "",
  "error": ""
}
~~~
Tested while logged in as the kubeadmin user.
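The same query can also be pointed at the audit tenant. This is a minimal sketch, assuming the LokiStack gateway exposes the ruler's Prometheus-compatible alerts endpoint per tenant under /api/logs/v1/<tenant>/; only the application path above was captured in the original test:
~~~
# Minimal sketch: query the audit tenant directly; the /api/logs/v1/audit/... path
# is assumed to mirror the application tenant path shown above
$ export LOKI_PUBLIC_URL=$(oc get route logging-loki -o jsonpath="{.spec.host}")
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" \
    "https://${LOKI_PUBLIC_URL}/api/logs/v1/audit/prometheus/api/v1/alerts" | jq
~~~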
Version-Release number of selected component (if applicable):
OCP 4.17.18, Loki 6.2, Logging 6.2
How reproducible:
Easily Reproducible
Steps to Reproduce:
1. Install the operators: Loki 6.2, Logging 6.2, COO 1.1.1.
2. Following the steps outlined in this article: https://access.redhat.com/articles/7105362, with some modifications: create a serviceaccount, grant collect and write permissions to this serviceaccount, and additionally grant collect and write permissions for audit logs:
~~~
$ oc adm policy add-cluster-role-to-user cluster-logging-write-audit-logs system:serviceaccount:openshift-logging:collector
$ oc adm policy add-cluster-role-to-user collect-audit-logs system:serviceaccount:openshift-logging:collector
~~~
3. The ClusterLogForwarder requires the addition of audit to the pipelines:
~~~
pipelines:
  - name: default-logstore
    inputRefs:
      - application
      - infrastructure
      - audit
    outputRefs:
      - default-lokistack
~~~
4. Ensure that the label openshift.io/log-alerting: 'true' has been added to the openshift-logging namespace.
5. Create an AlertingRule with an 'always firing' audit logs alert:
~~~
apiVersion: loki.grafana.com/v1
kind: AlertingRule
metadata:
  name: test-audit-alert
  namespace: openshift-logging
  labels:
    openshift.io/log-alerting: 'true'
spec:
  groups:
    - interval: 1m
      name: TestAuditalert
      rules:
        - alert: TestAuditHighAllowRate
          annotations:
            description: testing1,2
            summary: testing1,2
          expr: >
            sum(rate({ log_type="audit" } |= "authorization.k8s.io/decision" |= "allow" [15s] )) > 0.01
          for: 1m
          labels:
            severity: critical
  tenantID: audit
~~~
6. For comparison, the always-firing test app from https://access.redhat.com/articles/7105362 was deployed, and its alerts can be viewed in the console.
7. Checking the logs for the Loki ruler pods:
~~~
$ oc logs -n openshift-logging logging-loki-ruler-0
~~~
indicates that the alerting rule is being evaluated and executed:
~~~
level=info ts=2025-06-04T10:01:25.716361696Z caller=compat.go:67 user=audit rule_name=TestAuditHighAllowRate rule_type=alerting query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 msg="evaluating rule"
level=info ts=2025-06-04T10:01:25.716847811Z caller=engine.go:263 component=ruler evaluation_mode=local org_id=audit msg="executing query" query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 type=instant
level=info msg="request timings" insight=true source=loki_ruler rule_name=TestAuditHighAllowRate rule_type=alerting total=0.033403566 total_bytes=2728132 query_hash=2447605017
level=info ts=2025-06-04T10:01:25.750318662Z caller=metrics.go:237 component=ruler evaluation_mode=local org_id=audit latency=fast query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 query_type=metric range_type=instant length=0s start_delta=34.974241ms end_delta=34.974371ms step=0s duration=33.403566ms status=200 limit=0 returned_lines=0 throughput=82MB total_bytes=2.7MB total_bytes_structured_metadata=13kB lines_per_second=44037 total_lines=1471 post_filter_lines=1429 total_entries=1 store_chunks_download_time=0s queue_time=0s splits=0 shards=0 query_referenced_structured_metadata=false pipeline_wrapper_filtered_lines=0 chunk_refs_fetch_time=33.036859ms cache_chunk_req=0 cache_chunk_hit=0 cache_chunk_bytes_stored=0 cache_chunk_bytes_fetched=0 cache_chunk_download_time=0s cache_index_req=0 cache_index_hit=0 cache_index_download_time=0s cache_stats_results_req=0 cache_stats_results_hit=0 cache_stats_results_download_time=0s cache_volume_results_req=0 cache_volume_results_hit=0 cache_volume_results_download_time=0s cache_result_req=0 cache_result_hit=0 cache_result_download_time=0s cache_result_query_length_served=0s cardinality_estimate=0 ingester_chunk_refs=0 ingester_chunk_downloaded=0 ingester_chunk_matches=3 ingester_requests=2 ingester_chunk_head_bytes=325kB ingester_chunk_compressed_bytes=393kB ingester_chunk_decompressed_bytes=2.4MB ingester_post_filter_lines=1429 congestion_control_latency=0s index_total_chunks=0 index_post_bloom_filter_chunks=0 index_bloom_filter_ratio=0.00 index_used_bloom_filters=false index_shard_resolver_duration=0s disable_pipeline_wrappers=false has_labelfilter_before_parser=false
~~~
Pulling the alert from the Loki ruler API does not return the alert (a per-tenant check of the loaded rules is sketched after these steps):
~~~
$ export LOKI_PUBLIC_URL=$(oc get route logging-loki -o jsonpath="{.spec.host}")
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
~~~
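To check whether the ruler has actually loaded the rule group for the audit tenant (as opposed to only evaluating it, per the logs above), the ruler's rules listing can be queried per tenant. This is a minimal sketch; the per-tenant /rules path is an assumption, mirroring the /alerts path used above:
~~~
# Minimal sketch: list the rule groups the ruler has loaded for the audit tenant
# (the per-tenant /rules path is an assumption, mirroring the /alerts path above)
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" \
    "https://${LOKI_PUBLIC_URL}/api/logs/v1/audit/prometheus/api/v1/rules" | jq
~~~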
Actual results:
- The alert does not appear in the Administrator console (or in the Developer view) under Observe > Alerting > Alerts
- The alerting rule does not appear in the Administrator console under Observe > Alerting > Alerting Rules
- The alert cannot be retrieved from the Loki ruler API via the CLI (for example, when there are no other firing alerts, the response is empty):
~~~
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
{
  "status": "success",
  "data": {
    "alerts": []
  },
  "errorType": "",
  "error": ""
}
~~~
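A further cross-check is whether the ruler delivered the alert to the cluster Alertmanager at all. This is a minimal sketch, assuming the default alertmanager-main route in openshift-monitoring and the Alertmanager v2 API:
~~~
# Minimal sketch: list alert names currently held by the cluster Alertmanager
# (route name alertmanager-main and the /api/v2/alerts path are assumed defaults)
$ AM_URL=$(oc -n openshift-monitoring get route alertmanager-main -o jsonpath='{.spec.host}')
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" \
    "https://${AM_URL}/api/v2/alerts" | jq '.[].labels.alertname'
~~~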
Expected results:
- The TestAuditHighAllowRate alert appears in the Administrator console under Observe > Alerting > Alerts
- The test-audit-alert alerting rule appears in the Administrator console under Observe > Alerting > Alerting Rules
- The firing alert is returned by the Loki ruler API
Additional info: