
OU-862: Loki logging alerts disappear in Admin Console



      Description of problem:

   Loki logging alerts are not working for log type audit.
      
   This can be observed when creating an alerting rule with an 'always firing' logging alert with tenantID: audit. This 'always firing' alert evaluates the expression:
      
      sum(rate({ log_type="audit" } |= "authorization.k8s.io/decision" |= "allow" [15s] )) > 0.01 
      
   - The alert does not appear in the Administrator console under
           Observe > Alerting > Alerts
      
         - The alerting rule does not appear in the Administrator console under
           Observe > Alerting > Alerting Rules
         
   - The alert cannot be retrieved from the Loki ruler API via the CLI, for example:
            
~~~
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
{
  "status": "success",
  "data": {
    "alerts": []
  },
  "errorType": "",
  "error": ""
}
~~~
      
         
Tested while logged in as the kubeadmin user.

      Version-Release number of selected component (if applicable):

          OCP 4.17.18, loki 6.2, logging 6.2
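
The installed operator versions can be cross-checked with a standard command, e.g.:

~~~
$ oc get clusterserviceversions -A | grep -iE 'loki|logging|observability'
~~~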

      How reproducible:

          Easily Reproducible

      Steps to Reproduce:

    1. Install the operators: Loki 6.2, Logging 6.2, COO 1.1.1.
    2. Following the steps outlined in https://access.redhat.com/articles/7105362 (with some modifications), create a serviceaccount and grant it collect and write permissions, additionally granting collect and write permissions for audit logs:
      
~~~
# Allow the collector serviceaccount to write audit logs to the LokiStack
$ oc adm policy add-cluster-role-to-user cluster-logging-write-audit-logs system:serviceaccount:openshift-logging:collector
# Allow the collector serviceaccount to collect audit logs
$ oc adm policy add-cluster-role-to-user collect-audit-logs system:serviceaccount:openshift-logging:collector
~~~
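
The resulting bindings can be verified with, for example (the binding names are generated by oc adm policy):

~~~
$ oc get clusterrolebindings -o wide | grep -E 'collect-audit-logs|cluster-logging-write-audit-logs'
~~~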
      
    3. Add the audit input to the ClusterLogForwarder pipelines:
      
      ~~~
      pipelines:
        - name: default-logstore
          inputRefs:
          - application
          - infrastructure
          - audit
          outputRefs:
          - default-lokistack
      ~~~
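
For context, a minimal sketch of where this pipeline fragment sits in a complete ClusterLogForwarder, assuming the observability.openshift.io/v1 API of Logging 6.x, a serviceaccount named collector, and a LokiStack named logging-loki (names taken from this report; adjust to your environment):

~~~
$ oc apply -f - <<'EOF'
apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: collector                    # assumed CR name
  namespace: openshift-logging
spec:
  serviceAccount:
    name: collector                  # serviceaccount from step 2
  outputs:
    - name: default-lokistack
      type: lokiStack
      lokiStack:
        target:
          name: logging-loki         # assumed LokiStack name
          namespace: openshift-logging
        authentication:
          token:
            from: serviceAccount
      tls:
        ca:
          key: service-ca.crt
          configMapName: openshift-service-ca.crt
  pipelines:
    - name: default-logstore
      inputRefs:
        - application
        - infrastructure
        - audit
      outputRefs:
        - default-lokistack
EOF
~~~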
      
    4. Ensure that the label openshift.io/log-alerting: 'true' has been added to the openshift-logging namespace.
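
The label can be applied with, for example:

~~~
$ oc label namespace openshift-logging openshift.io/log-alerting=true
~~~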
      
    5. Create an AlertingRule with an 'always firing' audit logs alert:
      
      ~~~
      apiVersion: loki.grafana.com/v1
      kind: AlertingRule
      metadata:
        name: test-audit-alert
        namespace: openshift-logging
        labels: 
          openshift.io/log-alerting: 'true'
      spec:
        groups:
          - interval: 1m
            name: TestAuditalert
            rules:
              - alert: TestAuditHighAllowRate
                annotations:
                  description: testing1,2
                  summary: testing1,2
                expr: >
                  sum(rate({ log_type="audit" } |= "authorization.k8s.io/decision" |= "allow" [15s] )) > 0.01 
                for: 1m
                labels:
                  severity: critical
        tenantID: audit
      ~~~
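
The rule can be applied and listed with, for example (the filename is illustrative):

~~~
$ oc apply -f test-audit-alert.yaml   # hypothetical filename
$ oc get alertingrules.loki.grafana.com -n openshift-logging
~~~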
      
    6. For comparison, the 'always firing' test app from https://access.redhat.com/articles/7105362 was deployed, and its alerts can be viewed in the console.
      
    7. Check the logs of the Loki ruler pods:
      
      ~~~
      $ oc logs -n openshift-logging logging-loki-ruler-0
      ~~~
          
The output indicates that the alerting rule is being evaluated and executed:
      
      ~~~
      level=info ts=2025-06-04T10:01:25.716361696Z caller=compat.go:67 user=audit rule_name=TestAuditHighAllowRate rule_type=alerting query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 msg="evaluating rule"
      level=info ts=2025-06-04T10:01:25.716847811Z caller=engine.go:263 component=ruler evaluation_mode=local org_id=audit msg="executing query" query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 type=instant
      level=info msg="request timings" insight=true source=loki_ruler rule_name=TestAuditHighAllowRate rule_type=alerting total=0.033403566 total_bytes=2728132 query_hash=2447605017
      level=info ts=2025-06-04T10:01:25.750318662Z caller=metrics.go:237 component=ruler evaluation_mode=local org_id=audit latency=fast query="(sum(rate({log_type=\"audit\"} |= \"authorization.k8s.io/decision\" |= \"allow\"[15s])) > 0.01)" query_hash=2447605017 query_type=metric range_type=instant length=0s start_delta=34.974241ms end_delta=34.974371ms step=0s duration=33.403566ms status=200 limit=0 returned_lines=0 throughput=82MB total_bytes=2.7MB total_bytes_structured_metadata=13kB lines_per_second=44037 total_lines=1471 post_filter_lines=1429 total_entries=1 store_chunks_download_time=0s queue_time=0s splits=0 shards=0 query_referenced_structured_metadata=false pipeline_wrapper_filtered_lines=0 chunk_refs_fetch_time=33.036859ms cache_chunk_req=0 cache_chunk_hit=0 cache_chunk_bytes_stored=0 cache_chunk_bytes_fetched=0 cache_chunk_download_time=0s cache_index_req=0 cache_index_hit=0 cache_index_download_time=0s cache_stats_results_req=0 cache_stats_results_hit=0 cache_stats_results_download_time=0s cache_volume_results_req=0 cache_volume_results_hit=0 cache_volume_results_download_time=0s cache_result_req=0 cache_result_hit=0 cache_result_download_time=0s cache_result_query_length_served=0s cardinality_estimate=0 ingester_chunk_refs=0 ingester_chunk_downloaded=0 ingester_chunk_matches=3 ingester_requests=2 ingester_chunk_head_bytes=325kB ingester_chunk_compressed_bytes=393kB ingester_chunk_decompressed_bytes=2.4MB ingester_post_filter_lines=1429 congestion_control_latency=0s index_total_chunks=0 index_post_bloom_filter_chunks=0 index_bloom_filter_ratio=0.00 index_used_bloom_filters=false index_shard_resolver_duration=0s disable_pipeline_wrappers=false has_labelfilter_before_parser=false
      ~~~
      
Pulling the alert from the Loki ruler API does not return the alert:

~~~
$ export LOKI_PUBLIC_URL=$(oc get route logging-loki -o jsonpath="{.spec.host}")
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
~~~
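
Since the rule is created with tenantID: audit, the ruler can also be cross-checked through the gateway's audit tenant path, assumed here by analogy with the application tenant URL above:

~~~
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/audit/prometheus/api/v1/rules" | jq
~~~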
      
      
      
          

      Actual results:

• The alert does not appear in the Administrator console (or in the Developer view) under
  Observe > Alerting > Alerts
• The alerting rule does not appear in the Administrator console under
  Observe > Alerting > Alerting Rules
• The alert cannot be retrieved from the Loki ruler API via the CLI (for example, when there are no other firing alerts, the response is empty):

~~~
$ curl -s -k -H "Authorization: Bearer $(oc whoami -t)" "https://${LOKI_PUBLIC_URL}/api/logs/v1/application/prometheus/api/v1/alerts" | jq
{
  "status": "success",
  "data": {
    "alerts": []
  },
  "errorType": "",
  "error": ""
}
~~~

          

Expected results:

    The alert appears in the Administrator console under Observe > Alerting > Alerts, the alerting rule appears under Observe > Alerting > Alerting Rules, and the alert can be retrieved from the Loki ruler API.

      Additional info:

          

Assignee: Unassigned
Reporter: Cormac Costello (rhn-support-ccostell)
QA Contact: Anping Li