OpenShift Logging / LOG-4785

Logs are not forwarded to default Loki using fluentd as collector unless buffer is manually cleaned up


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Undefined
    • Affects Version/s: Logging 5.6.z
    • Component/s: Log Collection
    • Status: NEW
    • Doc Type: Bug Fix

      Description of problem:
      In some conditions (LokiStack sized 1x.extra-small, many 429 responses in the collector pod logs), logs cannot be forwarded to the default Loki when fluentd is used as the collector.

      One needs to clean up the buffer manually under the collector pods before audit logs are sent:

      oc exec collector-mxs92 -- du -sh /var/lib/fluentd/default_loki_audit
      Defaulted container "collector" out of: collector, logfilesmetricexporter
      0 /var/lib/fluentd/default_loki_audit
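      The manual workaround can be scripted. The sketch below only prints the `oc` commands (a dry run) rather than executing them, so nothing is deleted by accident; the pod names are the two stuck pods from this report, and clearing the buffer discards any logs still queued in it:

```shell
# Pods observed with a stuck audit buffer in this report (assumption: adjust for your cluster)
PODS="collector-67j2q collector-mt97h"
BUF=/var/lib/fluentd/default_loki_audit

for pod in $PODS; do
  # Inspect the buffer size first...
  echo "oc exec $pod -c collector -- du -sh $BUF"
  # ...then clear it (destructive: buffered-but-unsent logs are lost)
  echo "oc exec $pod -c collector -- sh -c 'rm -f $BUF/*'"
done
```

      Pipe the printed lines to `sh` to actually run them once you have confirmed the pod list.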
      

      Version: Logging 5.6.13

      How reproducible: Always

      Steps to Reproduce:
      1) Deploy the Cluster Logging Operator (CLO) and Loki Operator (LO)
      2) Forward logs to the default log store, Loki (size: 1x.extra-small)
      3) Create a ClusterLogForwarder (CLF) to forward audit logs
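      For step 3, a minimal ClusterLogForwarder that routes audit logs to the default LokiStack output looks roughly like this. The sketch only prints the manifest (pipe it to `oc apply -f -` to use it); the pipeline name is illustrative, and the exact spec should be checked against the attached fluentd-conf.txt:

```shell
# Store the CLF manifest in a variable and print it; "default" is the managed log store output.
CLF_YAML=$(cat <<'EOF'
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  pipelines:
    - name: audit-to-default
      inputRefs:
        - audit
      outputRefs:
        - default
EOF
)
echo "$CLF_YAML"
```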

      Actual results: Audit logs are not sent to Loki unless the buffer is manually cleaned up on the collector

      Expected results: Audit logs should be sent to the default Loki

      Additional info:

      $oc get pod -o wide
      NAME                READY   STATUS    RESTARTS   AGE   IP              NODE                                        
      collector-67j2q      2/2     Running   0          34m   10.130.0.214   ip-10-0-215-221.us-east-2.compute.internal   
      collector-cfctt      2/2     Running   0          34m   10.129.2.223   ip-10-0-160-21.us-east-2.compute.internal   
      collector-fj95w      2/2     Running   0          34m   10.128.3.9     ip-10-0-151-35.us-east-2.compute.internal   
      collector-mt97h      2/2     Running   0          34m   10.129.1.129   ip-10-0-148-248.us-east-2.compute.internal   
      collector-mxs92      2/2     Running   0          27m   10.128.0.234   ip-10-0-181-95.us-east-2.compute.internal       
      collector-p2qxf      2/2     Running   0          34m   10.131.0.201   ip-10-0-214-122.us-east-2.compute.internal   
      # After the collector-mxs92 pod was restarted, it could send logs out. The other pods still could not send logs in the following 16 minutes (43m - 27m)
      logging-view-plugin-84bdd658db-shh9t       1/1     Running   0    51m   10.128.3.8     ip-10-0-151-35.us-east-2.compute.internal   
      lokistack-dev-compactor-0                  1/1     Running   0    29m   10.129.2.225   ip-10-0-160-21.us-east-2.compute.internal   
      lokistack-dev-distributor-b465647d9-mtlhb  1/1     Running   0    29m   10.129.2.224   ip-10-0-160-21.us-east-2.compute.internal   
      lokistack-dev-gateway-5d787897bb-dmwhb     2/2     Running   0    52m   10.131.0.196   ip-10-0-214-122.us-east-2.compute.internal   
      lokistack-dev-gateway-5d787897bb-gs2bq     2/2     Running   0    52m   10.128.3.5     ip-10-0-151-35.us-east-2.compute.internal   
      lokistack-dev-index-gateway-0              1/1     Running   0    29m   10.131.0.202   ip-10-0-214-122.us-east-2.compute.internal   
      lokistack-dev-ingester-0                   1/1     Running   0    29m   10.128.3.12    ip-10-0-151-35.us-east-2.compute.internal   
      lokistack-dev-querier-78f668d7bb-x5t4q     1/1     Running   0    29m   10.128.3.11    ip-10-0-151-35.us-east-2.compute.internal   
      lokistack-dev-query-frontend-6c445c6657-ggtlv  1/1   Running 0    29m   10.128.3.10    ip-10-0-151-35.us-east-2.compute.internal   
       
      $oc exec collector-67j2q -- du -sh /var/lib/fluentd/default_loki_audit
      325M    /var/lib/fluentd/default_loki_audit
       
      $oc exec collector-cfctt -- du -sh /var/lib/fluentd/default_loki_audit
      0    /var/lib/fluentd/default_loki_audit
       
      $oc exec collector-fj95w -- du -sh /var/lib/fluentd/default_loki_audit
      0    /var/lib/fluentd/default_loki_audit
      
      $oc exec collector-mt97h -- du -sh /var/lib/fluentd/default_loki_audit
      282M    /var/lib/fluentd/default_loki_audit
       
      $oc exec collector-mxs92 -- du -sh /var/lib/fluentd/default_loki_audit     # This pod could send logs after it was restarted
      0    /var/lib/fluentd/default_loki_audit
       
      $oc exec collector-p2qxf -- du -sh /var/lib/fluentd/default_loki_audit
      0    /var/lib/fluentd/default_loki_audit
       
      # LokiStack only received logs from the restarted pod (collector-mxs92)
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-181-95.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275988440740074&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-181-95.us-east-2.compute.internal%22%7D&start=1699272388440740074
      Common labels: {fluentd_thread="flush_thread_0", kubernetes_host="ip-10-0-181-95.us-east-2.compute.internal", log_type="audit"}
      {
        "kind": "Event",
        "apiVersion": "audit.k8s.io/v1",
        "level": "Metadata",
        "auditID": "b7f3edcc-e0dc-4854-83c0-a19e3be205a3",
        "stage": "ResponseComplete",
        "requestURI": "/apis/helm.openshift.io/v1beta1/namespaces/openshift-logging/projecthelmchartrepositories",
        "verb": "list",
        "user": {
          "username": "kube:admin",
          "groups": [
            "system:cluster-admins",
            "system:authenticated"
          ],
          "extra": {
            "scopes.authorization.openshift.io": [
              "user:full"
            ]
          }
        },
        "sourceIPs": [
          "10.0.215.221"
        ],
        "objectRef": {
          "resource": "projecthelmchartrepositories",
          "namespace": "openshift-logging",
          "apiGroup": "helm.openshift.io",
          "apiVersion": "v1beta1"
        },
        "responseStatus": {
          "metadata": {},
          "code": 200
        },
        "requestReceivedTimestamp": "2023-11-06T13:06:27.842855Z",
        "stageTimestamp": "2023-11-06T13:06:27.844478Z",
        "annotations": {
          "authorization.k8s.io/decision": "allow",
          "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cluster-admins\" of ClusterRole \"cluster-admin\" to Group \"system:cluster-admins\""
        },
        "@timestamp": "2023-11-06T13:06:27.842855Z",
        "k8s_audit_level": "Metadata",
        "message": null,
        "hostname": "ip-10-0-181-95.us-east-2.compute.internal",
        "pipeline_metadata": {
          "collector": {
            "ipaddr4": "10.0.181.95",
            "inputname": "fluent-plugin-systemd",
            "name": "fluentd",
            "received_at": "2023-11-06T13:06:27.844869+00:00",
            "version": "1.14.6 1.6.0"
          }
        },
        "openshift": {
          "sequence": 4806,
          "cluster_id": "9e15fe36-affa-46e9-86ab-e37e284888de"
        },
        "viaq_msg_id": "ZGEwMTdmMTEtNTNhMS00MDY5LTg1NWMtMWU1N2Q0MDA2MzM3",
        "log_type": "audit"
      }
       
      # The other collector pods stopped sending logs to LokiStack; the queries below return no results
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-148-248.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275988570655878&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-148-248.us-east-2.compute.internal%22%7D&start=1699272388570655878
       
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-151-35.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275988690677694&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-151-35.us-east-2.compute.internal%22%7D&start=1699272388690677694
       
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-160-21.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275988816143055&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-160-21.us-east-2.compute.internal%22%7D&start=1699272388816143055
       
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-214-122.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275988919164583&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-214-122.us-east-2.compute.internal%22%7D&start=1699272388919164583
       
      $logcli -o raw --bearer-token=sha256~g5F8MHLSC2SY3q33Es06IIPQ7JX6E0BTo3AH78MRWKA --tls-skip-verify --addr=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit query --limit=1 '{kubernetes_host="ip-10-0-215-221.us-east-2.compute.internal"}'
      https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1699275989023872526&limit=1&query=%7Bkubernetes_host%3D%22ip-10-0-215-221.us-east-2.compute.internal%22%7D&start=1699272389023872526
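      The per-host checks above can be condensed into a loop. The sketch below prints one logcli command per node (a dry run; pipe to `sh` to execute) with `$TOKEN` standing in for the bearer token; the node names and gateway route are taken from this report:

```shell
# Gateway route for the audit tenant, as observed in this report
LOKI_ADDR=https://lokistack-dev-openshift-logging.apps.kbharti-1106a.qe.devcluster.openshift.com/api/logs/v1/audit
# Short node names queried above; the AWS zone suffix is appended below
NODES="ip-10-0-148-248 ip-10-0-151-35 ip-10-0-160-21 ip-10-0-181-95 ip-10-0-214-122 ip-10-0-215-221"

for node in $NODES; do
  host="$node.us-east-2.compute.internal"
  # Print the query command; \$TOKEN is left unexpanded for the operator to fill in
  echo "logcli -o raw --bearer-token=\$TOKEN --tls-skip-verify --addr=$LOKI_ADDR query --limit=1 '{kubernetes_host=\"$host\"}'"
done
```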
      

       

        Attachments:
        1. fluentd-conf.txt (18 kB, Kabir Bharti)
        2. collector-logs.txt (28 kB, Kabir Bharti)

              Assignee: Unassigned
              Reporter: rhn-support-kbharti (Kabir Bharti)