Uploaded image for project: 'OpenShift Logging'
  1. OpenShift Logging
  2. LOG-1702

Entry out of order when forward logs to loki

    XMLWordPrintable

Details

    • False
    • False
    • NEW
    • OBSDA-7 - Adopting Loki as an alternative to Elasticsearch to support more lightweight, easier to manage/operate storage scenarios
    • NEW
    • Hide
      Before this update, some log records from separate sources were combined into a single Loki stream, causing "Bad Request ... entry out of order" errors and log loss. With this update, logs records from separate sources are never sent to the same stream, which resolves the issue.
      Show
      Before this update, some log records from separate sources were combined into a single Loki stream, causing "Bad Request ... entry out of order" errors and log loss. With this update, logs records from separate sources are never sent to the same stream, which resolves the issue.
    • Logging (Core) - Sprint 206

    Description

      Some records was not forward to Loki. huge amounts of messages in flushed log

      failed to POST http://loki-server.loki-aosqe.svc:3100/loki/api/v1/push (400 Bad Request entry with timestamp 2021-08-18 05:41:10.373584 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread="flush_thread_0", log_type="audit"},

      level=warn ts=2021-08-18T09:25:17.228731133Z caller=grpc_logging.go:38 duration=64.564µs method=/logproto.Pusher/Push err="rpc error: code = Code(400) desc = entry with timestamp 2021-08-18 09:25:14.681496 +0000 UTC ignored, reason: 'entry out of order' for stream:
      {fluentd_thread=\"flush_thread_0\", log_type=\"infrastructure\"},\nentry with timestamp 2021-08-18 09:25:14.682497 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="infrastructure"}


      Step to reproduce:
      1. Deploy loki single process

      oc new-project loki-aosqe
      oc project loki-aosqe
      oc create sa loki-sa
      oc adm policy add-scc-to-user privileged system:serviceaccount:loki-aosqe:loki-sa

      wget https://raw.githubusercontent.com/grafana/loki/v2.3.0/cmd/loki/loki-local-config.yaml -O local-config.yaml
      oc create cm loki-config --from-file=loki-config.yaml=loki-config.yaml
      
      

       echo "kind: Deployment
      apiVersion: apps/v1
      metadata:
        name:  loki-server
        labels:
          provider: aosqe
          component: "loki"
          appname: loki-server
      spec:
        replicas: 1
        revisionHistoryLimit: 10
        selector:
          matchLabels:
            provider: aosqe
            component: "loki"
        strategy:
          type: Recreate
        template:
          metadata:
            labels:
              provider: aosqe
              component: "loki"
              appname: loki-server
          spec:
            serviceAccount: loki-sa
            serviceAccountName: loki-sa
            containers:
            - name: "loki"
              image: docker.io/grafana/loki:2.3.0
              imagePullPolicy: IfNotPresent
              securityContext:
                privileged: true
                procMount: Default
              ports:
              - containerPort: 3100
                name: tcp
                protocol: TCP
              volumeMounts:
              - mountPath: /etc/loki
                name: lokiconfig
                readOnly: true
            volumes:
            - configMap:
                defaultMode: 420
                name: loki-config
              name: lokiconfig
      ">loki-deployment.yaml
      


      oc create -f loki-deployment.yaml
      oc expose deployment loki-server

      2. forward audit log to loki
      echo "apiVersion: logging.openshift.io/v1
      kind: ClusterLogForwarder
      metadata:
        name: instance
        namespace: openshift-logging
      spec:
        outputs:
        outputs:
        - name: loki-server
          type: loki
          url: http://loki-server.loki-aosqe.svc:3100
        pipelines:
          - name: to-loki
            inputRefs:
            - application
            outputRefs:
            - loki-server">loki-clf.yaml
      oc create -f loki-clf.yaml
      


      3. Check the fluentd loki pod logs.


      Actual result:
      1) Logs of warn messages in fluentd pod. Please refer to the attached logs

      2) Loki report ntry out of order' for stream:
      
      

      level=warn ts=2021-08-18T06:06:51.553183641Z caller=grpc_logging.go:38 method=/logproto.Pusher/Push duration=90.326µs err="rpc error: code = Code(400) desc = entry with timestamp 2021-08-18 05:58:55.061936 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="audit"}

      ,\nentry with timestamp 2021-08-18 05:58:55.061936 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread=\"flush_thread_0\", log_type=\"audit\"},\nentry with timestamp 2021-08-18 06:01:18.290229 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="audit"}

      ,\nentry with timestamp 2021-08-18 06:01:18.290229 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread=\"flush_thread_0\", log_type=\"audit\"},\nentry with timestamp 2021-08-18 06:01:26.772308 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="audit"}

      ,\nentry with timestamp 2021-08-18 06:01:26.772308 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread=\"flush_thread_0\", log_type=\"audit\"},\nentry with timestamp 2021-08-18 06:06:41.045394 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="audit"}

      ,\nentry with timestamp 2021-08-18 06:06:41.045394 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread=\"flush_thread_0\", log_type=\"audit\"},\nentry with timestamp 2021-08-18 06:06:41.069429 +0000 UTC ignored, reason: 'entry out of order' for stream: {fluentd_thread="flush_thread_0", log_type="audit"}

      ,\nentry with timestamp 2021-08-18 06:06:41.069429 +0000 UTC ignored, reason: 'entry out of order' for stream:

      {fluentd_thread=\"flush_thread_0\", log_type=\"audit\"}

      ,\ntotal ignored: 14 out of 38" msg="gRPC\n"

      
      

       

      Expected result:
      The audit logs can be sent to Loki

      Attachments

        Activity

          People

            jcantril@redhat.com Jeffrey Cantrill
            rhn-support-anli Anping Li
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: