OpenShift Logging / LOG-5998

Collector is ignoring audit logs larger than max_line_bytes

    • Before this change, the collector discarded audit log messages that exceeded the configured threshold. This change modifies the audit configuration thresholds for the maximum line size and for the number of bytes read during a read cycle.
    • Bug Fix
    • Log Collection - Sprint 261
    • Moderate

      Description of problem:

      The Vector collector logs warnings for lines that it refuses to read because they exceed `max_line_bytes`:

      WARN file_source::buffer: Internal log [Found line that exceeds max_line_bytes; discarding.] is being suppressed to avoid flooding.
      WARN file_source::buffer: Internal log [Found line that exceeds max_line_bytes; discarding.] has been suppressed 1 times.
      WARN file_source::buffer: Found line that exceeds max_line_bytes; discarding. internal_log_rate_limit=true

      It would be expected that:

      1. The warning logged by Vector indicates the source file that contains the line exceeding `max_line_bytes`.
      2. Ideally, an alert highlights that lines are being discarded.

      Version-Release number of selected component (if applicable):

      Vector collector
      All Logging versions from 5 onward in which Vector is GA

      How reproducible:

      Always

      Steps to Reproduce:

      1. Enable audit logs with the AllRequestBodies profile as detailed in https://docs.openshift.com/container-platform/4.17/security/audit-log-policy-config.html#about-audit-log-profiles_audit-log-policy-config
      2. Configure the collector to collect the audit logs
      3. Review the collector logs for warnings that mention max_line_bytes
      4. Download the audit logs to inspect, for example those from openshift-apiserver
      5. Run a script like the one below against openshift-apiserver.log, redirecting the output to a file. That file will contain the lines exceeding the max_line_bytes limit of 102400
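      $ cat /tmp/find_lines.sh
      #!/bin/bash
      # Print every line of the audit log that is longer than 102400 bytes.
      file="/tmp/openshift-apiserver.log"
      # IFS= and -r keep leading whitespace and backslashes in each line intact.
      while IFS= read -r line
      do
              # wc -c counts bytes, which is the unit of max_line_bytes.
              a=$(printf '%s' "$line" | wc -c)
              if [ "$a" -gt 102400 ]; then
                printf '%s\n' "$line"
              fi
      done <"$file"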
      
      $ /bin/bash /tmp/find_lines.sh > /tmp/openshift-apiserver_lines_exceeding
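
The loop above spawns a `wc` process for every line, which can be slow on large audit logs. As a rough alternative sketch (the function name `find_long_lines` is made up for illustration and is not part of any OpenShift tooling), a single `awk` pass can report the line number and byte length of each oversized line instead of printing the line itself:

```shell
#!/bin/bash
# Hypothetical helper: print "<line number> <byte length>" for every line
# of FILE that is longer than MAX_BYTES (default 102400).
# Usage: find_long_lines FILE [MAX_BYTES]
find_long_lines() {
  local file="$1" max="${2:-102400}"
  # LC_ALL=C makes awk's length() count bytes rather than multibyte characters.
  LC_ALL=C awk -v max="$max" \
    'length($0) > max { printf "%d %d\n", NR, length($0) }' "$file"
}
```

For example, `find_long_lines /tmp/openshift-apiserver.log` lists the offending line numbers, which can then be inspected individually with a tool such as `sed -n '<N>p'`.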

       Actual results:

      The warning from Vector is shown below. It is not possible to identify the source file from which the lines are discarded. The only visible detail is the log_type:

      $ for pod in $(oc get pods -l component=collector -o name); do oc logs $pod -c collector ; done |grep -c max_line_bytes
      608
       $ for pod in $(oc get pods -l component=collector -o name); do oc logs $pod -c collector ; done |grep  max_line_bytes |head -1
      2024-10-22T20:27:39.508209Z  WARN file_source::buffer: Found line that exceeds max_line_bytes; discarding. internal_log_rate_limit=true
      
      // An entire error log
      2024-10-22T20:40:20.402407Z  WARN sink{component_kind="sink" component_id=output_default_loki_audit component_type=loki}: vector::sinks::util::retries: Internal log [Retrying after error.] is being suppressed to avoid flooding.
      2024-10-22T20:40:30.096911Z  WARN file_source::buffer: Internal log [Found line that exceeds max_line_bytes; discarding.] has been suppressed 273 times.
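
Because the warning is rate limited, `grep -c` counts log entries rather than discarded lines. A minimal sketch for a rough estimate follows, assuming each suppressed occurrence corresponds to one discarded line (an assumption, not something this report confirms); the function name `count_suppressed_discards` is hypothetical. It sums the "has been suppressed N times" counters for this specific warning from collector logs read on standard input:

```shell
#!/bin/bash
# Hypothetical helper: sum the suppression counters that Vector prints for
# the max_line_bytes warning. Reads collector logs on stdin.
count_suppressed_discards() {
  grep 'Found line that exceeds max_line_bytes' \
    | sed -n 's/.*has been suppressed \([0-9][0-9]*\) times.*/\1/p' \
    | awk '{ total += $1 } END { print total + 0 }'
}
```

It can be fed from the same loop used above, for example `for pod in $(oc get pods -l component=collector -o name); do oc logs $pod -c collector; done | count_suppressed_discards`. The result excludes the warnings that were actually printed, so it is best read as a lower bound.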
      

      // Size in bytes of the first and last lines exceeding max_line_bytes

      $ head -1 /tmp/lines |wc -c
      248090
      $ tail  -1 /tmp/lines |wc -c
      1139398
      

      // A couple of the oversized lines come from listing `images` and `imagestreams`

      {"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"3fa54595-ce18-4323-ac32-2c2c236673ea","stage":"ResponseComplete","requestURI":"/apis/image.openshift.io/v1/images?limit=500\u0026resourceVersion=0","verb":"list","user":{"username":"system:apiserver","uid":"fa915fa4-1481-4d47-8b66-b174cbb89fcb","groups"...
      
      {"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"RequestResponse","auditID":"b54ed4cd-56a7-4021-a521-1d6ea0010047","stage":"ResponseComplete","requestURI":"/apis/image.openshift.io/v1/imagestreams?limit=500\u0026resourceVersion=0","verb":"list","user":{"username":"system:apiserver","uid":"fa915fa4-1481-4d47-8b66-b174cbb89fcb","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"openshift-apiserver/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"imagestreams","apiGroup":"image.openshift.io","apiVersion":"v1"},"responseStat...
      

      Expected results:

      It should be possible to identify the source file from which the discarded log lines come.

      An alert warning that Vector is discarding lines because they exceed `max_line_bytes` would also be worthwhile.
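
Until the collector reports the source file itself, one possible workaround is to scan candidate audit log files directly. The sketch below is only an illustration: the function name `summarize_long_lines` is hypothetical, and the files passed to it are whatever audit logs have been copied locally. For each file that contains at least one oversized line, it prints the file name, the number of such lines, and the largest line size in bytes:

```shell
#!/bin/bash
# Hypothetical helper: per-file summary of lines longer than MAX_BYTES.
# Usage: summarize_long_lines MAX_BYTES FILE...
# Output: "<file> <count of oversized lines> <largest line in bytes>"
summarize_long_lines() {
  local max="$1"
  shift
  # LC_ALL=C makes awk's length() count bytes rather than multibyte characters.
  LC_ALL=C awk -v max="$max" '
    length($0) > max {
      count[FILENAME]++
      if (length($0) > biggest[FILENAME]) biggest[FILENAME] = length($0)
    }
    END { for (f in count) printf "%s %d %d\n", f, count[f], biggest[f] }
  ' "$@"
}
```

For instance, `summarize_long_lines 102400 /tmp/openshift-apiserver.log` would show whether that file is a source of discarded lines. Files with no oversized lines produce no output.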

      Additional info:

            jcantril@redhat.com Jeffrey Cantrill
            rhn-support-ocasalsa Oscar Casal Sanchez