• Epic
    • Resolution: Done-Errata
    • Critical
    • Logging 5.9.0
    • None
    • Log Collection
    • None
    • Log Collector tuning
    • False
    • None
    • False
    • Green
    • NEW
    • Administer, API, Deploy, Instructions, Migrate
    • To Do
    • OBSDA-549 - Reliability and performance tuning for log collection
    • NEW
    • 0% To Do, 0% In Progress, 100% Done
    • This feature introduces the capability to tune some output settings (e.g. compression, retry duration, max payloads) to match the characteristics of the receiver. Additionally, this feature adds a delivery mode that lets administrators choose between throughput and log durability. For example, AtLeastOnce configures minimal disk buffering of collected logs so those logs can be delivered after collector restarts.
    • Feature

      Goals

      Enhance the ClusterLogForwarder API to:

      • Allow tuning of individual outputs to support the unique characteristics of the receiver
      • Reduce the possibility of log loss when the collector restarts
      • Define a simple way to choose between throughput and durability of logs
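
To make the goals concrete, here is a minimal sketch of two outputs tuned in opposite directions. The tuning field names (`delivery`, `compression`, `maxWrite`, `minRetryDuration`, `maxRetryDuration`) come from the CRD schema quoted in the comments on this issue. Everything else is an assumption made for illustration: the `build_output` helper, the output names, the `type` and `url` values, and the placement of the block under a `tuning` key. The schema does not state the unit of the retry durations, and it notes that a compression type not supported by the output is an error, so the values below are placeholders rather than recommendations.

```python
import json


def build_output(name, url, tuning):
    """Return a hypothetical output entry that carries a tuning block."""
    # "type" and "url" are illustrative assumptions, not taken from this issue.
    return {"name": name, "type": "http", "url": url, "tuning": tuning}


# Durability-leaning profile: logs read but not sent are re-sent after a restart.
durable = build_output(
    "audit-receiver",
    "https://receiver.example.com/logs",
    {
        "delivery": "AtLeastOnce",
        "compression": "gzip",
        "maxWrite": "1M",          # cap on the payload of a single send
        "minRetryDuration": 1,     # unit not stated in the schema
        "maxRetryDuration": 30,
    },
)

# Throughput-leaning profile: no effort to recover logs lost during a crash.
fast = build_output(
    "app-receiver",
    "https://receiver.example.com/logs",
    {"delivery": "AtMostOnce", "maxWrite": "10M"},
)

spec = {"outputs": [durable, fast]}
print(json.dumps(spec, indent=2))
```

The point of the sketch is that each output carries its own tuning block, so one forwarder can favor durability for one receiver and throughput for another.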

      Non-Goals

      • Exposing the entirety of output configuration options for the underlying collector implementation

      Motivation

      • Some customers require collected log messages to survive a collector restart to support their regulatory mandates
      • Some customers use outputs (e.g. CloudWatch) that have hard limits on the size of the batches they can receive

      Alternatives

      Acceptance Criteria

      • Verify there is no regression in log throughput and durability when the log forwarder does not specify any tuning
      • Verify collected log messages are not lost when the output for a log forwarder is optimized for log durability
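
The second criterion can be illustrated with a toy model. The sketch below is not how the collector is implemented. It assumes a persisted checkpoint that advances only after a successful send, a batch size of two, and a single crash. `run_collector` and its parameters are hypothetical names.

```python
def run_collector(lines, crash_after, delivery, batch_size=2):
    """Toy model: read lines, crash once, restart, return what was delivered."""
    delivered = []
    checkpoint = 0   # persisted: index of the first line not yet sent
    position = 0     # in-memory read position
    batch = []       # in-memory lines read but not yet sent
    crashed = False
    while position < len(lines):
        batch.append(lines[position])
        position += 1
        if not crashed and position == crash_after:
            crashed = True
            batch = []  # the in-memory batch is lost in the crash
            if delivery == "AtLeastOnce":
                position = checkpoint  # re-read everything not yet sent
            continue
        if len(batch) == batch_size or position == len(lines):
            delivered.extend(batch)  # successful send
            batch = []
            checkpoint = position    # advance only after the send


    return delivered


logs = ["a", "b", "c", "d", "e"]
print(run_collector(logs, 3, "AtLeastOnce"))
print(run_collector(logs, 3, "AtMostOnce"))
```

In this model the AtLeastOnce run delivers every line, while the AtMostOnce run drops the line that was read but not sent when the crash happened. A real at-least-once collector can also deliver duplicates (for example, if it crashes after a send but before the checkpoint is persisted), which this simplified model does not exercise.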

      Risk and Assumptions

      Documentation Considerations

      • API documentation to support the added fields
      • Usage documentation to explain the feature

      Open Questions

      • How do end-to-end (e2e) acknowledgements compare to disk buffering?

      Additional Notes

            [LOG-5026] Log Collector Output Tuning

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Logging for Red Hat OpenShift - 5.9.0), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHBA-2024:1591

            Anping Li added a comment -

            jcantril@redhat.com rhn-engineering-aconway Can you update the enhancement doc (https://github.com/openshift/enhancements/blob/a530ea42ef1df755fa5f74e8301ffec6f2e2a231/enhancements/cluster-logging/performance-tuning.md)? It provides different options from the CRD:

            {
              "description": "Tuning parameters for the output.  Specifying these parameters will alter the characteristics of log forwarder which may be different from its behavior without the tuning.",
              "properties": {
                "compression": {
                  "description": "Compression causes data to be compressed before sending over the network. It is an error if the compression type is not supported by the  output.",
                  "enum": [
                    "''",
                    "gzip",
                    "none",
                    "snappy",
                    "zlib",
                    "zstd"
                  ],
                  "type": "string"
                },
                "delivery": {
                  "description": "Delivery mode for log forwarding. \n - AtLeastOnce (default): if the forwarder crashes or is re-started, any logs that were read before the crash but not sent to their destination will be re-read and re-sent. Note it is possible that some logs are duplicated in the event of a crash - log records are delivered at-least-once. - AtMostOnce: The forwarder makes no effort to recover logs lost during a crash. This mode may give better throughput, but could result in more log loss.",
                  "enum": [
                    "AtLeastOnce",
                    "AtMostOnce"
                  ],
                  "type": "string"
                },
                "maxRetryDuration": {
                  "description": "MaxRetryDuration is the maximum time to wait between retry attempts after a delivery failure.",
                  "format": "int64",
                  "type": "integer"
                },
                "maxWrite": {
                  "anyOf": [
                    {
                      "type": "integer"
                    },
                    {
                      "type": "string"
                    }
                  ],
                  "description": "MaxWrite limits the maximum payload in terms of bytes of a single \"send\" to the output.",
                  "pattern": "^(\\+|-)?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\\+|-)?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))))?$",
                  "x-kubernetes-int-or-string": true
                },
                "minRetryDuration": {
                  "description": "MinRetryDuration is the minimum time to wait between attempts to retry after delivery a failure.",
                  "format": "int64",
                  "type": "integer"
                }
              },
              "type": "object"
            }
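
As a rough illustration of what the schema above accepts, here is a minimal validation sketch that uses only the Python standard library. The enum values and the `maxWrite` pattern are copied from the schema. `validate_tuning` is a hypothetical helper and not part of any operator or CLI. Treating the `''` enum entry as the empty string is an assumption.

```python
import re

# Values copied from the schema; '' is assumed to mean the empty string.
COMPRESSION = {"", "gzip", "none", "snappy", "zlib", "zstd"}
DELIVERY = {"AtLeastOnce", "AtMostOnce"}

# The maxWrite pattern from the schema, split across two lines for readability.
QUANTITY = re.compile(
    r"^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))"
    r"(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$"
)


def is_int(value):
    """True for integers, excluding bool (a subclass of int in Python)."""
    return isinstance(value, int) and not isinstance(value, bool)


def validate_tuning(tuning):
    """Return a list of problems found in a tuning dict (empty if none)."""
    errors = []
    if "compression" in tuning and tuning["compression"] not in COMPRESSION:
        errors.append(f"unsupported compression: {tuning['compression']!r}")
    if "delivery" in tuning and tuning["delivery"] not in DELIVERY:
        errors.append(f"unsupported delivery: {tuning['delivery']!r}")
    for field in ("minRetryDuration", "maxRetryDuration"):
        if field in tuning and not is_int(tuning[field]):
            errors.append(f"{field} must be an integer")
    if "maxWrite" in tuning:
        value = tuning["maxWrite"]
        if isinstance(value, str):
            if not QUANTITY.fullmatch(value):
                errors.append(f"maxWrite is not a valid quantity: {value!r}")
        elif not is_int(value):
            errors.append("maxWrite must be an integer or a string")
    return errors


print(validate_tuning({"delivery": "AtLeastOnce", "maxWrite": "10M"}))
print(validate_tuning({"delivery": "ExactlyOnce", "maxWrite": "10MB"}))
```

This checks only the shape described by the schema. Whether a particular output supports a given compression type is decided elsewhere and is not modeled here.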
            


              jcantril@redhat.com Jeffrey Cantrill
              jcantril@redhat.com Jeffrey Cantrill
              Anping Li Anping Li