OpenShift Logging / LOG-4562

Complete: Collector to act as http server


      Goals

      This epic is cloned from LOG-3965, which was a preview version of this feature with limited functionality. The goal of this epic is to complete the feature for more use cases.

      The collector can be configured to listen for HTTP connections and receive logs as an HTTP server, also referred to as a "webhook".

      Logs from inbound connections can be normalized, filtered, and forwarded using existing features of the collector.
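      To illustrate the receiver idea (this is a sketch, not the collector's real code; the handler name, port, and in-memory list are all illustrative), a minimal HTTP server that accepts POSTed log bodies:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class LogReceiver(BaseHTTPRequestHandler):
    """Sketch of a webhook-style log receiver: accept a POSTed body of
    log data and hand it downstream (here it is just kept in a list)."""
    received = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        LogReceiver.received.append(body)  # normalize/filter/forward here
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

# To run the sketch: HTTPServer(("", 8080), LogReceiver).serve_forever()
```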

      Framing

      User-specified "framing" indicates how an HTTP body should be separated into log records. Options:

      • raw: newline-delimited plain text log records (but see formats below)
      • json: body is a JSON array of records, which may be objects or strings. Objects are treated as records as-is; strings are JSON-unquoted and treated like raw text lines.
      • json.path.to.field: body is a JSON object containing a field identified by "path.to.field" which is an array of objects corresponding to log records. For example 'json.items' means the body is a JSON object with a top-level field named "items", which is an array of log records.
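      The three framing options above can be sketched as follows (a hypothetical helper for illustration; the real collector is configured, not hand-coded like this):

```python
import json

def split_records(body: str, framing: str) -> list:
    """Split an HTTP body into log records per the user-specified framing."""
    if framing == "raw":
        # newline-delimited plain text records
        return [line for line in body.split("\n") if line]
    if framing == "json":
        # body is a JSON array; json.loads unquotes strings, and
        # objects pass through as records as-is
        return json.loads(body)
    if framing.startswith("json."):
        # body is a JSON object; walk "path.to.field" to the record array
        obj = json.loads(body)
        for field in framing[len("json."):].split("."):
            obj = obj[field]
        return obj
    raise ValueError(f"unknown framing: {framing}")
```

      For example, with framing "json.items" the body '{"items": [{"a": 1}, {"a": 2}]}' yields the two records {"a": 1} and {"a": 2}.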

      Formats

      User-specified "format" indicates how to treat each "log record" in the body.

      • kubeAPIAudit: JSON-serialized "k8s.io/apiserver/pkg/apis/audit/v1".EventList
      • viaq: JSON-serialized Viaq object, forwarded from another cluster logging instance
      • text: text log line (for "raw" framing) or JSON string containing a text line (JSON framing)
        • This is the full log text as the application wrote it, not the CRIO format line.
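      Some example request bodies for framing/format combinations described above (illustrative only; the audit fields shown are a minimal subset of audit/v1):

```python
import json

# framing=raw, format=text: newline-delimited plain lines
raw_body = "app started\nrequest handled\n"

# framing=json, format=text: a JSON array of strings
json_body = json.dumps(["app started", "request handled"])

# framing=json.items, format=kubeAPIAudit: an EventList object whose
# top-level "items" field is the array of audit-event log records
audit_body = json.dumps({
    "kind": "EventList",
    "apiVersion": "audit.k8s.io/v1",
    "items": [{"level": "Metadata", "verb": "get"}],
})
```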

      OTEL is a possible future format, not required for first release.

      NOTE: "crio" is deliberately not a format. The assumption is that the logs must be scraped/collected by some "upstream" agent, and that agent would deal with the CRIO format before forwarding to us. CRIO is an awkward intermediary between the original log files and normalized container logs; it doesn't seem helpful to expose it further.

      Log type

      Some formats imply or include a log_type; in that case we use that log_type for the normalized log:

      • kubeAPIAudit implies "audit"
      • viaq includes a log_type

      Other formats don't have this, so we introduce a new log type "external" for those logs. Effectively these are treated like "node logs" from an unknown external source.
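      The log_type rules above amount to a small mapping (hypothetical helper; the name and record fields are illustrative):

```python
def assign_log_type(record, fmt):
    """Choose the normalized log_type for one record, per its format."""
    if fmt == "kubeAPIAudit":
        return "audit"              # the format implies the type
    if fmt == "viaq":
        return record["log_type"]   # the type travels with the record
    return "external"               # unknown external source
```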

      NOTES:

      • Review existing code carefully and fix assumptions broken by the new external type.
      • On balance it seems safer to introduce a new type than to change the interpretation of existing types to accommodate external logs that don't fit the existing patterns.
      • There is deliberately no flexibility to annotate or modify incoming logs, to keep the HTTP receiver simple. There is a new "pipeline.filter" concept in the CLF where we can add such features so they can be re-used across different receivers and inputs.

      Content-type

      We deliberately do not address the content-type of incoming HTTP requests. There is a relationship between content-type, framing, and format, but it is not a simple 1-1 relationship, so we can't automatically guess framing and format from the content-type. Furthermore, the user may be dealing with HTTP clients built into 3rd-party tools, and can't control content-type easily.

      Therefore we ignore content-type and require the user to specify framing and format exactly for us.

      Non Goals

      • Not a universal HTTP server. The aim is to be flexible enough for mainstream HTTP logging use-cases without complex configuration.
      • No transformation of the incoming logs; that will be deferred to future pipeline.filter features.

      Motivation

      Collect logs from the Kube-APIserver in a hypershift configuration, where the log files are inaccessible, so the webhook logging option must be used.

      Feeds into a more general requirement for server-side features, and begins an extensible "input" framework much like the existing "output" framework.

      Alternatives

      None good.

      Acceptance Criteria

      Implement the features set out in Goals:

      • framing: raw, json, json.path.to.field
      • formats: kubeAPIAudit, viaq, text
      • log_type: pass through and new external type

      Note: OTEL format may become a requirement. Assess OTEL plans when this epic is in development.

      Risks and Assumptions

      Acting as a server is a new concept for the collector.

      Documentation Considerations

      Documentation for input types will be similar in effort to documenting output types.

       

            Assignee: Unassigned
            Alan Conway (rhn-engineering-aconway)
            Anping Li