  1. OpenShift Logging
  2. LOG-1073

Implement flow control by backpressure


    • Epic
    • Resolution: Unresolved
    • Log Collection
    • Flow control backpressure
    • NEW
    • To Do

      Goals

      As a cluster admin, I want to selectively enforce rate limits by applying back-pressure to slow down applications rather than dropping log data. I want to apply back-pressure to selected containers only (by namespace, labels, etc.).

      LOG-884 introduced rate limits that are enforced by dropping data. This epic will introduce the notion of flow control policy with values drop and block. The flow control APIs will be extended (backwards compatible) to allow policy selection at the same granularity as rate limits. The default policy will remain drop.
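
      As a rough illustration of policy selection with a default of drop, the sketch below resolves a policy for a container from an ordered list of rules. The names FlowControlRule and resolve_policy, the field names, and the first-match rule order are all hypothetical; none of them come from the actual flow control API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Policy(Enum):
    DROP = "drop"    # discard records over the limit (the LOG-884 behaviour)
    BLOCK = "block"  # back-pressure the container instead of discarding


@dataclass
class FlowControlRule:
    """Hypothetical rule: a policy applied to containers matching a selector."""
    policy: Policy
    namespace: Optional[str] = None                    # None matches any namespace
    labels: Dict[str, str] = field(default_factory=dict)

    def matches(self, namespace: str, labels: Dict[str, str]) -> bool:
        if self.namespace is not None and self.namespace != namespace:
            return False
        # Every label in the selector must be present with the same value.
        return all(labels.get(k) == v for k, v in self.labels.items())


def resolve_policy(rules: List[FlowControlRule], namespace: str,
                   labels: Dict[str, str]) -> Policy:
    """Return the policy of the first matching rule, or DROP if none match."""
    for rule in rules:
        if rule.matches(namespace, labels):
            return rule.policy
    return Policy.DROP  # the default stays "drop", so existing configs behave as before


# Example: only high-value logs in one (made-up) namespace are blocked.
rules = [FlowControlRule(Policy.BLOCK, namespace="payments",
                         labels={"logs": "high-value"})]
```

      Because unmatched containers fall through to drop, a configuration that never mentions a policy behaves as it did before the policy field existed.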

      Non-Goals

      • Block policies will not be allowed on critical system logs (kubelet, infrastructure containers, etc.), where blocking could damage OpenShift's overall operation. "Critical" still needs to be defined.
      • End-to-end at-least-once delivery cannot be guaranteed unless the forwarding protocol also supports it. Blocking ensures less data loss but may not ensure zero data loss for all output types or configurations.

      Motivation

      LOG-884 defines rate limits for containers that are enforced by dropping log data if the containers exceed their limits.

      This epic provides a second alternative: enforce rate limits by applying back-pressure, slowing containers down by blocking their writes to stdout/stderr. This may be preferable for applications that produce high-value logging data.

      Note that this is a selective alternative to dropping data; it is not expected that all containers in any given cluster will want logging back-pressure. In particular, system containers and processes should not be back-pressured.

      Alternatives

      Rate limits enforced by dropping data are implemented by LOG-884 and continue to be the default. Blocking flow control is targeted at specific use cases where log output is high-value and slowing down containers is acceptable.

      Acceptance Criteria

      • Per-container rates are enforced by blocking container progress; data is not dropped even if containers try to log faster than their limit.
      • Policies (block, drop) and rates are applied correctly to containers selected by namespace, label or other supported mechanism.
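
      The difference between the two policies at the rate limiter can be sketched with a token bucket. This is only an illustration under assumptions: the epic does not prescribe an algorithm, and RateLimiter, admit, and the injectable clock and sleep parameters are hypothetical names chosen to keep the sketch testable.

```python
import time


class RateLimiter:
    """Token bucket allowing `rate` records/second with bursts up to `burst`."""

    def __init__(self, rate, burst, policy="drop",
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.burst = float(burst)
        self.policy = policy            # "drop" or "block"
        self.clock = clock
        self.sleep = sleep
        self.tokens = float(burst)      # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at burst.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def admit(self):
        """Return True if the record may be forwarded, False if it is dropped."""
        self._refill()
        while self.tokens < 1.0:
            if self.policy == "drop":
                return False
            # block: stall the caller just long enough for one token to accumulate
            self.sleep((1.0 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1.0
        return True
```

      With policy="block" every call to admit eventually returns True and the caller is slowed to the configured rate; with policy="drop" calls beyond the available tokens return False immediately.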

      Risk and Assumptions

      Requires major changes to CRI-O to provide a feature similar to Docker "log drivers":

      • Send logs by socket, pipe, FIFO or other blocking transport, instead of (or as well as) writing logs to file.
      • Find alternate way to communicate log meta-data that is currently part of the log file name (for example named pipes)
      • Back-pressure containers fairly if logs are not being read quickly enough
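
      The back-pressure that a pipe provides can be observed directly. The following is a minimal sketch, assuming a POSIX system; it does not reflect how CRI-O handles logs today, and fill_pipe and drain are hypothetical helpers. A kernel pipe has a bounded buffer, so a writer can never get ahead of the reader by more than the pipe capacity. The write end is made non-blocking here only so that the "full" condition shows up as an exception instead of hanging the example; a container writing to an ordinary blocking pipe would simply stall at that point.

```python
import os


def fill_pipe(write_fd, chunk=b"x" * 4096):
    """Write chunks until the pipe buffer is full; return the bytes accepted."""
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            # Buffer is full: a blocking writer would be suspended here.
            return total


def drain(read_fd, nbytes):
    """Read exactly nbytes from the pipe, standing in for a log collector."""
    remaining = nbytes
    while remaining > 0:
        remaining -= len(os.read(read_fd, min(remaining, 65536)))


read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)

buffered = fill_pipe(write_fd)    # the writer gets ahead by at most the pipe capacity
drain(read_fd, buffered)          # the reader catches up
accepted = os.write(write_fd, b"next log line\n")  # the writer can proceed again
```

      The amount buffered before the writer stalls depends on the platform's pipe capacity, which is one reason fairness between containers has to be handled by whatever reads the pipes and not by the pipes themselves.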

      Some investigation and prototyping have already been done (see LOG-613).

      Documentation Considerations

      None special

      Open Questions

      Will the block policy still be in demand once we have predictable drop-based rate limiting and improved overall throughput?

              Unassigned Unassigned
              rhn-engineering-aconway Alan Conway