- Epic
- Resolution: Done
- Major
- None
- Deploy Vector Collector as alternate Alpha offering
- 25
- False
- False
- NEW
- To Do
- OBSDA-108 - Distribute an alternate Vector Log Collector
- VERIFIED
- 0% To Do, 0% In Progress, 100% Done
Goals
- Logging with a "preview" Release of Vector as a collector
- Smooth upgrade path from the last GA version of fluentd-based logging to Vector, if specified
- Clear documentation for migrating from existing fluentd-specific API features
- Compatible API extended to handle the Vector collector
- Internal API design docs on how to handle this and future "component replacement" API extensions in a consistent way
Non-Goals
TODO
Motivation
TODO
Alternatives
- fluent-bit
Acceptance Criteria
- Minimal support for output types: Kafka, Elasticsearch
- Certificate authorization support for the implemented output types
- Minimal normalization features: log level (per LOG-1759), log source
- Availability in the "preview" channel
- Ability to switch deployment between fluentd and vector
Stretch Goals
- Support for output types: CloudWatch, Loki
TODO
Documentation Considerations
TODO
Open Questions
Do we support both fluentd and vector as configurable options in a single version?
My gut reaction is "Hell No", but we need to think through the ramifications.
The ideal is that we support automatic, no-touch upgrade, which implies:
- Vector logging supports the full range of API features that are not specifically related to fluentd with equivalent observable behavior.
- Most fluentd-specific "tuning" parameters do not affect correct behavior, only performance under fluentd. They can be safely ignored by vector.
- Any fluentd-specific configuration that does affect correctness (hopefully there is none) is translated via the full k8s 2-way API transformation trick into equivalent vector configuration.
No shortcuts! If we do it right, it will Just Work. If we get it wrong, it will be worse than making the users migrate manually.
I believe this is a reasonable goal, but we need some research and design up front to commit to it.
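The three upgrade rules above can be sketched in code. This is only an illustration under stated assumptions, not the operator's implementation: the key names (`hypothetical_buffer_size`, `hypothetical_delivery_mode`, `hypothetical_vector_delivery_mode`) and the `translate` helper are invented for the example, and the epic does not say which real fluentd settings fall into which category. Unrecognized settings are reported rather than silently dropped, in keeping with "no shortcuts".

```python
from dataclasses import dataclass, field

# Hypothetical classification of fluentd-specific settings.
# Tuning-only settings affect performance under fluentd, not correctness.
TUNING_ONLY_KEYS = {"hypothetical_buffer_size"}
# Settings that affect correctness need an equivalent vector setting.
CORRECTNESS_KEY_MAP = {
    "hypothetical_delivery_mode": "hypothetical_vector_delivery_mode",
}


@dataclass
class Translation:
    vector: dict = field(default_factory=dict)   # resulting vector settings
    ignored: list = field(default_factory=list)  # tuning keys safely skipped
    unknown: list = field(default_factory=list)  # keys nobody has classified


def translate(neutral: dict, fluentd_specific: dict) -> Translation:
    """Build vector settings from collector-neutral and fluentd-only settings."""
    # Rule 1: collector-neutral API features carry over unchanged.
    result = Translation(vector=dict(neutral))
    for key, value in sorted(fluentd_specific.items()):
        if key in TUNING_ONLY_KEYS:
            # Rule 2: performance tuning is ignored by vector.
            result.ignored.append(key)
        elif key in CORRECTNESS_KEY_MAP:
            # Rule 3: correctness-affecting settings are translated.
            result.vector[CORRECTNESS_KEY_MAP[key]] = value
        else:
            # Anything unclassified is surfaced so the upgrade can be blocked.
            result.unknown.append(key)
    return result
```

A caller could refuse the no-touch upgrade whenever `unknown` is non-empty, so that a misclassified setting fails loudly instead of changing behavior after the switch.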
The only reason to GA support for both collectors in a single version is so users (and we ourselves) can "flip the switch" between fluentd and vector in order to test and debug problems, and to ensure users can fall back to a working fluentd configuration if vector fails somehow.
The better solution is to "flip the switch" by upgrading/downgrading between two versions that each support a single collector.
That does mean up-front work so that the upgrade/downgrade process is smooth enough for this to be realistic, but it will pay off:
- It will be more work in the long-term to maintain a two-headed beast, probably for much longer than we expect or want.
- We will have to bite the bullet on migration anyway when we eventually deprecate and drop fluentd, and it won't be any easier then.
- Smooth upgrade/downgrade is, in itself, a valuable improvement to user experience and maintenance costs.
NOTE: Part of this epic is refactoring the codebase to make it possible to support multiple collectors in one version. This is important for code structure and clarity, maintenance, debugging, future new collectors, internal trials etc. However, we don't want to offer that to customers because of the maintenance and testing burden it implies.
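One way to picture the refactoring described in the note is a common collector interface with a per-release list of offered collectors. The sketch below is an assumption-laden illustration: `Collector`, `FluentdCollector`, `VectorCollector`, `REGISTRY`, and `select_collector` are hypothetical names, and the rendered "config" is placeholder text rather than real fluentd or vector syntax.

```python
from abc import ABC, abstractmethod
from typing import List, Set


class Collector(ABC):
    """Hypothetical interface that every collector implementation satisfies."""

    name: str

    @abstractmethod
    def render_config(self, outputs: List[str]) -> str:
        """Return collector configuration text for the given outputs."""


class FluentdCollector(Collector):
    name = "fluentd"

    def render_config(self, outputs: List[str]) -> str:
        # Placeholder text only, not real fluentd configuration.
        return "\n".join(f"# fluentd output placeholder: {o}" for o in outputs)


class VectorCollector(Collector):
    name = "vector"

    def render_config(self, outputs: List[str]) -> str:
        # Placeholder text only, not real vector configuration.
        return "\n".join(f"# vector output placeholder: {o}" for o in outputs)


# The codebase knows about every collector...
REGISTRY = {cls.name: cls for cls in (FluentdCollector, VectorCollector)}


def select_collector(requested: str, offered: Set[str]) -> Collector:
    """Instantiate a collector, but only if this release offers it."""
    if requested not in REGISTRY:
        raise ValueError(f"unknown collector: {requested}")
    # ...but a given release exposes only the collectors it supports.
    if requested not in offered:
        raise ValueError(f"collector not offered in this release: {requested}")
    return REGISTRY[requested]()
```

Internal trials could call `select_collector` with both names in `offered`, while a customer-facing release would pass a single-element set.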
Additional Notes
TODO - break down into stories & tasks.
- blocks
  - LOG-1795 GA vector as the primary collector for logging (Phase 1): Closed
  - OBSDA-116 Provide Tech Preview Support for Vector Collector with OpenShift Logging: Closed
  - OBSDA-117 Provide Tech Preview Support for Vector with Kafka for OpenShift: Closed
  - OBSDA-119 Provide Tech Preview Support for Vector with Elasticsearch for OpenShift: Closed
- is blocked by
  - LOG-1423 Provide an alternative collector to the Kafka SRE team so that they can forward log messages to our internal Loki service: New
  - LOG-1766 Define, document, and implement strategy for releasing preview features: Closed
- is cloned by
  - LOG-1795 GA vector as the primary collector for logging (Phase 1): Closed