Details
-
Feature
-
Resolution: Unresolved
-
Major
-
None
-
Logging 5.7, Logging 5.6, Logging 5.8
Description
Proposed title of this feature request
Vector and Fluentd comparison
What is the nature and description of the request?
Currently, Vector is GA as the collector and Fluentd is deprecated. However, it is not clear how to perform the migration, what its impact is (uncompressed logs are reprocessed), what the pros and cons are, how the two compare on performance, etc. This lack of clarity about the change of solution, and about how Fluentd works versus how Vector works (currently in memory), hinders the adoption of Vector and the transition/migration from Fluentd to Vector. It would therefore be very helpful to have a comparison between both. Some points that are usually requested are:

Vector and Fluentd comparison
- On features (present in the current Logging documentation) [1]
- Simulate a load, the same for Fluentd and Vector, and share the results for both:
  + memory and CPU usage
  + number of events dropped
  + number of events read per second
  + time to ingest the load
- Vector works in memory out of the box, whereas Fluentd uses disk buffering, so:
  + reliability vs performance
  + Vector's memory usage can grow under backpressure when delivering the logs
- Define how the outputs work by default:
  + Retry (what is retried and what is not)
  + drop or block
  + Vector keeps 500 events per output in memory; Fluentd buffers to disk, 8GB by default per output
  + Vector adaptive concurrency

Also, indicate for the migration that all uncompressed logs will be reprocessed by Vector, which can lead to:
- duplicated logs at the moment of the migration
- disk and performance problems on the log store as a consequence of the collector re-reading and processing all the old logs
- a peak of memory and CPU usage in Vector until all the old logs are processed (these logs can be several GB per node). This could also have a big impact. For instance, an example of the impact:
  + LogMaxSize: 100M, with 2 uncompressed logs per pod
  + Node with 100 pods
  GB to be read immediately:
  - For pods: 100MB x 2 x 100 = 20G for this node
  - + journal logs
  - + audit logs from the node
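The arithmetic in the example above can be sketched as a small helper so it can be repeated for other node sizes. This is only an illustration: the function name and parameters are hypothetical, it assumes every uncompressed file has reached the maximum log size (worst case), and it counts 1 GB as 1000 MB to match the 20G figure above. Journal and audit logs are not included.

```python
def estimate_pod_backlog_gb(pods, log_max_size_mb, uncompressed_files_per_pod):
    """Worst-case GB of pod logs re-read on one node at migration time.

    Hypothetical helper: assumes each uncompressed file is at the size limit
    and uses decimal units (1 GB = 1000 MB).
    """
    total_mb = pods * log_max_size_mb * uncompressed_files_per_pod
    return total_mb / 1000


# Values from the example: 100 pods, 100 MB max size, 2 uncompressed files per pod.
backlog_gb = estimate_pod_backlog_gb(pods=100, log_max_size_mb=100,
                                     uncompressed_files_per_pod=2)
print(f"Pod logs to re-read on this node: {backlog_gb:g} GB")
```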
Why does the customer need this? (List the business requirements)
To have guidance on the migration from Fluentd to Vector, and to facilitate the adoption of Vector by explaining how it works and its pros and cons.
List any affected packages or components.
Collector: Vector
Additional information
It would be good to have a tool, script, or set of steps to estimate how many GB of journald, audit, infrastructure, and application logs will be read at the moment of the migration. With that estimate, the impact can be understood better and the migration from Fluentd to Vector can be scheduled for a period of low business load.
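As a starting point for such a script, here is a minimal sketch that sums the size of uncompressed files under a few log directories on a node. The directory paths, the list of compressed suffixes, and the function name are assumptions for illustration and need to be adjusted to the actual node layout. The result is only a rough upper bound: it does not split infrastructure from application logs, and on-disk journal size is not exactly what a collector would read.

```python
import os

# Assumed log locations on a node; adjust to the real layout.
ASSUMED_LOG_ROOTS = {
    "containers": "/var/log/pods",
    "journal": "/var/log/journal",
    "audit": "/var/log/audit",
}

# Rotated files with these suffixes are treated as compressed and skipped.
COMPRESSED_SUFFIXES = (".gz", ".xz", ".bz2", ".zst")


def uncompressed_bytes(root):
    """Sum the size in bytes of all non-compressed regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(COMPRESSED_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # avoid counting the same file twice via symlinks
                total += os.path.getsize(path)
            except OSError:
                continue  # file rotated away or not readable
    return total


if __name__ == "__main__":
    grand_total = 0
    for label, root in ASSUMED_LOG_ROOTS.items():
        size = uncompressed_bytes(root)
        grand_total += size
        print(f"{label:<12}{size / 1e9:8.2f} GB  ({root})")
    print(f"{'total':<12}{grand_total / 1e9:8.2f} GB")
```

Run on each node (with enough privileges to read the log directories), the per-node totals give an indication of which nodes would cause the largest re-read at migration time.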