Log Collection - Sprint 236
On a cluster partially managed by app-sre via app-interface, the RHOBS team had deployed the cluster logging operator and set up this configuration for the cluster logging resource: https://gitlab.cee.redhat.com/service/app-interface/-/blob/master/resources/setup/cluster-logging/clusterlogging.yaml.
The cluster log forwarder was configured with the highlighted lines from the following file: https://gitlab.cee.redhat.com/service/app-interface/-/blob/master/resources/rhobs/logs/clusterlogforwarder.yaml.j2#L8-26
After the cluster-logging operator was automatically upgraded from 5.6 to 5.7, we noticed that all the pods in the logging collector daemonset were crashlooping.
Here are some details about the crashlooping pods:
- Surprisingly, they were running Vector, even though we had configured fluentd as the collector.
- The following errors appeared in the events:
- lastTransitionTime: '2023-05-10T16:03:34Z'
message: pipeline must have a name