OpenShift Logging / LOG-6759

Collectors in CrashLoopBackOff when an output is defined that is not used in any pipeline


    • Release Note Type: Bug Fix
    • Sprint: Log Collection - Sprint 267
    • Severity: Moderate

      Description of problem:

      When an output is defined that is never used by any pipeline, all validations pass and the collector ConfigMap is generated by the Logging Operator, but the collector pods go into CrashLoopBackOff:

      $ oc get pods -l app.kubernetes.io/component=collector 
      NAME              READY   STATUS             RESTARTS       AGE
      collector-796xd   0/1     CrashLoopBackOff   33 (85s ago)   144m
      collector-bkswb   0/1     CrashLoopBackOff   33 (86s ago)   144m
      collector-gjct7   0/1     CrashLoopBackOff   33 (56s ago)   144m
      collector-htq46   0/1     CrashLoopBackOff   33 (32s ago)   144m
      collector-rnw5w   0/1     CrashLoopBackOff   33 (72s ago)   144m
      

      With the error:

      $ oc logs collector-796xd 
      Creating the directory used for persisting Vector state /var/lib/vector/openshift-logging/collector
      Starting Vector process...
      2025-02-20T18:24:16.695224Z ERROR vector::cli: Configuration error. error=Transform "output_rsyslog_parse_encoding" has no inputs
      
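      The error happens because Vector refuses to start when a transform has an empty input list. A minimal illustration of the kind of generated config fragment that triggers it (hypothetical reconstruction; only the transform name comes from the error message above, the remaining fields are assumed):

      ```toml
      # Hypothetical fragment of the generated vector.toml.
      # The operator emits this transform for the "rsyslog" output, but since
      # no pipeline references that output, its inputs list stays empty and
      # Vector aborts with: Transform "output_rsyslog_parse_encoding" has no inputs
      [transforms.output_rsyslog_parse_encoding]
      type = "remap"
      inputs = []          # no pipeline feeds this transform
      source = '''
      . = .
      '''
      ```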

      Version-Release number of selected component (if applicable):

      $ oc get csv|grep -i logging
      cluster-logging.v6.0.4                             Red Hat OpenShift Logging          6.0.4                   cluster-logging.v6.0.3              Succeeded
      

      How reproducible:

      Always

      Steps to Reproduce:

      Create a ClusterLogForwarder custom resource with an output that is not used in any pipeline, for example:

      apiVersion: observability.openshift.io/v1
      kind: ClusterLogForwarder
      metadata:
        name: collector
        namespace: openshift-logging
      spec:
        managementState: Managed
        outputs:
          - lokiStack:
              authentication:
                token:
                  from: serviceAccount
              target:
                name: logging-loki
                namespace: openshift-logging
            name: default-lokistack
            tls:
              ca:
                configMapName: openshift-service-ca.crt
                key: service-ca.crt
            type: lokiStack
          - name: rsyslog
            syslog:
              facility: auth
              rfc: RFC3164
              severity: informational
              url: 'udp://syslog.example.com:514'
            type: syslog
        pipelines:
          - inputRefs:
              - audit
            name: syslog
            outputRefs:
              - default-lokistack
          - inputRefs:
              - infrastructure
              - application
            name: logging-loki
            outputRefs:
              - default-lokistack
          - inputRefs:
              - application
            name: container-logs
            outputRefs:
              - default-lokistack
        serviceAccount:
          name: collector
      

      Actual results:

      The ClusterLogForwarder status shows everything as correct:

      $ oc get clf collector -o yaml
      [...]
      status:
        conditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: 'permitted to collect log types: [application audit infrastructure]'
          reason: ClusterRolesExist
          status: "True"
          type: observability.openshift.io/Authorized
        - lastTransitionTime: "2025-02-20T16:00:37Z"
          message: ""
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/Valid
        - lastTransitionTime: "2025-02-20T16:00:42Z"
          message: ""
          reason: ReconciliationComplete
          status: "True"
          type: Ready
        inputConditions:
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "audit" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-audit
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "infrastructure" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-infrastructure
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "application" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-application
        outputConditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: output "rsyslog" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-rsyslog
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-application" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-application
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-audit" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-audit
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-infrastructure" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-infrastructure
        pipelineConditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: pipeline "syslog" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-syslog
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: pipeline "logging-loki" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-logging-loki
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: pipeline "logging-loki-1" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-logging-loki-1
      

      The "collector-config" ConfigMap is generated and the collectors are started with that configuration, then go into CrashLoopBackOff:

      $ oc get pods -l app.kubernetes.io/component=collector 
      NAME              READY   STATUS             RESTARTS       AGE
      collector-796xd   0/1     CrashLoopBackOff   33 (85s ago)   144m
      collector-bkswb   0/1     CrashLoopBackOff   33 (86s ago)   144m
      collector-gjct7   0/1     CrashLoopBackOff   33 (56s ago)   144m
      collector-htq46   0/1     CrashLoopBackOff   33 (32s ago)   144m
      collector-rnw5w   0/1     CrashLoopBackOff   33 (72s ago)   144m
      

      With the error:

      $ oc logs collector-796xd 
      Creating the directory used for persisting Vector state /var/lib/vector/openshift-logging/collector
      Starting Vector process...
      2025-02-20T18:24:16.695224Z ERROR vector::cli: Configuration error. error=Transform "output_rsyslog_parse_encoding" has no inputs
      

      Expected results:

      The ClusterLogForwarder status should report that the configuration contains an unreferenced output, and the operator should avoid creating an invalid collector configuration.
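      The missing check amounts to a set difference between defined outputs and the outputs referenced by pipelines. A standalone sketch of that validation (plain Python, not the actual operator code; function and message wording are made up):

      ```python
      # Hypothetical validation sketch: fail when an output is defined
      # but referenced by no pipeline, instead of emitting a config that
      # makes Vector crash at startup.
      def validate_outputs(outputs, pipelines):
          """Return a list of validation error messages (empty if valid)."""
          referenced = {ref for p in pipelines for ref in p.get("outputRefs", [])}
          return [
              f'output "{name}" is defined but not referenced by any pipeline'
              for name in outputs
              if name not in referenced
          ]

      # Mirrors the ClusterLogForwarder above: "rsyslog" is defined but unused.
      errors = validate_outputs(
          ["default-lokistack", "rsyslog"],
          [
              {"name": "syslog", "outputRefs": ["default-lokistack"]},
              {"name": "logging-loki", "outputRefs": ["default-lokistack"]},
              {"name": "container-logs", "outputRefs": ["default-lokistack"]},
          ],
      )
      print(errors)  # ['output "rsyslog" is defined but not referenced by any pipeline']
      ```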

      Additional info:

      This bug is similar to LOG-6758, but in LOG-6758 the collector-config ConfigMap is not generated.

              rh-ee-calee Calvin Lee
              rhn-support-ocasalsa Oscar Casal Sanchez
              Votes: 0
              Watchers: 3
