OpenShift Logging / LOG-6759

Collectors go into CrashLoopBackOff when an output is defined but not used in any pipeline

    • NEW
    • VERIFIED
    • Before this change, the operator could deploy the collector with output configurations that are not referenced by any inputs. This update resolves that by adding a validation that fails the ClusterLogForwarder in this scenario, so the operator no longer deploys the collector.
    • Bug Fix
    • Log Collection - Sprint 267, Log Collection - Sprint 268
    • Moderate

      Description of problem:

      When an output is defined but never used in a pipeline, all validations pass and the Logging Operator generates the collector configmap, but the collector pods go into CrashLoopBackOff:

      $ oc get pods -l app.kubernetes.io/component=collector 
      NAME              READY   STATUS             RESTARTS       AGE
      collector-796xd   0/1     CrashLoopBackOff   33 (85s ago)   144m
      collector-bkswb   0/1     CrashLoopBackOff   33 (86s ago)   144m
      collector-gjct7   0/1     CrashLoopBackOff   33 (56s ago)   144m
      collector-htq46   0/1     CrashLoopBackOff   33 (32s ago)   144m
      collector-rnw5w   0/1     CrashLoopBackOff   33 (72s ago)   144m
      

      With the error:

      $ oc logs collector-796xd 
      Creating the directory used for persisting Vector state /var/lib/vector/openshift-logging/collector
      Starting Vector process...
      2025-02-20T18:24:16.695224Z ERROR vector::cli: Configuration error. error=Transform "output_rsyslog_parse_encoding" has no inputs
      

      Version-Release number of selected component (if applicable):

      $ oc get csv|grep -i logging
      cluster-logging.v6.0.4                             Red Hat OpenShift Logging          6.0.4                   cluster-logging.v6.0.3              Succeeded
      

      How reproducible:

      Always

      Steps to Reproduce:

      Create a ClusterLogForwarder custom resource that defines an output not referenced by any pipeline, for example:

      apiVersion: observability.openshift.io/v1
      kind: ClusterLogForwarder
      metadata:
        name: collector
        namespace: openshift-logging
      spec:
        managementState: Managed
        outputs:
          - lokiStack:
              authentication:
                token:
                  from: serviceAccount
              target:
                name: logging-loki
                namespace: openshift-logging
            name: default-lokistack
            tls:
              ca:
                configMapName: openshift-service-ca.crt
                key: service-ca.crt
            type: lokiStack
          - name: rsyslog
            syslog:
              facility: auth
              rfc: RFC3164
              severity: informational
              url: 'udp://syslog.example.com:514'
            type: syslog
        pipelines:
          - inputRefs:
              - audit
            name: syslog
            outputRefs:
              - default-lokistack
          - inputRefs:
              - infrastructure
              - application
            name: logging-loki
            outputRefs:
              - default-lokistack
          - inputRefs:
              - application
            name: container-logs
            outputRefs:
              - default-lokistack
        serviceAccount:
          name: collector
      

      Actual results:

      The ClusterLogForwarder status reports every condition as valid:

      $ oc get clf collector -o yaml
      [...]
      status:
        conditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: 'permitted to collect log types: [application audit infrastructure]'
          reason: ClusterRolesExist
          status: "True"
          type: observability.openshift.io/Authorized
        - lastTransitionTime: "2025-02-20T16:00:37Z"
          message: ""
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/Valid
        - lastTransitionTime: "2025-02-20T16:00:42Z"
          message: ""
          reason: ReconciliationComplete
          status: "True"
          type: Ready
        inputConditions:
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "audit" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-audit
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "infrastructure" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-infrastructure
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: input "application" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidInput-application
        outputConditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: output "rsyslog" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-rsyslog
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-application" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-application
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-audit" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-audit
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: output "default-lokistack-infrastructure" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidOutput-default-lokistack-infrastructure
        pipelineConditions:
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: pipeline "syslog" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-syslog
        - lastTransitionTime: "2025-02-20T15:58:05Z"
          message: pipeline "logging-loki" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-logging-loki
        - lastTransitionTime: "2025-02-20T18:28:32Z"
          message: pipeline "logging-loki-1" is valid
          reason: ValidationSuccess
          status: "True"
          type: observability.openshift.io/ValidPipeline-logging-loki-1
      

      The operator also generates the "collector-config" configmap. The collectors start with that generated configuration and go into "CrashLoopBackOff", with the same pod status and Vector error shown in the description above (Transform "output_rsyslog_parse_encoding" has no inputs).
      

      Expected results:

      The ClusterLogForwarder status should report that an error exists in the pipeline configuration, and the operator should not generate an invalid collector configuration.
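
The expected check can be sketched as a small pre-flight script that lists outputs no pipeline references. This is only an illustration, not the operator's actual validation code: the helper name `find_unused_outputs` is hypothetical, and the spec is a trimmed-down dict version of the custom resource from the reproduction steps.

```python
from __future__ import annotations

from typing import Any


def find_unused_outputs(spec: dict[str, Any]) -> list[str]:
    """Return the names of outputs that no pipeline lists in outputRefs.

    Hypothetical helper for illustration; it is not part of the operator.
    """
    # Collect every output name referenced by at least one pipeline.
    referenced: set[str] = set()
    for pipeline in spec.get("pipelines", []):
        referenced.update(pipeline.get("outputRefs", []))

    # Keep the declared outputs that never appear in that set.
    unused: list[str] = []
    for output in spec.get("outputs", []):
        name = output.get("name")
        if name is not None and name not in referenced:
            unused.append(name)
    return unused


# Trimmed-down copy of the ClusterLogForwarder spec from the reproduction steps.
spec = {
    "outputs": [
        {"name": "default-lokistack", "type": "lokiStack"},
        {"name": "rsyslog", "type": "syslog"},
    ],
    "pipelines": [
        {"name": "syslog", "inputRefs": ["audit"],
         "outputRefs": ["default-lokistack"]},
        {"name": "logging-loki", "inputRefs": ["infrastructure", "application"],
         "outputRefs": ["default-lokistack"]},
        {"name": "container-logs", "inputRefs": ["application"],
         "outputRefs": ["default-lokistack"]},
    ],
}

print(find_unused_outputs(spec))  # prints ['rsyslog']
```

With the spec from this report, the sketch flags "rsyslog", the output whose generated transform Vector rejects for having no inputs. Pointing the "syslog" pipeline's outputRefs at "rsyslog" would make the list empty.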

      Additional info:

      This bug is similar to LOG-6758, except that in LOG-6758 the collector-config configmap is not generated.

              rh-ee-calee Calvin Lee
              rhn-support-ocasalsa Oscar Casal Sanchez
              Qiaoling Tang Qiaoling Tang