  1. OpenShift Logging
  2. LOG-5524

Vector collector pod crashes when kubernetes.label in structuredTypeKey contains "-"


    • Before this change, workloads whose labels contained dashes caused the collector to fail while it was normalizing log entries. This change fixes the issue by updating the generated collector configuration to use the correct syntax for such labels.
    • Bug Fix
    • Log Collection - Sprint 255
    • Moderate

      Description of problem:

      Version-Release number of selected component (if applicable):

      Tested on RHOL 5.9.1 and RHOL 5.8.6

      How reproducible:

      NA

      Steps to Reproduce:

      1. Install CLO with vector collector
      apiVersion: logging.openshift.io/v1
      kind: ClusterLogging
      metadata:
        name: instance
        namespace: openshift-logging
      spec:
        collection:
          type: vector 
      ------ Output Omitted ------

      2. Create a ClusterLogForwarder (CLF) like the one below. A workload that generates JSON logs and carries the label referenced by structuredTypeKey must already be deployed in a user namespace:

      apiVersion: logging.openshift.io/v1
      kind: ClusterLogForwarder
      metadata:
        name: instance
        namespace: openshift-logging
      spec:
        outputs:
        - elasticsearch:
            structuredTypeKey: kubernetes.labels.region-name
            structuredTypeName: nologformat
          name: default
          type: elasticsearch
        pipelines:
        - inputRefs:
          - application
          name: application-logs
          outputRefs:
          - default
          parse: json 

       

      3. Check the status of the collector pods:

      $ oc get pods -n openshift-logging | grep collector
      collector-2pwwq                                 0/1     CrashLoopBackOff   1 (6s ago)   8s
      collector-86xjq                                 0/1     CrashLoopBackOff   1 (6s ago)   8s
      collector-9nx48                                 0/1     CrashLoopBackOff   1 (6s ago)   8s
      collector-f62zl                                 0/1     CrashLoopBackOff   1 (6s ago)   8s
      collector-mrw9h                                 0/1     CrashLoopBackOff   1 (6s ago)   8s
      collector-vfxrw                                 0/1     CrashLoopBackOff   1 (6s ago)   8s 

       

      Actual results:

      The collector pods crash with the following error:

      $ oc logs -c collector collector-vfxrw -n openshift-logging
      Creating the directory used for persisting Vector state /var/lib/vector
      Starting Vector process...
      2024-05-10T16:52:39.358368Z  WARN vector::config::loading: Transform "route_container_logs._unmatched" has no consumers
      2024-05-10T16:52:39.489625Z ERROR vector::topology: Configuration error. error=Transform "default_add_es_index": 
      error[E701]: call to undefined variable
         ┌─ :17:34
         │
      17 │     val = .kubernetes.labels.region-name
         │                                  ^^^^^^
         │                                  │
         │                                  undefined variable
         │                                  did you mean "true"?
         │
         = see language documentation at https://vrl.dev
         = try your code in the VRL REPL, learn more at https://vrl.dev/examples
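
The error above appears to come from VRL reading `region-name` as the expression `region - name`, so `name` is looked up as a variable that does not exist. VRL accepts quoted path segments, so a path such as `.kubernetes.labels."region-name"` avoids that reading. The following is a minimal Python sketch of that quoting step, not the operator's actual implementation: the helper name `to_vrl_path` and the rule for which segments count as plain identifiers are assumptions made for illustration.

```python
import re

# Assumption: a segment made only of letters, digits and underscores
# (not starting with a digit) can stay unquoted in a VRL path.
_PLAIN_SEGMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_vrl_path(dotted_key: str) -> str:
    """Convert a dotted key (hypothetical helper) into a VRL path,
    quoting every segment that is not a plain identifier."""
    parts = []
    for segment in dotted_key.split("."):
        if _PLAIN_SEGMENT.match(segment):
            parts.append(segment)
        else:
            # Escape backslashes and double quotes before wrapping in quotes.
            escaped = segment.replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'"{escaped}"')
    return "." + ".".join(parts)


# The key from the ClusterLogForwarder in this report.
print(to_vrl_path("kubernetes.labels.region-name"))
```

For the key in this report the sketch prints `.kubernetes.labels."region-name"`. Splitting on every dot is a simplification: a label whose own name contains dots would need a different splitting rule.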
      

       

      Expected results:

      The Vector collector should start normally and pick up the label as defined in structuredTypeKey in the ClusterLogForwarder.

      Additional info:

      • When the fluentd collector is used, the collector pods initialize correctly and the index with the required name (the value of the label kubernetes.labels.region-name) appears in Elasticsearch.
      • The application workload was generating JSON logs, and its pod carried the label referenced as "kubernetes.labels.region-name".

            vparfono Vitalii Parfonov
            rhn-support-dgautam Dhruv Gautam
            Anping Li Anping Li