OpenShift Logging / LOG-2381

[Vector] [5.4] Collector pods fail to start with configuration error=unknown variant `internal_metrics`


Details

    • Sprint: Logging (Core) - Sprint 216

    Description

      Description of problem:

      When a ClusterLogging instance is created with Vector as the collector, the collector pods fail to start with a configuration error.

      $ oc logs collector-rgqvr -c collector
      Mar 16 04:09:16.315  INFO vector::app: Log level is enabled. level="info"
      Mar 16 04:09:16.315  INFO vector::app: Loading configs. path=[("/etc/vector/vector.toml", Some(Toml))]
      Mar 16 04:09:16.317 ERROR vector::cli: Configuration error. error=unknown variant `internal_metrics`, expected one of `file`, `journald`, `kubernetes_logs`, `prometheus`, `prometheus_remote_write`, `prometheus_scrape` for key `sources.internal_metrics` at line 175 column 1
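
      The "unknown variant" message suggests this Vector build was compiled without the internal_metrics source: the error itself lists only file, journald, kubernetes_logs, prometheus, prometheus_remote_write, and prometheus_scrape as accepted source types. A minimal sketch to reproduce the failure without waiting for a pod restart, assuming a shell inside the collector container (for example via oc debug pod/collector-rgqvr -c collector):

      # Validate the rendered config against the shipped binary; this should
      # fail with the same unknown-variant error shown above.
      sh-4.4# vector validate /etc/vector/vector.toml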

       

      Version-Release number of selected component (if applicable):

      cluster-logging.5.4.0-93

      OCP Server Version: 4.10.0-0.nightly-2022-03-15-182807

       

      How reproducible:

      Always

       

      Steps to reproduce the issue:

      1 Install the Cluster Logging and Elasticsearch 5.4 operators.

      2 Create a ClusterLogging instance with the Vector preview annotation (an apply command follows the manifest below).

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      metadata:
        name: "instance" 
        namespace: "openshift-logging"
        annotations:
          logging.openshift.io/preview-vector-collector: "enabled"
      spec:
        managementState: "Managed"  
        logStore:
          type: "elasticsearch"  
          retentionPolicy: 
            application:
              maxAge: 10h
            infra:
              maxAge: 10h
            audit:
              maxAge: 10h
          elasticsearch:
            nodeCount: 1 
            storage: {} 
            resources: 
                limits:
                  memory: "4Gi"
                requests:
                  memory: "1Gi"
            proxy: 
              resources:
                limits:
                  memory: 256Mi
                requests:
                  memory: 256Mi
            redundancyPolicy: "ZeroRedundancy"
        visualization:
          type: "kibana"  
          kibana:
            replicas: 1
        collection:
          logs:
            type: "vector"

      3 Check the status and logs of the collector pods.

      oc get pods
      NAME                                            READY   STATUS             RESTARTS      AGE
      cluster-logging-operator-6884bc7f49-nv92h       1/1     Running            0             57m
      collector-ggpwl                                 1/2     CrashLoopBackOff   1 (9s ago)    13s
      collector-gmwhz                                 1/2     CrashLoopBackOff   1 (9s ago)    14s
      collector-pxjxm                                 1/2     CrashLoopBackOff   1 (10s ago)   13s
      collector-q4c2t                                 1/2     CrashLoopBackOff   1 (10s ago)   14s
      collector-r7nfh                                 1/2     CrashLoopBackOff   1 (7s ago)    12s
      collector-zvktw                                 1/2     CrashLoopBackOff   1 (11s ago)   14s
      elasticsearch-cdm-9kkyx6ta-1-678f5686f4-wfmmx   1/2     Running            0             17s
      kibana-6d757f6c4-dcnxc                          2/2     Running            0             13s
      
      oc logs collector-pxjxm -c collector
      Mar 16 04:59:46.453  INFO vector::app: Log level is enabled. level="info"
      Mar 16 04:59:46.454  INFO vector::app: Loading configs. path=[("/etc/vector/vector.toml", Some(Toml))]
      Mar 16 04:59:46.456 ERROR vector::cli: Configuration error. error=unknown variant `internal_metrics`, expected one of `file`, `journald`, `kubernetes_logs`, `prometheus`, `prometheus_remote_write`, `prometheus_scrape` for key `sources.internal_metrics` at line 175 column 1 
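
      The rendered configuration shown in the next step can also be pulled from the cluster instead of from a crashing pod. A sketch, assuming the operator stores it under the vector.toml key of a ConfigMap named collector in openshift-logging (the object name and kind may differ between releases):

      # Dump the rendered Vector config; the ConfigMap name and key are
      # assumptions for this release.
      $ oc get configmap collector -n openshift-logging -o jsonpath='{.data.vector\.toml}'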

      4 Below is the generated Vector config.

      # Logs from containers (including openshift containers)
      [sources.raw_container_logs]
      type = "kubernetes_logs"
      auto_partial_merge = true
      exclude_paths_glob_patterns = ["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log"]

      [sources.raw_journal_logs]
      type = "journald"

      [sources.internal_metrics]
      type = "internal_metrics"

      [transforms.container_logs]
      type = "remap"
      inputs = ["raw_container_logs"]
      source = '''
        level = "unknown"
        if match(.message,r'(Warning|WARN|W[0-9]+|level=warn|Value:warn|"level":"warn")'){
          level = "warn"
        } else if match(.message, r'Info|INFO|I[0-9]+|level=info|Value:info|"level":"info"'){
          level = "info"
        } else if match(.message, r'Error|ERROR|E[0-9]+|level=error|Value:error|"level":"error"'){
          level = "error"
        } else if match(.message, r'Debug|DEBUG|D[0-9]+|level=debug|Value:debug|"level":"debug"'){
          level = "debug"
        }
        .level = level
        .pipeline_metadata.collector.name = "vector"
        .pipeline_metadata.collector.version = "0.14.1"
        ip4, err = get_env_var("NODE_IPV4")
        .pipeline_metadata.collector.ipaddr4 = ip4
        received, err = format_timestamp(now(),"%+")
        .pipeline_metadata.collector.received_at = received
        .pipeline_metadata.collector.error = err
       '''

      [transforms.journal_logs]
      type = "remap"
      inputs = ["raw_journal_logs"]
      source = '''
        level = "unknown"
        if match(.message,r'(Warning|WARN|W[0-9]+|level=warn|Value:warn|"level":"warn")'){
          level = "warn"
        } else if match(.message, r'Info|INFO|I[0-9]+|level=info|Value:info|"level":"info"'){
          level = "info"
        } else if match(.message, r'Error|ERROR|E[0-9]+|level=error|Value:error|"level":"error"'){
          level = "error"
        } else if match(.message, r'Debug|DEBUG|D[0-9]+|level=debug|Value:debug|"level":"debug"'){
          level = "debug"
        }
        .level = level
        .pipeline_metadata.collector.name = "vector"
        .pipeline_metadata.collector.version = "0.14.1"
        ip4, err = get_env_var("NODE_IPV4")
        .pipeline_metadata.collector.ipaddr4 = ip4
        received, err = format_timestamp(now(),"%+")
        .pipeline_metadata.collector.received_at = received
        .pipeline_metadata.collector.error = err
       '''
      [transforms.route_container_logs]
      type = "route"
      inputs = ["container_logs"]
      route.app = '!((starts_with!(.kubernetes.pod_namespace,"kube")) || (starts_with!(.kubernetes.pod_namespace,"openshift")) || (.kubernetes.pod_namespace == "default"))'
      route.infra = '(starts_with!(.kubernetes.pod_namespace,"kube")) || (starts_with!(.kubernetes.pod_namespace,"openshift")) || (.kubernetes.pod_namespace == "default")'
      # Rename log stream to "application"
      [transforms.application]
      type = "remap"
      inputs = ["route_container_logs.app"]
      source = """
      .log_type = "application"
      """
      # Rename log stream to "infrastructure"
      [transforms.infrastructure]
      type = "remap"
      inputs = ["route_container_logs.infra","journal_logs"]
      source = """
      .log_type = "infrastructure"
      """
      [transforms.pipeline_0_]
      type = "remap"
      inputs = ["application","infrastructure"]
      source = """
      .
      """
      # Adding _id field
      [transforms.default_add_es_id]
      type = "remap"
      inputs = ["pipeline_0_"]
      source = """
      index = "default"
      if (.log_type == "application"){
        index = "app"
      }
      if (.log_type == "infrastructure"){
        index = "infra"
      }
      if (.log_type == "audit"){
        index = "audit"
      }
      ."write-index"=index+"-write"
      ._id = encode_base64(uuid_v4())
      """[transforms.default_dedot_and_flatten]
      type = "lua"
      inputs = ["default_add_es_id"]
      version = "2"
      hooks.process = "process"
      source = """
          function process(event, emit)
              if event.log.kubernetes == nil then
                  return
              end
              dedot(event.log.kubernetes.pod_labels)
              -- create "flat_labels" key
              event.log.kubernetes.flat_labels = {}
              i = 1
              -- flatten the labels
              for k,v in pairs(event.log.kubernetes.pod_labels) do
                event.log.kubernetes.flat_labels[i] = k.."="..v
                i=i+1
              end
              -- delete the "pod_labels" key
              event.log.kubernetes["pod_labels"] = nil
              emit(event)
          end

          function dedot(map)
              if map == nil then
                  return
              end
              local new_map = {}
              local changed_keys = {}
              for k, v in pairs(map) do
                  local dedotted = string.gsub(k, "%.", "_")
                  if dedotted ~= k then
                      new_map[dedotted] = v
                      changed_keys[k] = true
                  end
              end
              for k in pairs(changed_keys) do
                  map[k] = nil
              end
              for k, v in pairs(new_map) do
                  map[k] = v
              end
          end
      """[sinks.default]
      type = "elasticsearch"
      inputs = ["default_dedot_and_flatten"]
      endpoint = "https://elasticsearch.openshift-logging.svc:9200"
      index = "{{ write-index }}"
      request.timeout_secs = 2147483648
      bulk_action = "create"
      id_key = "_id"
      # TLS Config
      [sinks.default.tls]
      key_file = "/var/run/ocp-collector/secrets/collector/tls.key"
      crt_file = "/var/run/ocp-collector/secrets/collector/tls.crt"
      ca_file = "/var/run/ocp-collector/secrets/collector/ca-bundle.crt"
      [sinks.prometheus_output]
      type = "prometheus_exporter"
      inputs = ["internal_metrics"]
      address = "0.0.0.0:24231"
      default_namespace = "collector" 
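
      The stanza the error points at (line 175, key sources.internal_metrics) is the [sources.internal_metrics] block above, and the [sinks.prometheus_output] exporter is its only consumer. A hypothetical sketch for experimenting with the rendered config by hand; this only stops the operator from reverting manual edits and does not itself fix the crash:

      # Pause reconciliation so manual changes to the collector config are
      # not immediately overwritten by the operator (experiment only).
      $ oc patch clusterlogging instance -n openshift-logging --type merge -p '{"spec":{"managementState":"Unmanaged"}}'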

      5 Collector image used.

        collector:
          Container ID:   cri-o://180bb56fb4972922996748a833fec3e456098678062738e9c9e02862ca85612e
          Image:          registry.redhat.io/openshift-logging/vector-rhel8@sha256:dda974f1ac9dd666191a2c4724180f8a672ebf2947a37b493b7afe5d4e05768b
          Image ID:       registry.redhat.io/openshift-logging/vector-rhel8@sha256:c7368de78d829815e3fc24a35f809c928b41d755035e35fff867b29d9087a32b 
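
      One way to read the image back from the daemonset spec; the daemonset name collector is an assumption for this release:

      # Print the collector container image from the daemonset; the daemonset
      # name is assumed to be `collector`.
      $ oc get ds collector -n openshift-logging -o jsonpath='{.spec.template.spec.containers[?(@.name=="collector")].image}'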

      6 Vector version:

      sh-4.4# vector --version
      vector 0.14.1 (x86_64-unknown-linux-gnu) 
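
      The components compiled into this build can be enumerated from the same shell; internal_metrics is expected to be absent from the sources list, consistent with the error above:

      # List the sources, transforms, and sinks this binary was built with.
      sh-4.4# vector list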

       

       

      People

        Vimal Kumar (vimalkum@redhat.com)
        Ishwar Kanse (rhn-support-ikanse)