  1. OpenShift Logging
  2. LOG-3098

[Vector] [release-5.5] Healthcheck fails when forwarding logs to Cloudwatch

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Normal
    • None
    • Logging 5.5.2
    • Log Collection
    • False
    • None
    • False
    • NEW
    • NEW

      Version of components:

      Server Version: 4.11.0-0.nightly-2022-09-20-234850

      Kubernetes Version: v1.24.0+3882f8f

      cluster-logging.5.5.2

      Description of the problem:

      When logs are forwarded to Cloudwatch with Vector as the collector, Vector's healthcheck for the Cloudwatch sink fails with the following error:

      2022-09-22T09:01:30.654268Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
      x Health check for "cw" failed 
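
      The constraint quoted in the error can be reproduced locally. The following is a minimal sketch, assuming the pattern from the message is applied to the whole value; is_valid_log_group_prefix is a hypothetical helper name, and the sample names other than '{{ group_name }}' are illustrative only. It suggests that the healthcheck sends the unrendered template string, since braces and spaces fall outside the allowed character set.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the argument consists only of characters
# allowed by the constraint in the error message: [\.\-_/#A-Za-z0-9]+
is_valid_log_group_prefix() {
  printf '%s\n' "$1" | grep -Eq '^[._/#A-Za-z0-9-]+$'
}

# Compare the literal template from the error with two illustrative names.
for candidate in '{{ group_name }}' 'application' 'example-prefix.audit'; do
  if is_valid_log_group_prefix "$candidate"; then
    echo "valid:   $candidate"
  else
    echo "invalid: $candidate"
  fi
done
```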

      Steps to reproduce the issue:

      1 Deploy an OCP cluster on AWS.

      2 Create a secret for forwarding logs to Cloudwatch.

      export REGION=us-east-2
      
      export ACCESS_KEY_ID=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_access_key_id'|base64 -d)
      export SECRET_ACCESS_KEY=$(oc get secret  aws-creds -n kube-system -o json |jq -r '.data.aws_secret_access_key'|base64 -d)
      
      oc -n openshift-logging create secret generic cw-secret \
      --from-literal=aws_access_key_id="${ACCESS_KEY_ID}" \
      --from-literal=aws_secret_access_key="${SECRET_ACCESS_KEY}" 
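
      Before moving on, it can help to confirm that the secret holds both keys. The following is a minimal sketch, not part of the original reproduction; secret_has_key is a hypothetical helper name, and it assumes oc is logged in to the cluster.

```shell
#!/bin/sh
# Hypothetical helper: secret_has_key NAMESPACE SECRET KEY
# Succeeds if the named key exists in the secret and is non-empty.
secret_has_key() {
  value=$(oc -n "$1" get secret "$2" -o "jsonpath={.data.$3}") || return 1
  [ -n "$value" ]
}

# Example (requires cluster access):
#   secret_has_key openshift-logging cw-secret aws_access_key_id && echo "key present"
#   secret_has_key openshift-logging cw-secret aws_secret_access_key && echo "key present"
```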

      3 Create a ClusterLogForwarder instance.

      apiVersion: "logging.openshift.io/v1"
      kind: ClusterLogForwarder
      metadata:
        name: instance
        namespace: openshift-logging
      spec:
        outputs:
          - name: cw
            type: cloudwatch
            cloudwatch:
              groupBy: logType
              region: us-east-2
            secret:
              name: cw-secret
        pipelines:
          - name: all-logs
            inputRefs:
              - infrastructure
              - audit
              - application
            outputRefs:
              - cw 

      4 Create a ClusterLogging instance.

      apiVersion: "logging.openshift.io/v1"
      kind: "ClusterLogging"
      metadata:
        name: "instance" 
        namespace: "openshift-logging"
      spec:
        managementState: "Managed"  
        collection:
          logs:
            type: "vector"  
            vector: {} 

      5 Check that the collector pods are running and that logs are being sent to Cloudwatch, then run vector validate from a collector pod.

      oc rsh collector-w5rvb
      Defaulted container "collector" out of: collector, logfilesmetricexporter
      sh-4.4# vector validate /etc/vector/vector.toml 
      Loaded with warnings ["/etc/vector/vector.toml"]
      ------------------------------------------------
      ~ Transform "route_container_logs._unmatched" has no consumers
      2022-09-23T04:01:46.299741Z  INFO vector::sources::kubernetes_logs: Obtained Kubernetes Node name to collect logs for (self). self_node_name="ip-10-0-174-78.us-east-2.compute.internal"
      2022-09-23T04:01:46.317956Z  INFO vector::sources::kubernetes_logs: Excluding matching files. exclude_paths=["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log"]
      2022-09-23T04:01:46.357131Z  WARN aws_smithy_client::builder: Retries require a `sleep_impl`, but none was passed into the builder. No retries will occur with the current configuration. If this was intentional, you can suppress this message with `Client::set_sleep_impl(None). Otherwise, unless you have a good reason to use the low-level service client API, consider using the `aws-config` crate to load a shared config from the environment, and construct a fluent client from that. If you need to use the low-level service client API, then pass in a sleep implementation to make timeouts and retry work.
      √ Component configuration
      2022-09-23T04:01:46.358242Z  INFO vector::topology::builder: Healthcheck: Passed.
      √ Health check "prometheus_output"
      2022-09-23T04:01:46.382458Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
      x Health check for "cw" failed
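
      The first two checks in step 5 can be sketched as follows. This is an illustration under stated assumptions, not the commands used in the original reproduction: collector_pods and list_log_groups are hypothetical helper names, the pods are assumed to carry the label component=collector, and the log group prefix is not stated in this report, so it must be supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helpers for step 5; nothing runs until a function is called.

# collector_pods: lists the collector pods and their status.
# Assumes the pods carry the label component=collector.
collector_pods() {
  oc -n openshift-logging get pods -l component=collector -o wide
}

# list_log_groups PREFIX REGION: prints the names of the Cloudwatch log
# groups whose names start with PREFIX.
list_log_groups() {
  aws logs describe-log-groups \
    --log-group-name-prefix "$1" \
    --region "$2" \
    --query 'logGroups[].logGroupName' \
    --output text
}

# Example (requires cluster and AWS access; the prefix is a placeholder):
#   collector_pods
#   list_log_groups "<group-prefix>" us-east-2
```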
      

      Additional Notes:

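      The healthcheck failure does not prevent logs from being sent to Cloudwatch. The error shows the literal template string '{{ group_name }}' being passed as logGroupNamePrefix, so the failure appears to be confined to the healthcheck request.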

       

              Unassigned Unassigned
              rhn-support-ikanse Ishwar Kanse
              Ishwar Kanse Ishwar Kanse