
Type: Bug
Resolution: Done
Priority: Normal
Fix Version/s: Logging 5.5.4
Affects Version/s: Logging 5.5.z
Component/s: Log Collection
Labels:
- devel_ack+
- no-doc
- no-rn

Blocked:
False
Blocked Reason:
None
Ready:
False
Docs QE Status:
NEW
QE Status:
VERIFIED
Release Note Type:
Do Not Include (note: this means to exclude from release notes and errata)
Release Note Status:
Rejected

Sprint:
Log Collection - Sprint 226

CLONED from v5.6 fix: https://issues.redhat.com/browse/LOG-3093

-----------------------------------

Version of components:

Server Version: 4.11.0-0.nightly-2022-09-20-234850

Kubernetes Version: v1.24.0+3882f8f

cluster-logging.v5.6.0

Description of the problem:

When forwarding logs to CloudWatch with Vector as the collector, Vector's healthcheck for the CloudWatch sink fails with the error below.

2022-09-22T09:01:30.654268Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
x Health check for "cw" failed
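
The value rejected in the error is the literal string `{{ group_name }}`, whose braces and spaces fall outside the character class quoted in the message. The following is a minimal sketch that checks a candidate prefix against that pattern locally. The function name `is_valid_log_group_prefix` is hypothetical and is not part of Vector or any AWS tool; it only mirrors the regular expression shown in the error.

```shell
# Hypothetical helper: succeeds only if the argument is non-empty and consists
# solely of characters allowed by the pattern quoted in the error message,
# [\.\-_/#A-Za-z0-9]+ (written here as a POSIX bracket expression).
is_valid_log_group_prefix() {
  printf '%s\n' "$1" | grep -Eq '^[._/#A-Za-z0-9-]+$'
}

# The unrendered template from the error is rejected.
is_valid_log_group_prefix '{{ group_name }}' || echo "rejected: {{ group_name }}"

# An illustrative (made-up) prefix with only allowed characters is accepted.
is_valid_log_group_prefix 'my-cluster.audit' && echo "accepted: my-cluster.audit"
```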

Steps to reproduce the issue:

1 Deploy an OCP cluster on AWS.

2 Create a secret for forwarding logs to CloudWatch.

export REGION=us-east-2

export ACCESS_KEY_ID=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_access_key_id' | base64 -d)
export SECRET_ACCESS_KEY=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_secret_access_key' | base64 -d)

oc -n openshift-logging create secret generic cw-secret \
--from-literal=aws_access_key_id="${ACCESS_KEY_ID}" \
--from-literal=aws_secret_access_key="${SECRET_ACCESS_KEY}"

3 Create a ClusterLogForwarder instance.

apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: logType
        region: us-east-2
      secret:
        name: cw-secret
  pipelines:
    - name: all-logs
      inputRefs:
        - infrastructure
        - audit
        - application
      outputRefs:
        - cw

4 Create a ClusterLogging instance.

apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance" 
  namespace: "openshift-logging"
spec:
  managementState: "Managed"  
  collection:
    logs:
      type: "vector"  
      vector: {}

5 Check that the collector pods are running and that logs are being sent to CloudWatch. Then run vector validate from a collector pod.

$ oc rsh collector-q5d8f
Defaulted container "collector" out of: collector, logfilesmetricexporter
sh-4.4# vector validate /etc/vector/vector.toml 
Loaded with warnings ["/etc/vector/vector.toml"]
------------------------------------------------
~ Transform "route_container_logs._unmatched" has no consumers2022-09-22T09:21:13.890401Z  INFO vector::sources::kubernetes_logs: Obtained Kubernetes Node name to collect logs for (self). self_node_name="ip-10-0-131-185.us-east-2.compute.internal"
2022-09-22T09:21:13.904520Z  INFO vector::sources::kubernetes_logs: Excluding matching files. exclude_paths=["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log", "/var/log/pods/*/*/*.gz", "/var/log/pods/*/*/*.tmp"]
2022-09-22T09:21:13.953417Z  WARN aws_smithy_client::builder: Retries require a `sleep_impl`, but none was passed into the builder. No retries will occur with the current configuration. If this was intentional, you can suppress this message with `Client::set_sleep_impl(None). Otherwise, unless you have a good reason to use the low-level service client API, consider using the `aws-config` crate to load a shared config from the environment, and construct a fluent client from that. If you need to use the low-level service client API, then pass in a sleep implementation to make timeouts and retry work.
√ Component configuration
2022-09-22T09:21:13.954030Z  INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: provider in chain did not provide credentials provider=Environment context=environment variable not set
2022-09-22T09:21:13.954106Z  INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: provider in chain did not provide credentials provider=Profile context=No profiles were defined
2022-09-22T09:21:13.954650Z  INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}:send_operation{operation="AssumeRoleWithWebIdentity" service="sts"}: aws_http::auth: provider returned CredentialsNotLoaded, ignoring
2022-09-22T09:21:14.071679Z  INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: loaded credentials provider=WebIdentityToken
2022-09-22T09:21:14.163378Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
x Health check for "cw" failed
2022-09-22T09:21:14.163521Z  INFO vector::topology::builder: Healthcheck: Passed.
√ Health check "prometheus_output"
sh-4.4#
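
To spot the failing sink without reading the whole transcript, the output can be filtered for the failure marker. This is a minimal sketch that assumes failed checks are always reported in the form shown above (`x Health check for "<id>" failed`). The function name `failed_healthchecks` is hypothetical.

```shell
# Hypothetical helper: read `vector validate` output on stdin and print the
# component id of every failed health check, one per line.
# Assumes the line format seen in the transcript above.
failed_healthchecks() {
  sed -n 's/^x Health check for "\([^"]*\)" failed[[:space:]]*$/\1/p'
}

# Usage sketch (needs cluster access; the pod name is the one from this report):
#   oc -n openshift-logging exec collector-q5d8f -c collector -- \
#     vector validate /etc/vector/vector.toml 2>&1 | failed_healthchecks
```

Run against the transcript above, the filter would print only `cw`, since the `prometheus_output` check passed.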

Additional Notes:

The healthcheck failure doesn't prevent logs from being sent to CloudWatch.
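
One way to confirm that logs still arrive is to compare the log groups present in CloudWatch with the names expected for `groupBy: logType`. The sketch below assumes the groups are named `<prefix>.<logType>` and that the prefix defaults to the cluster's infrastructure name. Both are assumptions about the operator's behavior, not something this report states, and `expected_log_groups` is a hypothetical helper.

```shell
# Hypothetical helper: print the log group names expected for a given prefix
# when groupBy is logType (assumed naming scheme: <prefix>.<logType>).
expected_log_groups() {
  prefix=$1
  for log_type in application audit infrastructure; do
    printf '%s.%s\n' "$prefix" "$log_type"
  done
}

# Usage sketch (needs oc and the AWS CLI with valid credentials):
#   prefix=$(oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
#   aws logs describe-log-groups --region us-east-2 \
#     --log-group-name-prefix "$prefix" \
#     --query 'logGroups[].logGroupName' --output text
#   expected_log_groups "$prefix"
```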

clones

LOG-3093 [Vector] Healthcheck fails when forwarding logs to Cloudwatch

Closed

links to

openshift/cluster-logging-operator#1696: [release-5.5] LOG-3175: Vector healthcheck fails for cloudwatch forwarding

mentioned on

Merge request - Updated 2 upstream sources
