-
Bug
-
Resolution: Done
-
Normal
-
Logging 5.5.z
-
False
-
None
-
False
-
NEW
-
VERIFIED
-
Do Not Include (note: this means to exclude from release notes and errata)
-
Rejected
-
Log Collection - Sprint 226
CLONED from v5.6 fix: https://issues.redhat.com/browse/LOG-3093
-----------------------------------
Version of components:
Server Version: 4.11.0-0.nightly-2022-09-20-234850
Kubernetes Version: v1.24.0+3882f8f
cluster-logging.v5.6.0
Description of the problem:
When forwarding logs to Cloudwatch with Vector as the collector, Vector's healthcheck for the Cloudwatch sink fails with the following error:
2022-09-22T09:01:30.654268Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
x Health check for "cw" failed
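The rejection can be reproduced offline. The sketch below is only an illustration: it copies the character class quoted in the error message and assumes the constraint applies to the whole value, which is how the message reads. The function name and the second sample prefix are hypothetical and do not come from Vector or AWS.

```python
import re

# Character class copied verbatim from the InvalidParameterException above.
# Assumption: the service requires the entire value to match it.
LOG_GROUP_PREFIX_PATTERN = re.compile(r"[\.\-_/#A-Za-z0-9]+")


def is_valid_prefix(prefix: str) -> bool:
    """Return True if every character of `prefix` is allowed by the pattern."""
    return LOG_GROUP_PREFIX_PATTERN.fullmatch(prefix) is not None


if __name__ == "__main__":
    # The first value is the unrendered template from the error message;
    # the second is a hypothetical rendered group name prefix.
    for candidate in ["{{ group_name }}", "mycluster.application"]:
        print(f"{candidate!r}: {is_valid_prefix(candidate)}")
```

The unrendered template contains braces and spaces, none of which are in the allowed set, so it is rejected before any log group lookup takes place.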
Steps to reproduce the issue:
1 Deploy an OCP cluster on AWS.
2 Create secret for forwarding logs to Cloudwatch.
export REGION=us-east-2
export ACCESS_KEY_ID=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_access_key_id' | base64 -d)
export SECRET_ACCESS_KEY=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_secret_access_key' | base64 -d)
oc -n openshift-logging create secret generic cw-secret \
  --from-literal=aws_access_key_id="${ACCESS_KEY_ID}" \
  --from-literal=aws_secret_access_key="${SECRET_ACCESS_KEY}"
3 Create ClusterLogForwarder instance.
apiVersion: "logging.openshift.io/v1"
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    - name: cw
      type: cloudwatch
      cloudwatch:
        groupBy: logType
        region: us-east-2
      secret:
        name: cw-secret
  pipelines:
    - name: all-logs
      inputRefs:
        - infrastructure
        - audit
        - application
      outputRefs:
        - cw
4 Create ClusterLogging instance.
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  collection:
    logs:
      type: "vector"
      vector: {}
5 Check that the collector pods are running and logs are being sent to Cloudwatch. Run vector validate from the collector pod.
$ oc rsh collector-q5d8f
Defaulted container "collector" out of: collector, logfilesmetricexporter
sh-4.4# vector validate /etc/vector/vector.toml
Loaded with warnings ["/etc/vector/vector.toml"]
------------------------------------------------
~ Transform "route_container_logs._unmatched" has no consumers
2022-09-22T09:21:13.890401Z INFO vector::sources::kubernetes_logs: Obtained Kubernetes Node name to collect logs for (self). self_node_name="ip-10-0-131-185.us-east-2.compute.internal"
2022-09-22T09:21:13.904520Z INFO vector::sources::kubernetes_logs: Excluding matching files. exclude_paths=["/var/log/pods/openshift-logging_collector-*/*/*.log", "/var/log/pods/openshift-logging_elasticsearch-*/*/*.log", "/var/log/pods/openshift-logging_kibana-*/*/*.log", "/var/log/pods/*/*/*.gz", "/var/log/pods/*/*/*.tmp"]
2022-09-22T09:21:13.953417Z WARN aws_smithy_client::builder: Retries require a `sleep_impl`, but none was passed into the builder. No retries will occur with the current configuration. If this was intentional, you can suppress this message with `Client::set_sleep_impl(None). Otherwise, unless you have a good reason to use the low-level service client API, consider using the `aws-config` crate to load a shared config from the environment, and construct a fluent client from that. If you need to use the low-level service client API, then pass in a sleep implementation to make timeouts and retry work.
√ Component configuration
2022-09-22T09:21:13.954030Z INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: provider in chain did not provide credentials provider=Environment context=environment variable not set
2022-09-22T09:21:13.954106Z INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: provider in chain did not provide credentials provider=Profile context=No profiles were defined
2022-09-22T09:21:13.954650Z INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}:send_operation{operation="AssumeRoleWithWebIdentity" service="sts"}: aws_http::auth: provider returned CredentialsNotLoaded, ignoring
2022-09-22T09:21:14.071679Z INFO send_operation{operation="DescribeLogGroups" service="cloudwatchlogs"}:provide_credentials{provider=default_chain}: aws_config::meta::credentials::chain: loaded credentials provider=WebIdentityToken
2022-09-22T09:21:14.163378Z ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=DescribeLogGroups failed: InvalidParameterException: 1 validation error detected: Value '{{ group_name }}' at 'logGroupNamePrefix' failed to satisfy constraint: Member must satisfy regular expression pattern: [\.\-_/#A-Za-z0-9]+ component_kind="sink" component_type="aws_cloudwatch_logs" component_id=cw component_name=cw
x Health check for "cw" failed
2022-09-22T09:21:14.163521Z INFO vector::topology::builder: Healthcheck: Passed.
√ Health check "prometheus_output"
sh-4.4#
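When repeating this check across several collector pods, it may help to pull the failing sink names out of the validate output automatically. The following is a rough sketch, not part of Vector or the logging operator: the helper name is hypothetical, and the `Health check for "<id>" failed` line format is assumed from this transcript alone, so it may differ in other Vector versions.

```python
import re
from typing import List

# Line format assumed from the transcript above; not a documented Vector contract.
FAILED_CHECK = re.compile(r'Health check for "([^"]+)" failed')


def failed_health_checks(validate_output: str) -> List[str]:
    """Return the component ids whose health check is reported as failed."""
    return FAILED_CHECK.findall(validate_output)


if __name__ == "__main__":
    # Two summary lines copied from the transcript above.
    sample = (
        'x Health check for "cw" failed\n'
        '√ Health check "prometheus_output"\n'
    )
    print(failed_health_checks(sample))
```

Passing health checks use a different wording in the transcript (no `for` and no `failed`), so they are not matched by this pattern.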
Additional Notes:
The healthcheck failure doesn't affect logs being sent to Cloudwatch.
- clones
-
LOG-3093 [Vector] Healthcheck fails when forwarding logs to Cloudwatch
- Closed