OpenShift Logging / LOG-2448

Audit and journald logs cannot be viewed from LokiStack when logs are forwarded with Vector as the collector.


    • OBSDA-7 - Adopting Loki as an alternative to Elasticsearch to support more lightweight, easier to manage/operate storage scenarios
    • VERIFIED
    • Before this update, an error in the gateway component that enforces tenancy when reading logs restricted access to logs that have a Kubernetes namespace, which made "audit" and some "infrastructure" logs unreadable. With this update, the proxy correctly detects users with admin access and allows them to read logs that have no namespace.
    • Logging (LogExp) - Sprint 218

      Description of problem:

      Audit and journald logs cannot be viewed from LokiStack when logs are forwarded with Vector as collector. 

       

      Version-Release number of selected component (if applicable):

      Server Version: 4.10.6

      Kubernetes Version: v1.23.5+b0357ed

      Clusterlogging.v5.5.0

      elasticsearch-operator.v5.5.0

      Loki-operator.v0.0.1

       

      How reproducible:

      Always

       

      Steps to Reproduce:

      1 Create CatalogSource with upstream index quay.io/logging/logging-index:latest

      2 Install Cluster Logging, Elasticsearch and LokiStack operators.

      git clone git@gitlab.cee.redhat.com:aosqe/aosqe-tools.git
      oc create -f aosqe-tools/logging/log_template/vector/logging_with_loki_install.yaml

      3 Create S3 secret and bucket for LokiStack.

      LOKI_BUCKET_NAME=$(whoami)-aosqe-logging-loki
      NAMESPACE="openshift-logging"
      SECRETNAME="s3-secret"
      REGION="us-east-2"
      ENDPOINT="https://s3.${REGION}.amazonaws.com"
      ACCESS_KEY_ID=$(oc get secret aws-creds -n kube-system -o json | jq -r '.data.aws_access_key_id'|base64 -d)
      SECRET_ACCESS_KEY=$(oc get secret  aws-creds -n kube-system -o json |jq -r '.data.aws_secret_access_key'|base64 -d)
      
      
      Create S3 bucket:
      aws s3api create-bucket --bucket $LOKI_BUCKET_NAME --region $REGION --create-bucket-configuration LocationConstraint=$REGION
      
      Create S3 secret:
      oc -n "${NAMESPACE}" create secret generic ${SECRETNAME} \
        --from-literal=endpoint="${ENDPOINT}" \
        --from-literal=region="${REGION}" \
        --from-literal=bucketnames="${LOKI_BUCKET_NAME}" \
        --from-literal=access_key_id="${ACCESS_KEY_ID}" \
        --from-literal=access_key_secret="${SECRET_ACCESS_KEY}"
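
If the `aws-creds` lookup above returns nothing, the variables stay empty and the secret is still created, only with blank values. A minimal sketch of a guard that could run before the `oc create secret` call; `require_vars` is a hypothetical helper and not part of `oc` or the test tooling:

```shell
#!/bin/sh
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is `require_vars ENDPOINT REGION LOKI_BUCKET_NAME ACCESS_KEY_ID SECRET_ACCESS_KEY || exit 1`, placed just before the secret is created.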

      4 Deploy LokiStack in the openshift-logging namespace, then check the LokiStack status once the stack is deployed.

      oc create -f aosqe-tools/logging/log_template/vector/lokistack_aws_simple.yaml
      
      oc get lokistacks.loki.grafana.com lokistack-instance -o yaml 

      5 Create the ClusterRole and ClusterRoleBinding which will allow the cluster to authenticate the user submitting the logs.

      oc create -f aosqe-tools/logging/log_template/vector/loki_role_bindings.yaml 

      6 Create a ClusterLogForwarder CR to forward logs to LokiStack.

      oc create -f aosqe-tools/logging/log_template/vector/clf_lokistack_gateway_http.yaml 

      7 Create a Cluster Logging instance.

      oc create -f aosqe-tools/logging/log_template/vector/cli_collector_only.yaml 

      8 Deploy the log generator app.

      oc new-project test
      oc new-app aosqe-tools/logging/log_gen/container_json_log_template.json 

      9 Extract and check the Vector config.

      oc extract secret/collector-config -n openshift-logging --confirm 

      10 Check logs in LokiStack

      bearer_token=$(oc whoami -t)
      
      lokistack_route=$(oc get route lokistack-instance -n openshift-logging -o json |jq '.spec.host' -r)
      
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/infrastructure" labels
      
      Check log gen app logs in test namespace:
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/application" query '{kubernetes_namespace_name="test"}'
      
      Check logs by log type:
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/infrastructure" query '{log_type="infrastructure"}'
      
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/application" query '{log_type="application"}'
      
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/audit" query '{log_type="audit"}' 
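
The output of these queries is dominated by `proto: duplicate proto type registered` notices, which makes an empty result easy to miss. A minimal sketch of a filter that reduces a query's output to a count of real log lines; `count_log_lines` is a hypothetical helper, and the host name in the demo is a placeholder:

```shell
#!/bin/sh
# Hypothetical helper (not part of logcli): count the lines on stdin that are
# actual log entries. It drops the protobuf registration notices, the request
# URL that logcli echoes, and blank lines.
count_log_lines() {
  grep -v \
    -e 'proto: duplicate proto type registered' \
    -e '^[[:space:]]*https\{0,1\}://' \
    -e '^[[:space:]]*$' \
  | wc -l | tr -d ' '
}

# Demo with two lines in the shape of the output shown in this report
# (placeholder host). Neither line is a log entry, so the count is 0.
printf '%s\n' \
  '2022-04-05 16:52:10.228461 I | proto: duplicate proto type registered: purgeplan.DeletePlan' \
  'http://lokistack.example/api/logs/v1/audit/loki/api/v1/query_range?limit=30' \
  | count_log_lines
```

Piping a query through it, for example appending `2>&1 | count_log_lines` to the audit query, gives one number per tenant to compare. The `2>&1` is there on the assumption that some of the notices are written to stderr. A log entry that itself starts with a URL would be miscounted by this filter.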

      Check that the audit and journald log queries return empty results while the other log types return the correct log data.

      $ logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/audit" query '{log_type="audit"}'
      2022-04-05 16:52:10.228461 I | proto: duplicate proto type registered: purgeplan.DeletePlan
      2022-04-05 16:52:10.228523 I | proto: duplicate proto type registered: purgeplan.ChunksGroup
      2022-04-05 16:52:10.228528 I | proto: duplicate proto type registered: purgeplan.ChunkDetails
      2022-04-05 16:52:10.228531 I | proto: duplicate proto type registered: purgeplan.Interval
      2022-04-05 16:52:10.237093 I | proto: duplicate proto type registered: grpc.PutChunksRequest
      2022-04-05 16:52:10.237105 I | proto: duplicate proto type registered: grpc.GetChunksRequest
      2022-04-05 16:52:10.237110 I | proto: duplicate proto type registered: grpc.GetChunksResponse
      2022-04-05 16:52:10.237113 I | proto: duplicate proto type registered: grpc.Chunk
      2022-04-05 16:52:10.237116 I | proto: duplicate proto type registered: grpc.ChunkID
      2022-04-05 16:52:10.237120 I | proto: duplicate proto type registered: grpc.DeleteTableRequest
      2022-04-05 16:52:10.237123 I | proto: duplicate proto type registered: grpc.DescribeTableRequest
      2022-04-05 16:52:10.237126 I | proto: duplicate proto type registered: grpc.WriteBatch
      2022-04-05 16:52:10.237129 I | proto: duplicate proto type registered: grpc.WriteIndexRequest
      2022-04-05 16:52:10.237132 I | proto: duplicate proto type registered: grpc.DeleteIndexRequest
      2022-04-05 16:52:10.237135 I | proto: duplicate proto type registered: grpc.QueryIndexResponse
      2022-04-05 16:52:10.237138 I | proto: duplicate proto type registered: grpc.Row
      2022-04-05 16:52:10.237141 I | proto: duplicate proto type registered: grpc.IndexEntry
      2022-04-05 16:52:10.237145 I | proto: duplicate proto type registered: grpc.QueryIndexRequest
      2022-04-05 16:52:10.237148 I | proto: duplicate proto type registered: grpc.UpdateTableRequest
      2022-04-05 16:52:10.237151 I | proto: duplicate proto type registered: grpc.DescribeTableResponse
      2022-04-05 16:52:10.237154 I | proto: duplicate proto type registered: grpc.CreateTableRequest
      2022-04-05 16:52:10.237157 I | proto: duplicate proto type registered: grpc.TableDesc
      2022-04-05 16:52:10.237160 I | proto: duplicate proto type registered: grpc.TableDesc.TagsEntry
      2022-04-05 16:52:10.237163 I | proto: duplicate proto type registered: grpc.ListTablesResponse
      2022-04-05 16:52:10.237166 I | proto: duplicate proto type registered: grpc.Labels
      2022-04-05 16:52:10.237234 I | proto: duplicate proto type registered: storage.Entry
      2022-04-05 16:52:10.237239 I | proto: duplicate proto type registered: storage.ReadBatch
      http://lokistack-instance-openshift-logging.apps.ikanse-10.qe.devcluster.openshift.com/api/logs/v1/audit/loki/api/v1/query_range?direction=BACKWARD&end=1649157730237909100&limit=30&query=%7Blog_type%3D%22audit%22%7D&start=1649154130237909100
       
      logcli -o raw --bearer-token="${bearer_token}" --addr="http://${lokistack_route}/api/logs/v1/infrastructure" query '{kubernetes_namespace_name!~"openshift-.*"}' 
      2022-04-07 12:09:33.255527 I | proto: duplicate proto type registered: purgeplan.DeletePlan
      2022-04-07 12:09:33.255579 I | proto: duplicate proto type registered: purgeplan.ChunksGroup
      2022-04-07 12:09:33.255584 I | proto: duplicate proto type registered: purgeplan.ChunkDetails
      2022-04-07 12:09:33.255589 I | proto: duplicate proto type registered: purgeplan.Interval
      2022-04-07 12:09:33.263946 I | proto: duplicate proto type registered: grpc.PutChunksRequest
      2022-04-07 12:09:33.263959 I | proto: duplicate proto type registered: grpc.GetChunksRequest
      2022-04-07 12:09:33.263964 I | proto: duplicate proto type registered: grpc.GetChunksResponse
      2022-04-07 12:09:33.263968 I | proto: duplicate proto type registered: grpc.Chunk
      2022-04-07 12:09:33.263972 I | proto: duplicate proto type registered: grpc.ChunkID
      2022-04-07 12:09:33.263976 I | proto: duplicate proto type registered: grpc.DeleteTableRequest
      2022-04-07 12:09:33.263979 I | proto: duplicate proto type registered: grpc.DescribeTableRequest
      2022-04-07 12:09:33.263983 I | proto: duplicate proto type registered: grpc.WriteBatch
      2022-04-07 12:09:33.263986 I | proto: duplicate proto type registered: grpc.WriteIndexRequest
      2022-04-07 12:09:33.263989 I | proto: duplicate proto type registered: grpc.DeleteIndexRequest
      2022-04-07 12:09:33.263992 I | proto: duplicate proto type registered: grpc.QueryIndexResponse
      2022-04-07 12:09:33.263995 I | proto: duplicate proto type registered: grpc.Row
      2022-04-07 12:09:33.263998 I | proto: duplicate proto type registered: grpc.IndexEntry
      2022-04-07 12:09:33.264001 I | proto: duplicate proto type registered: grpc.QueryIndexRequest
      2022-04-07 12:09:33.264004 I | proto: duplicate proto type registered: grpc.UpdateTableRequest
      2022-04-07 12:09:33.264007 I | proto: duplicate proto type registered: grpc.DescribeTableResponse
      2022-04-07 12:09:33.264010 I | proto: duplicate proto type registered: grpc.CreateTableRequest
      2022-04-07 12:09:33.264013 I | proto: duplicate proto type registered: grpc.TableDesc
      2022-04-07 12:09:33.264016 I | proto: duplicate proto type registered: grpc.TableDesc.TagsEntry
      2022-04-07 12:09:33.264020 I | proto: duplicate proto type registered: grpc.ListTablesResponse
      2022-04-07 12:09:33.264025 I | proto: duplicate proto type registered: grpc.Labels
      2022-04-07 12:09:33.264114 I | proto: duplicate proto type registered: storage.Entry
      2022-04-07 12:09:33.264118 I | proto: duplicate proto type registered: storage.ReadBatch
      http://lokistack-instance-openshift-logging.apps.ikanse-10.qe.devcluster.openshift.com/api/logs/v1/infrastructure/loki/api/v1/query_range?direction=BACKWARD&end=1649313573264813910&limit=30&query=%7Bkubernetes_namespace_name%21~%22openshift-.%2A%22%7D&start=1649309973264813910 
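
The request URL that logcli prints also shows which time range was searched: `start` and `end` are Unix timestamps in nanoseconds, and `limit=30` caps the number of entries. A small worked check, using the values copied from the audit query above, of how wide that window is:

```shell
#!/bin/sh
# Values copied from the query_range URL printed for the audit query.
start_ns=1649154130237909100
end_ns=1649157730237909100

# Nanoseconds to seconds (needs 64-bit shell arithmetic).
window_s=$(( (end_ns - start_ns) / 1000000000 ))
echo "query window: ${window_s}s"
```

The window comes out as 3600 seconds, so each query in this report covers only the last hour. An empty result therefore means no readable entries in that hour, not necessarily none at all. To rule out a timing effect, the query can be repeated with a wider range through logcli's `--since` and `--limit` options.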

      11 Check collector logs.

      $ oc logs collector-db428 -c collector | grep -iE 'error|audit'
      Apr 05 10:15:58.279  INFO vector::topology: Starting source. name="openshift_audit_logs"
      Apr 05 10:15:58.280  INFO vector::topology: Starting source. name="host_audit_logs"
      Apr 05 10:15:58.280  INFO vector::topology: Starting source. name="k8s_audit_logs"
      Apr 05 10:15:58.280  INFO vector::topology: Starting transform. name="send-audit-logs"
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=openshift_audit_logs component_type=file}: vector::sources::file: Starting file server. include=["/var/log/oauth-apiserver.audit.log"] exclude=[]
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=host_audit_logs component_type=file}: vector::sources::file: Starting file server. include=["/var/log/audit/audit.log"] exclude=[]
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}: vector::sources::file: Starting file server. include=["/var/log/kube-apiserver/audit.log"] exclude=[]
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=openshift_audit_logs component_type=file}:file_server: file_source::checkpointer: Loaded checkpoint data.
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=host_audit_logs component_type=file}:file_server: file_source::checkpointer: Loaded checkpoint data.
      Apr 05 10:15:58.281  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: file_source::checkpointer: Loaded checkpoint data.
      Apr 05 10:15:58.281  INFO vector::topology: Starting transform. name="audit"
      Apr 05 10:15:58.281  INFO vector::topology: Starting sink. name="loki_audit"
      Apr 05 10:15:58.282  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Resuming to watch file. path=/var/log/kube-apiserver/audit.log file_position=75305359
      Apr 05 10:15:58.282  INFO source{component_kind="source" component_name=host_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Found new file to watch. path=/var/log/audit/audit.log
      Apr 05 10:16:03.290 ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=A non-successful status returned: 404 Not Found component_kind="sink" component_type="loki" component_name="loki_app"
      Apr 05 10:16:59.774  WARN source{component_kind="source" component_name=raw_container_logs component_type=kubernetes_logs}:file_server: vector::internal_events::file::source: Currently ignoring file too small to fingerprint. path=/var/log/pods/openshift-oauth-apiserver_apiserver-8f56bffd4-n6xtj_935e753e-131d-49dc-81bc-e56e4bf91971/fix-audit-permissions/0.log
      Apr 05 10:16:59.811  WARN source{component_kind="source" component_name=raw_container_logs component_type=kubernetes_logs}:file_server: vector::internal_events::file::source: Currently ignoring file too small to fingerprint. path=/var/log/pods/openshift-apiserver_apiserver-d5d7db8c8-5l4cm_a7b90d47-435e-4978-ba8c-3c438cc5a368/fix-audit-permissions/0.log
      Apr 05 10:19:25.282  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Found new file to watch. path=/var/log/kube-apiserver/audit.log
      Apr 05 10:19:25.282  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Stopped watching file. path=/var/log/kube-apiserver/audit.log
      Apr 05 11:01:43.064  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Found new file to watch. path=/var/log/kube-apiserver/audit.log
      Apr 05 11:01:43.064  INFO source{component_kind="source" component_name=k8s_audit_logs component_type=file}:file_server: vector::internal_events::file::source: Stopped watching file. path=/var/log/kube-apiserver/audit.log
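
To see which collector components actually reported a failure, the ERROR lines can be reduced to their `component_name` field. A minimal sketch, assuming the log format shown above; `failed_components` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: print the component_name of every ERROR line on stdin.
# Handles both quoted (component_name="x") and unquoted (component_name=x) forms.
failed_components() {
  grep ' ERROR ' \
  | sed -n 's/.*component_name="\{0,1\}\([^" ]*\)"\{0,1\}.*/\1/p'
}

# Demo with the ERROR line from the excerpt above.
printf '%s\n' \
  'Apr 05 10:16:03.290 ERROR vector::topology::builder: msg="Healthcheck: Failed Reason." error=A non-successful status returned: 404 Not Found component_kind="sink" component_type="loki" component_name="loki_app"' \
  | failed_components
```

For a live pod the same filter can be applied as `oc logs collector-db428 -c collector | failed_components | sort -u`. In the excerpt above the only ERROR is the healthcheck failure for the `loki_app` sink. The `loki_audit` sink starts without a logged error, so the collector log alone does not explain the missing audit logs.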
       

      12 Delete the LokiStack bucket, instance and resources. 

      oc delete clusterloggings.logging.openshift.io instance -n "${NAMESPACE}"
      oc delete clusterlogforwarders.logging.openshift.io instance -n "${NAMESPACE}"
      oc delete secrets "${SECRETNAME}" -n "${NAMESPACE}"
      oc delete lokistacks.loki.grafana.com lokistack-instance -n "${NAMESPACE}"
      oc delete clusterrole lokistack-instance-tenant-logs
      oc delete clusterrolebindings lokistack-instance-tenant-logs
      aws s3 rb s3://$LOKI_BUCKET_NAME --region $REGION  --force 

      Additional details:

      The generated vector.toml is attached.

       

       

       

       

              rojacob@redhat.com Robert Jacob
              rhn-support-ikanse Ishwar Kanse
              Ishwar Kanse Ishwar Kanse