- Bug
- Resolution: Done
- Blocker
- Logging 5.8.0
Description of problem:
Setting outputs[].limit.maxRecordsPerSecond on an output and then monitoring the doc count of application logs in the log store shows that the total ingestion rate always equals $count-of-worker-nodes * maxRecordsPerSecond rather than maxRecordsPerSecond (see the sketch after the CLF below).
CLF:
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - elasticsearch:
      version: 6
    limit:
      maxRecordsPerSecond: 10
    name: es-created-by-user
    type: elasticsearch
    url: http://elasticsearch-server-e2e-test-vector-es-namespace-glchn.apps.test.com:80
  pipelines:
  - inputRefs:
    - application
    labels:
      logging-labels: test-labels
    name: forward-to-external-es
    outputRefs:
    - es-created-by-user
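The collector runs as a daemonset, so each worker node hosts its own Vector instance. Assuming the operator maps maxRecordsPerSecond onto Vector's built-in throttle transform, the generated per-node config would look roughly like the following sketch (YAML form; the component names are hypothetical). Each of the three collectors would then enforce the 10 records/second limit independently, which is consistent with the numbers below.

transforms:
  # hypothetical component name; one throttle per rate-limited output
  throttle_es_created_by_user:
    type: throttle                        # Vector's built-in rate-limiting transform
    inputs:
      - pipeline_forward_to_external_es   # hypothetical upstream component
    threshold: 10                         # maxRecordsPerSecond from the CLF
    window_secs: 1                        # evaluate the threshold per second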
Doc count in application logs:
Sampled once per minute (CST, Sun Oct 8 2023):

Time      health status index     uuid                   pri rep docs.count docs.deleted store.size pri.store.size
15:01:28  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   20718      0            9.5mb      9.5mb
15:02:29  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   22529      0            11.6mb     11.6mb
15:03:29  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   24226      0            13.8mb     13.8mb
15:04:30  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   25880      0            13.4mb     13.4mb
15:05:31  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   27572      0            13.1mb     13.1mb
15:06:31  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   29211      0            15.4mb     15.4mb
15:07:32  yellow open   app-write 0oppTrZtRLa_i1kK-WY31g 5   1   30908      0            14.7mb     14.7mb
The doc count increases by about 1800 per minute, but the expected value is 600 (10 records/second × 60 seconds). 1800 is exactly 3 × 600, which matches the three worker nodes.
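For reference, samples like the ones above can be collected with a simple polling loop against the Elasticsearch _cat/indices API (a minimal sketch using the endpoint from the CLF; not necessarily how the data above was gathered):

ES_URL=http://elasticsearch-server-e2e-test-vector-es-namespace-glchn.apps.test.com:80
while true; do
  date                                          # timestamp each sample
  curl -s "$ES_URL/_cat/indices/app-write?v"    # doc count for the app-write index
  sleep 60                                      # one sample per minute
done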
In my cluster there are 3 worker nodes, and on each worker node there are pods generating application logs:
$ oc get node
NAME                                                  STATUS   ROLES                  AGE     VERSION
qitang-vcmdt-master-0.c.openshift-qe.internal         Ready    control-plane,master   7h6m    v1.27.6+fd4d1f9
qitang-vcmdt-master-1.c.openshift-qe.internal         Ready    control-plane,master   7h5m    v1.27.6+fd4d1f9
qitang-vcmdt-master-2.c.openshift-qe.internal         Ready    control-plane,master   7h5m    v1.27.6+fd4d1f9
qitang-vcmdt-worker-a-8svps.c.openshift-qe.internal   Ready    worker                 6h55m   v1.27.6+fd4d1f9
qitang-vcmdt-worker-b-2zk5m.c.openshift-qe.internal   Ready    worker                 6h55m   v1.27.6+fd4d1f9
qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   Ready    worker                 6h55m   v1.27.6+fd4d1f9

$ oc get pod -A -l run=centos-logtest -owide
NAMESPACE                            NAME                           READY   STATUS    RESTARTS   AGE    IP             NODE                                                  NOMINATED NODE   READINESS GATES
e2e-test-vector-es-namespace-gkvrc   logging-centos-logtest-nmxxc   1/1     Running   0          25m    10.131.0.215   qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   <none>           <none>
test-1                               json-log-gghks                 1/1     Running   0          20m    10.129.2.155   qitang-vcmdt-worker-a-8svps.c.openshift-qe.internal   <none>           <none>
test-1                               json-log-n8mf9                 1/1     Running   0          20m    10.128.2.195   qitang-vcmdt-worker-b-2zk5m.c.openshift-qe.internal   <none>           <none>
test-1                               json-log-qsprd                 1/1     Running   0          21m    10.131.0.220   qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   <none>           <none>
test-2                               json-log-4sh4w                 1/1     Running   0          104m   10.131.0.169   qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   <none>           <none>
test-3                               json-log-pf7d5                 1/1     Running   0          104m   10.128.2.173   qitang-vcmdt-worker-b-2zk5m.c.openshift-qe.internal   <none>           <none>
test                                 json-log-1-m5n7n               1/1     Running   0          104m   10.131.0.171   qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   <none>           <none>
test                                 json-log-2-hfqw7               1/1     Running   0          104m   10.128.2.174   qitang-vcmdt-worker-b-2zk5m.c.openshift-qe.internal   <none>           <none>
test                                 json-log-3-w4w7t               1/1     Running   0          104m   10.131.0.172   qitang-vcmdt-worker-c-24mzf.c.openshift-qe.internal   <none>           <none>
Version-Release number of selected component (if applicable):
openshift-logging/cluster-logging-rhel9-operator/images/v5.8.0-177
openshift-logging/vector-rhel9/images/v0.28.1-30
How reproducible:
Always
Steps to Reproduce:
- Deploy some pods to generate logs (a minimal stand-in workload is sketched after this list)
- Create a CLF from the above YAML
- Monitor the doc count in the log store, e.g. with the polling loop shown earlier
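For the first step, any workload that writes steadily to stdout will do; the pods used in this report were centos-logtest/json-log pods. A minimal stand-in (hypothetical name, generic busybox image) could be:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-generator            # hypothetical name
  namespace: test-1
spec:
  replicas: 3
  selector:
    matchLabels:
      run: centos-logtest        # reuse the label queried above
  template:
    metadata:
      labels:
        run: centos-logtest
    spec:
      containers:
      - name: logger
        image: busybox
        # roughly 100 records/second per pod, well above the 10/s limit
        args: ["sh", "-c", "while true; do for i in $(seq 100); do echo \"test log $i $(date)\"; done; sleep 1; done"]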
Actual results:
The outputs[].limit.maxRecordsPerSecond limit is always exceeded: the actual rate is $count-of-worker-nodes * maxRecordsPerSecond, i.e. the limit appears to be enforced per collector pod rather than per output.
Expected results:
The outputs[].limit.maxRecordsPerSecond limit shouldn't be exceeded: the total rate across all collector pods should stay at or below maxRecordsPerSecond.