-
Bug
-
Resolution: Won't Do
-
Normal
-
None
-
Logging 5.5.5
-
False
-
None
-
False
-
NEW
-
NEW
-
-
-
Log Collection - Sprint 229
-
Moderate
Description of problem:
As a consequence of bug LOG-3293, the `log-file-metric-exporter` process, which runs as a container in the collector pod, was able to use all the memory of the OCP nodes.
Even though the missing limits were not the root cause of that bug, the absence of limits allowed the process to consume most of the memory of the nodes until none was left, impacting the whole node and causing other processes to be OOM-killed.
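As a side note, per-container memory consumption of the collector pods can be observed through the metrics API (assuming cluster monitoring/metrics are available), for example:
$ oc adm top pods -n openshift-logging --containers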
It can be seen that the `log-file-metric-exporter` container has no limits set:
$ oc get ds collector -o yaml -n openshift-logging
...
      - args:
        - -c
        - /usr/local/bin/log-file-metric-exporter -verbosity=2 -dir=/var/log/pods
          -http=:2112 -keyFile=/etc/collector/metrics/tls.key -crtFile=/etc/collector/metrics/tls.crt
        command:
        - /bin/bash
        image: registry.redhat.io/openshift-logging/log-file-metric-exporter-rhel8@sha256:bdf76fe782b47e938aba3258baac246ab81d2ece762b83ccf11a2e92a5f2746c
        imagePullPolicy: IfNotPresent
        name: logfilesmetricexporter
        ports:
        - containerPort: 2112
          name: logfile-metrics
          protocol: TCP
        resources: {}   <---- no limits set
Version-Release number of selected component (if applicable):
Applicable RHOL versions
Actual results:
No limits defined for `log-file-metric-exporter`
Expected results:
Limits defined for `log-file-metric-exporter`
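For illustration only, a fix could set resource requests and limits on the container in the collector DaemonSet; the values below are hypothetical placeholders, not the defaults chosen by the operator:
        name: logfilesmetricexporter
        resources:
          limits:
            memory: 128Mi
          requests:
            cpu: 100m
            memory: 128Mi
With a memory limit in place, the container is OOM-killed and restarted by the kubelet when it exceeds the limit, instead of exhausting the memory of the whole node.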