- Bug
- Resolution: Won't Do
- Critical
- Logging 5.7.4
- NEW
- Bug Fix
- Proposed
- Log Collection - Sprint 240, Log Collection - Sprint 241
- Important
- Customer Escalated
Description of problem:
Disk usage was consistently filling up, following one application pod around the environment. du did not show the usage, but lsof showed a large number of deleted files still being held open by fluentd:
fluentd 2712145 root 8r REG 8,4 828860702 480248013 /var/log/pods/example-5bbcf9c88-7gq5k_09859cdc-ed2d-4c42-9e1b-149322ce8cd8/istio-proxy/0.log (deleted)
This bug is similar to the issue described for vector in LOG-3949.
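For context, the underlying mechanism is standard Linux behavior and can be reproduced with a short shell session (a minimal sketch, not taken from this report; the file name and descriptor number are arbitrary): the space of an unlinked file is only returned once the last process holding it open closes its descriptor, which is why du misses the usage while lsof still reports it.
$ exec 9>/tmp/held.log                             # keep fd 9 open on the file
$ dd if=/dev/zero bs=1M count=100 >&9 2>/dev/null  # write 100 MiB through fd 9
$ rm /tmp/held.log                                 # unlink it; du no longer counts it
$ df -h /tmp                                       # the 100 MiB is still allocated
$ exec 9>&-                                        # close fd 9; the space is released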
Version-Release number of selected component (if applicable):
Logging 5.6.6
How reproducible:
Not able to reproduce on demand. It happens when pod log files are deleted but fluentd keeps the files open, so the space is not released. It can be observed on an impacted node:
$ oc debug node/<node>
# chroot /host
# toolbox
# dnf install -y lsof
# lsof | grep -i deleted > lsof.out
# grep -c fluentd lsof.out
11655
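To gauge how much space those handles are pinning, the SIZE column of lsof can be summed for fluentd (a sketch, assuming the same toolbox with lsof installed as above; +L1 restricts lsof to open files whose link count is below one, i.e. unlinked, and the size is in column 7 on typical builds):
# lsof +L1 2>/dev/null | awk '/fluentd/ {sum += $7} END {printf "%.1f GiB held in deleted files by fluentd\n", sum/1024/1024/1024}'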
Actual results:
Checking the node shows many deleted but unreleased files: pod log files that have been removed but are still held open by the fluentd collector, which never releases them, so their disk space is not freed.
Expected results:
Fluentd should release deleted files.
Workaround:
Restart the Logging collector pods to release the deleted files:
$ oc delete pods -l component=collector -n openshift-logging
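A follow-up check of the same kind (hypothetical, not part of the original report) can confirm the restart worked: the count of deleted files held by fluentd should drop to zero and the filesystem usage should fall accordingly.
# lsof +L1 | grep -c fluentd    # expect 0 once the new collector pods are running
# df -h /var/log                # usage should drop once the handles are released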
- clones LOG-4241 Fluentd not releasing deleted file handles (Closed)
- mentioned on