OpenShift Logging / LOG-4241

Fluentd not releasing deleted file handles

    • False
    • None
    • False
    • NEW
    • NEW
    • Prior to this change, fluentd was reported not to release rotated, deleted container log files when the cluster was under load. This change updates fluentd to v1.16.2 and Ruby 3.1, which has demonstrated improvements but does not conclusively resolve the issue.
    • Bug Fix
    • Proposed
    • Log Collection - Sprint 238, Log Collection - Sprint 240, Log Collection - Sprint 241, Log Collection - Sprint 242
    • Important
    • Customer Escalated

      Description of problem:

      Disk usage kept growing until the disk filled up, and the problem followed one application pod around the environment. du did not account for the used space, but lsof showed that a large number of deleted files were still being held open by fluentd:

      fluentd   2712145                               root    8r      REG                8,4 828860702         480248013 /var/log/pods/example-5bbcf9c88-7gq5k_09859cdc-ed2d-4c42-9e1b-149322ce8cd8/istio-proxy/0.log (deleted)
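
The gap between du and lsof can be reproduced in isolation. The following is a minimal sketch, assuming a Linux host with /proc mounted; the temporary directory, the file name 0.log and the use of file descriptor 9 are illustrative choices and are not taken from the report. It deletes a file while a read handle is still open and shows that the kernel keeps the data reachable through that handle:

```shell
#!/bin/sh
# Illustration only: a deleted file stays allocated while a handle is open.
tmpdir=$(mktemp -d)

# Create a 1 MiB stand-in for a container log file.
dd if=/dev/zero of="$tmpdir/0.log" bs=1024 count=1024 2>/dev/null

# Hold a read handle on fd 9, then delete the file (as log rotation would).
exec 9< "$tmpdir/0.log"
rm "$tmpdir/0.log"

# du walks the directory tree, so it no longer sees the file.
du_kb=$(du -sk "$tmpdir" | cut -f1)

# The open handle still points at the file; the kernel marks it "(deleted)".
link_target=$(readlink /proc/self/fd/9)

# The data is still readable through the handle, so the blocks are not freed.
held_bytes=$(wc -c <&9 | tr -d ' ')

echo "du reports (kB):  $du_kb"
echo "fd 9 points to:   $link_target"
echo "bytes still held: $held_bytes"

# Closing the handle is what finally releases the space.
exec 9<&-
rmdir "$tmpdir"
```

The space is returned to the filesystem only when the last handle is closed, which is consistent with the workaround of restarting the collector pods.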
      

      This bug is similar to the issue described for Vector in LOG-3949.

      Version-Release number of selected component (if applicable):

      Logging 5.6.6

      How reproducible:

      Not able to reproduce. It happens when pod log files are deleted while fluentd still has them open, so the space is not released. It can be observed on an impacted node:

      $ oc debug node/<node>
      # chroot /host
      # toolbox 
      # dnf install -y lsof
      # lsof | grep -i deleted > lsof.out
      # grep -c fluentd lsof.out
      11655
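
Beyond the raw count, it can help to see how much space each process is holding. The following is a rough sketch, assuming lsof.out uses the nine-column layout of the sample line in the description (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME); if lsof adds a TID column for threads, the field numbers would need adjusting. The function name summarize_deleted is hypothetical:

```shell
#!/bin/sh
# Sketch: per-command count and total size of deleted regular files
# that are still open. Reads lsof output from the given files or stdin.
summarize_deleted() {
    awk '
        # Keep regular files whose NAME ends with the "(deleted)" marker.
        $NF == "(deleted)" && $5 == "REG" {
            files[$1]++
            bytes[$1] += $7      # column 7 is SIZE/OFF in this layout
        }
        END {
            for (cmd in files)
                printf "%s %d %.0f\n", cmd, files[cmd], bytes[cmd]
        }
    ' "$@"
}
```

For example, `summarize_deleted lsof.out` prints one line per command with the number of deleted files it holds and their combined size in bytes.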

      Actual results:

      Listing the files that are deleted in the OS but not yet released shows that a large number of them are held by the collector pods: the log files have been deleted, but fluentd keeps them open and never releases them.

      Expected results:

      Fluentd should release deleted files.

      Workaround

      Restart the logging collector pods to release the deleted files:

      $ oc delete pods -l component=collector -n openshift-logging 
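
When only one node is affected, the restart can be limited to the collector pod on that node. The following is a sketch, not a tested procedure: it assumes the collector pods are recreated automatically after deletion, as in the cluster-wide workaround, and the function name restart_collector_on_node is hypothetical. The --field-selector option is the standard pod selector on spec.nodeName:

```shell
#!/bin/sh
# Sketch: delete only the collector pod scheduled on one node.
# Usage: restart_collector_on_node <node>
restart_collector_on_node() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: restart_collector_on_node <node>" >&2
        return 1
    fi
    oc delete pods -n openshift-logging -l component=collector \
        --field-selector "spec.nodeName=${node}"
}
```

This keeps log collection running on the unaffected nodes while the handles on the impacted node are released.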

        1. 1stQuestion.txt (3 kB)
        2. 2ndQuestion.txt (3 kB)
        3. container_file_desc_logging5.7_max_3.88k.png (188 kB)
        4. container_file_desc_logging5.8_max_3.2.png (77 kB)
        5. image-2023-08-03-14-27-22-710.png (91 kB)
        6. Log 5.6.9 fluentd file handles.png (87 kB)
        7. Logging5.8_in_11h.png (33 kB)
        8. Logging5.8_in_11h-1.png (33 kB)
        9. question_1A.PNG (59 kB)
        10. v1.16.2.png (68 kB)

            jcantril@redhat.com Jeffrey Cantrill
            rhn-support-ocasalsa Oscar Casal Sanchez
            Anping Li
            Votes: 2
            Watchers: 15