OpenShift Logging / LOG-4241

Fluentd not releasing deleted file handles


    • Prior to this change, fluentd was reported to not release rotated, deleted container log files when the cluster was under load. This change updates fluentd to v1.16.2 and Ruby 3.1, which have shown improvement but do not conclusively resolve the issue.
    • Bug Fix
    • Proposed
    • Log Collection - Sprint 238, Log Collection - Sprint 240, Log Collection - Sprint 241, Log Collection - Sprint 242
    • Important
    • Customer Escalated

      Description of problem:

      The disk was consistently filling up, and the problem followed one application pod around the environment. du did not account for the used space, but lsof showed that a large number of deleted files were still being held open by fluentd:

      fluentd   2712145                               root    8r      REG                8,4 828860702         480248013 /var/log/pods/example-5bbcf9c88-7gq5k_09859cdc-ed2d-4c42-9e1b-149322ce8cd8/istio-proxy/0.log (deleted)
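      Because a deleted file keeps consuming disk space until every process closes it, it can help to estimate how much space one process is still holding. The following is a minimal sketch, not part of the original report: the function name deleted_bytes_held is hypothetical, and it assumes saved lsof output in the nine-column layout of the line above (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME). If lsof prints extra thread columns or reports offsets rather than sizes, the field positions or values would need adjusting.

```shell
# Sum the SIZE/OFF column of "(deleted)" regular files in saved lsof output,
# counting each device/inode pair once so repeated descriptors are not
# double-counted. Assumes the 9-column layout:
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
deleted_bytes_held() {
  # $1: file containing lsof output; $2: command name to match (e.g. fluentd)
  awk -v cmd="$2" '
    $1 == cmd && $5 == "REG" && $NF == "(deleted)" {
      key = $6 ":" $8              # device and inode identify the file
      if (!(key in seen)) { seen[key] = 1; total += $7 }
    }
    END { printf "%.0f\n", total + 0 }
  ' "$1"
}
```

      For example, `deleted_bytes_held lsof.out fluentd` prints the total in bytes for the fluentd entries in lsof.out.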
      

      This bug is similar to the issue described for Vector in LOG-3949.

      Version-Release number of selected component (if applicable):

      Logging 5.6.6

      How reproducible:

      Not able to reproduce on demand. The problem occurs when pod log files are deleted while fluentd still has them open, so the disk space is not released. It can be observed on an impacted node:

      $ oc debug node/<node>
      # chroot /host
      # toolbox 
      # dnf install -y lsof
      # lsof | grep -i deleted > lsof.out
      # grep -c fluentd lsof.out
      11655
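      To see at a glance which processes hold the deleted files, the saved output can be grouped by command name. This is a small sketch under the assumption that lsof.out contains standard lsof lines whose first column is the command and whose last field is "(deleted)"; the function name deleted_count_by_command is hypothetical.

```shell
# Count "(deleted)" entries per command in saved lsof output and list the
# commands with the most entries first.
deleted_count_by_command() {
  # $1: file containing lsof output
  awk '$NF == "(deleted)" { n[$1]++ }
       END { for (c in n) print n[c], c }' "$1" |
    sort -k1,1nr -k2,2
}
```

      Running `deleted_count_by_command lsof.out` prints one "count command" line per process name, which makes it easier to confirm that fluentd accounts for most of the entries.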

      Actual results:

      When checking for files that are deleted in the OS but not released, a large number are log files that the collector (fluentd) continues to hold open without releasing them.

      Expected results:

      Fluentd should release deleted files.

      Workaround

      Restart the logging collector pods to release the deleted files:

      $ oc delete pods -l component=collector -n openshift-logging 

        1. 1stQuestion.txt
          3 kB
          Nikita Jain
        2. 2ndQuestion.txt
          3 kB
          Nikita Jain
        3. container_file_desc_logging5.7_max_3.88k.png
          188 kB
          Anping Li
        4. container_file_desc_logging5.8_max_3.2.png
          77 kB
          Anping Li
        5. image-2023-08-03-14-27-22-710.png
          91 kB
          Jeffrey Cantrill
        6. Log 5.6.9 fluentd file handles.png
          87 kB
          Jeffrey Cantrill
        7. Logging5.8_in_11h.png
          33 kB
          Anping Li
        8. Logging5.8_in_11h-1.png
          33 kB
          Anping Li
        9. question_1A.PNG
          59 kB
          Nikita Jain
        10. v1.16.2.png
          68 kB
          Jeffrey Cantrill

              jcantril@redhat.com Jeffrey Cantrill
              rhn-support-ocasalsa Oscar Casal Sanchez
              Anping Li Anping Li
              Votes: 2
              Watchers: 15
