OpenShift Logging / LOG-3949

Vector not releasing deleted file handles

    • False
    • None
    • False
    • NEW
    • ASSIGNED
    • Before this change, the collector relied on the default configuration setting when reading container log lines. As a result, on high-volume clusters the collector did not read rotated files efficiently and held on to deleted file handles for a long time. This change increases the number of bytes read, allowing the collector to process rotated files more efficiently.
    • Bug Fix
    • Proposed
    • Log Collection - Sprint 235, Log Collection - Sprint 238, Log Collection - Sprint 239, Log Collection - Sprint 240, Log Collection - Sprint 241, Log Collection - Sprint 242, Log Collection - Sprint 243
    • Critical
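
      The release note above ties the long-lived handles to how many bytes the collector reads at a time. The following is a minimal sketch of that arithmetic, not the collector's actual implementation. It assumes a reader that consumes at most a fixed number of bytes from a file on each pass and releases a rotated file only once it is fully read. The function name and both read limits are hypothetical illustration values. The file size is taken from the lsof line in the description and is treated as entirely unread, which is also an assumption.

```python
def passes_to_drain(unread_bytes: int, bytes_per_pass: int) -> int:
    """Number of read passes needed before a file is fully consumed."""
    if bytes_per_pass <= 0:
        raise ValueError("bytes_per_pass must be positive")
    # Integer ceiling division: a partial final chunk still costs one pass.
    return (unread_bytes + bytes_per_pass - 1) // bytes_per_pass


# Size of the deleted file in the lsof line from this issue, treated
# (as an assumption) as entirely unread backlog.
ROTATED_FILE_BYTES = 105_040_954

# Hypothetical per-pass read limits, chosen only for illustration.
SMALL_LIMIT = 2_048
LARGE_LIMIT = 3 * 1024 * 1024

small_passes = passes_to_drain(ROTATED_FILE_BYTES, SMALL_LIMIT)
large_passes = passes_to_drain(ROTATED_FILE_BYTES, LARGE_LIMIT)
print(small_passes, large_passes)
```

      Under these assumptions the small limit needs tens of thousands of passes to reach the end of the rotated file, while the large limit needs a few dozen. If other files are serviced between passes, each extra pass lengthens the time the deleted file stays open.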

      Description of problem:

      Disk space was consistently filling up on whichever node was running one particular application pod. du did not account for the usage, but lsof showed that Vector was still holding a large number of deleted files open:

      vector 3430171 root 163r REG 8,4 105040954 1040189142 /var/log/pods/example-dev_example-cmd-linux-2_a9a87c45-ecad-49af-bdb7-3877273e5b95/example-cmd-linux-pod/0.log.20230403-205041 (deleted)
      

      Deleting the collector pod (or killing the vector process) releases the files and they fully delete, clearing the space.
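
      The lsof check described above can be approximated without lsof by reading /proc directly. This is a rough sketch that assumes a Linux host and enough privileges to inspect the target process. The function name is hypothetical and is not part of Vector or of any OpenShift tooling.

```python
import os
import stat
from typing import List, Tuple

DELETED_SUFFIX = " (deleted)"


def find_deleted_open_files(pid: int) -> List[Tuple[int, str, int]]:
    """Return (fd, original_path, size_bytes) for deleted regular files
    that process `pid` still holds open, largest first."""
    fd_dir = f"/proc/{pid}/fd"
    results = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)
            if not target.endswith(DELETED_SUFFIX):
                continue
            # stat() follows the /proc link to the still-open inode,
            # so the size is available even though the path is gone.
            st = os.stat(link)
        except OSError:
            # The descriptor was closed while we were scanning.
            continue
        if not stat.S_ISREG(st.st_mode):
            continue
        path = target[: -len(DELETED_SUFFIX)]
        results.append((int(name), path, st.st_size))
    return sorted(results, key=lambda item: item[2], reverse=True)
```

      Summing the sizes returned for the collector's process ID gives an estimate of the space that du cannot see, because du walks directory entries and these files no longer have one.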

      Version-Release number of selected component (if applicable):

      cluster-logging.5.5.4

      How reproducible:

      Not reproduced so far. The application that caused the issue is no longer running, so data cannot currently be gathered from the original cluster while the issue is active.

      Expected results:

      Vector should release deleted files.

      Additional info:

        1. deleted_fds_oldest_first.png (45 kB, Sergey Yedrikov)
        2. deleted_fds_rotate_wait.png (101 kB, Sergey Yedrikov)
        3. file-handle comparison.png (34 kB, Jeffrey Cantrill)
        4. Logging5.7.5Metrics.png (60 kB, Anping Li)
        5. screenshot-1.png (62 kB, Anping Li)
        6. vector_file_deleted_given_up_individual.png (164 kB, Sergey Yedrikov)
        7. vector_file_deleted_given_up_total_sum.png (87 kB, Sergey Yedrikov)
        8. vector_open_files_5.8.0.png (58 kB, Anping Li)
        9. vector_open_files_record1.png (63 kB, Anping Li)
        10. vector_openfile_0926.png (54 kB, Anping Li)

              syedriko_sub@redhat.com Sergey Yedrikov
              rhn-support-stwalter Steven Walter
              Anping Li Anping Li
              Votes: 1
              Watchers: 13
