  1. OpenShift Logging
  2. LOG-6754

Vector collector not allowing files to delete

    • Type: Bug
    • Resolution: Duplicate
    • NEW
    • NEW
    • Bug Fix

      Description of problem:

      On a ROSA 4.14.20 cluster running cluster-logging v5.9.5, SRE observed disk pressure on master nodes because the on-node vector process was not releasing files that had been deleted.

      The issue surfaced as a disk pressure taint on one master node, with a second master node reporting nearly the same utilization (slightly under the diskPressure threshold). When we accessed the node to determine which component or directory was consuming the most space, df -h showed the root disk as nearly full, while du -sh reported substantially less utilization across all directories.

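      The discrepancy arises because du walks the directory tree and therefore cannot count files that have been deleted on disk but are still held open by a process, whereas df reports the space the filesystem still has allocated, which includes those files. Running lsof | grep '(deleted)' lists these files along with the process keeping each one open.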
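
      The mechanism described above can be reproduced in isolation. The following is a minimal sketch, not taken from the report: the function name hold_deleted_file and the temporary file are illustrative assumptions. It deletes a file while a descriptor is still open and shows that the inode keeps its size even though no path refers to it any more.

```python
import os
import tempfile


def hold_deleted_file(size_bytes=1024 * 1024):
    """Unlink a file while it is still open and report what the kernel still tracks.

    Hypothetical helper for illustration only.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"\0" * size_bytes)
        handle.flush()
        os.unlink(path)                    # directory entry removed: a tree walk (du) no longer sees it
        info = os.fstat(handle.fileno())   # the inode is still reachable through the open descriptor
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,   # no directory entry references the inode any more
            "size_bytes": info.st_size,    # the data is still there until the descriptor is closed
        }
    # Leaving the with-block closes the descriptor; only then can the space be reclaimed.


if __name__ == "__main__":
    print(hold_deleted_file())
```

      While the descriptor is open, the filesystem cannot release the blocks, which is why df and du disagree in the way described above.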

      On both nodes, a process called vector was keeping /var/log/kube-apiserver/audit*.log files open long after the files had been deleted on disk. Deleting the corresponding collector pod closed the files and resolved the disk pressure.
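
      As a rough alternative to lsof | grep '(deleted)', the same information can be read from /proc on a Linux node. This is a sketch under assumptions: the function name deleted_open_files is hypothetical, it relies on the kernel appending " (deleted)" to the link target of an unlinked file, and it only sees processes whose fd directories the caller is permitted to read (typically root on the node).

```python
import os
from collections import defaultdict

DELETED_SUFFIX = " (deleted)"


def deleted_open_files(proc_root="/proc"):
    """Return {pid: [(path, size_bytes), ...]} for open files whose directory entry is gone.

    Hypothetical helper for illustration; Linux only.
    """
    found = defaultdict(list)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or access not permitted
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
                if not target.endswith(DELETED_SUFFIX) or target.startswith("/memfd:"):
                    continue  # still linked, or an in-memory file that uses no disk
                size = os.stat(link).st_size  # follows the link to the still-open inode
            except OSError:
                continue  # descriptor closed while scanning
            found[int(entry)].append((target[: -len(DELETED_SUFFIX)], size))
    return dict(found)


if __name__ == "__main__":
    for pid, files in sorted(deleted_open_files().items()):
        total = sum(size for _, size in files)
        print(f"pid {pid}: {len(files)} deleted file(s), {total} bytes still held")
```

      Summing the sizes per process gives an estimate of how much space would be released if that process closed its descriptors or was restarted, which is what deleting the collector pod achieved here.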

      Version-Release number of selected component (if applicable):

      cluster logging v5.9.5

      How reproducible:

      Unknown, potentially always

      Steps to Reproduce:

      1. ...

      Actual results:

      Node disk utilization was unexpectedly high, causing a disk-pressure taint to be applied to the node and disrupting cluster operation.

      Expected results:

      Disk utilization does not increase substantially as a result of deploying cluster-logging.

      Additional info:

              rhn-support-ocasalsa Oscar Casal Sanchez
              tnierman.openshift Trevor Nierman