Network Observability / NETOBSERV-2492

Loki compactor does not release disk space after compaction due to open file descriptors (deleted files remain in use)

    • Severity: Important

      Description of problem:

      The Loki compactor in the openshift-network-observability namespace fails to reclaim disk space after compaction and retention operations.
      Although compaction completes, deleted temporary files under /tmp/loki remain held open by the Loki process.
      This prevents the OS from freeing the space and results in persistent 98% disk usage until the pod is restarted. This behavior matches upstream Grafana Loki issue:
      https://github.com/grafana/loki/issues/19514

      Version-Release number of selected component (if applicable):

      Loki operator 6.2.6
      Network-observability 1.9.2
      

      How reproducible:

          

      Steps to Reproduce:

      1. Deploy LokiStack with the openshift-network-observability operator and the default 10 Gi compactor PVC (/tmp/loki).
      2. Let ingestion and compaction run for several days.
      3. Observe PVC utilization:
         df -h /tmp/loki   <--- shows near-full usage.
      4. Check the actual files on disk:
         du -sch /tmp/loki/*   <--- shows only a few kilobytes.
      5. List deleted files that are still held open:
         lsof -p <compactor PID> | grep DEL
      6. Delete/restart the compactor pod → disk space is released temporarily.
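
      The check in step 5 can also be done without lsof (which may be absent from the pod image) by reading /proc directly; on Linux, descriptors held open on unlinked files show a "(deleted)" suffix. A minimal sketch, assuming a Linux node and a known compactor PID (the pgrep pattern below is an assumption, adjust it to your pod):

      ~~~
      #!/bin/sh
      # List file descriptors a process holds open on already-deleted files.
      # These unlinked-but-open files are exactly what keeps the blocks
      # allocated under /tmp/loki despite du showing almost nothing.
      deleted_open_files() {
          pid="$1"
          ls -l "/proc/$pid/fd" 2>/dev/null | grep '(deleted)'
      }

      # Hypothetical usage inside the compactor pod (verify the process name):
      #   deleted_open_files "$(pgrep -o loki)"
      ~~~

      An empty result means no deleted files are pinned; any output lines name the inodes whose space cannot be reclaimed until the process closes them.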
      
      Workaround: Manually restarting the compactor pod reclaims the space temporarily, but the leak recurs after subsequent compaction cycles.

      Actual results:

      ~~~
      sh-5.1$ df -h /tmp/loki
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/sdc        9.8G  9.6G  236M  98% /tmp/loki
      ~~~
      ~~~
      $ du -sch /tmp/loki/*
      56K     compactor
      16K     lost+found
      72K     total
      ~~~
      Despite only ~72 KB of visible files, /tmp/loki shows 9.6 GB used (98% utilization). lsof confirms multiple deleted files still held open by the Loki process:
      ~~~
      loki  24xxxx8  DEL-W  REG  /tmp/loki/compactor/index_20367/compactor-17xxxxxxxx
      loki  24xxxx8  DEL-W  REG  /tmp/loki/compactor/index_20393/compactor-17xxxxxxxx
      loki  24xxxx8  DEL-W  REG  /tmp/loki/compactor/index_20395/17xxxxxxxx
      ~~~
      Compactor logs show repeated write failures due to exhausted space:
      ~~~
      2025-11-02T06:17:29Z level=error caller=compactor.go:571 msg="failed to apply retention" err="write /tmp/loki/...: no space left on device"
      ~~~
      Disk space is freed only after restarting the loki-compactor-0 pod, but the issue reappears in the next compaction cycle.

      Expected results:

      After compaction, Loki should close file descriptors for deleted temporary files so the filesystem can reclaim space automatically.
      No manual intervention (pod restarts) should be required.
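
      This expectation follows from POSIX unlink semantics: removing a file's directory entry does not free its blocks while any descriptor on it remains open. The df/du discrepancy can be reproduced on any Linux host with a temporary directory (paths below are throwaway, not the real /tmp/loki):

      ~~~
      #!/bin/sh
      # Demonstrate why deleted compactor files still consume space:
      # an unlinked file's blocks are freed only when the last open
      # descriptor on it is closed.
      tmpdir=$(mktemp -d)
      dd if=/dev/zero of="$tmpdir/big" bs=1M count=5 2>/dev/null
      exec 3<"$tmpdir/big"       # hold the file open, as the compactor does
      rm "$tmpdir/big"           # unlink: du now reports ~0 for the dir...
      du -sh "$tmpdir"
      ls -l "/proc/$$/fd" | grep '(deleted)'   # ...but the inode lives on
      exec 3<&-                  # closing the descriptor releases the 5 MB
      rm -rf "$tmpdir"
      ~~~

      In the compactor's case the fix is for Loki to close (not just unlink) its temporary index files after compaction, which is what the linked upstream issue tracks.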

      Additional info:

          

              OCP DocsBot (ocp-docs-bot) · Gaurav Kendre (rhn-support-gakendre) · Mehul Modi