OpenShift Monitoring / MON-1563

R&D: Prometheus "WAL accumulation effect" prevention


    • Type: Task
    • Resolution: Done
    • Monitoring - Sprint 199, Monitoring - Sprint 200, Monitoring - Sprint 201, Monitoring - Sprint 202

      While looking at the Bugzilla bug and the subsequent support cases, we recognized a pattern where the WAL directory keeps growing and is not flushed, sometimes for days. The end result is that Prometheus ends up in an endless "OOM restart loop": each restart has to replay the accumulated WAL, the process runs out of memory before it gets a chance to flush the WAL to disk, and the cycle repeats.

      We should investigate whether we can implement some sort of init container that prevents this symptom, e.g. by deleting the WAL in the worst case.

              dgrisonn@redhat.com Damien Grisonnet
              surbania Sergiusz Urbaniak (Inactive)