OCPBUGS-4168

Prometheus continuously restarts due to slow WAL replay

    • Quality / Stability / Reliability
    • False
    • Moderate
    • MON Sprint 228
    • 1
    • Enhancement
    • Before the fix, Prometheus instances in some very large clusters failed the startupProbe because WAL replay was slow. The fix extends the startupProbe timeout to accommodate slower WAL replays.

      Description of problem:

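      Prometheus restarts continuously because WAL replay is slow. In very large clusters the replay takes longer than the startupProbe allows, so the container is restarted before Prometheus finishes starting.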
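
      As a rough illustration of why a slow WAL replay turns into a restart loop: a Kubernetes startup probe gives a container roughly failureThreshold × periodSeconds (plus any initialDelaySeconds) to start, and the kubelet restarts the container once that budget is exhausted. Prometheus replays the WAL again on every start, so a replay that outlasts the budget never completes. The sketch below uses hypothetical probe values and an assumed replay duration. None of the numbers or function names come from this issue.

```python
def startup_budget_seconds(failure_threshold: int, period_seconds: int,
                           initial_delay_seconds: int = 0) -> int:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + failure_threshold * period_seconds


def restarts_before_ready(replay_seconds: int, budget_seconds: int) -> bool:
    """True when WAL replay outlasts the probe budget, so the container is restarted."""
    return replay_seconds > budget_seconds


# Hypothetical values for illustration only; not the settings shipped in any release.
short_budget = startup_budget_seconds(failure_threshold=60, period_seconds=15)
long_budget = startup_budget_seconds(failure_threshold=240, period_seconds=15)
replay = 20 * 60  # assumed 20-minute WAL replay on a very large cluster

print(short_budget, restarts_before_ready(replay, short_budget))
print(long_budget, restarts_before_ready(replay, long_budget))
```

      With these assumed numbers, the shorter budget is exceeded by the replay and the longer one is not, which is the effect the extended startupProbe timeout is meant to have.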

      Version-Release number of selected component (if applicable):

      OpenShift 4.11.13

      How reproducible:

       

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

              jmarcal@redhat.com Joao Marcal
              Junqi Zhao
              Votes: 1
              Watchers: 12