OpenShift Bugs / OCPBUGS-4168

Prometheus continuously restarts due to slow WAL replay

Details

    • Moderate
    • MON Sprint 228
    • 1
    • False
    • None
    • Before the fix, some Prometheus pods in very large clusters failed the startupProbe because WAL replay was slow. The fix extends the startupProbe timeout to accommodate slower WAL replays.
    • Enhancement
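
To illustrate why a longer startupProbe timeout stops the restart loop: Kubernetes gives a container roughly failureThreshold × periodSeconds (plus any initialDelaySeconds) to pass its startup probe before the kubelet restarts it. The sketch below is only an arithmetic illustration. The class, the function, and all numeric values are hypothetical and are not taken from the actual fix or from the cluster in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupProbe:
    """Subset of Kubernetes probe fields that bound startup time (hypothetical helper)."""
    period_seconds: int
    failure_threshold: int
    initial_delay_seconds: int = 0

    def startup_budget_seconds(self) -> int:
        # The kubelet restarts the container once failure_threshold
        # consecutive probes have failed, with one probe per period.
        return self.initial_delay_seconds + self.period_seconds * self.failure_threshold


def survives_replay(probe: StartupProbe, wal_replay_seconds: int) -> bool:
    """Return True if WAL replay finishes before the probe budget runs out."""
    return wal_replay_seconds < probe.startup_budget_seconds()


# Hypothetical values chosen only for illustration.
before = StartupProbe(period_seconds=10, failure_threshold=30)   # 300 s budget
after = StartupProbe(period_seconds=10, failure_threshold=180)   # 1800 s budget
wal_replay_seconds = 900  # assumed replay time on a very large cluster

print(survives_replay(before, wal_replay_seconds))
print(survives_replay(after, wal_replay_seconds))
```

With the shorter budget the replay outlasts the probe, so the container is restarted and the replay starts over. With the longer budget the replay can complete before the probe gives up.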

    Description

      Description of problem:

      Prometheus continuously restarts due to slow WAL replay

      Version-Release number of selected component (if applicable):

      openshift - 4.11.13

      How reproducible:

       

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

            People

              jmarcal@redhat.com Joao Marcal
              jmarcal@redhat.com Joao Marcal
              Junqi Zhao
              Votes: 1
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: