  1. OpenShift Bugs
  2. OCPBUGS-44813

Operator controller does not recover from a cached catalog error


    • Bug
    • Resolution: Done-Errata
    • Undefined
    • None
    • 4.18.0
    • OLM
    • Quality / Stability / Reliability
    • False
    • None
    • Critical
    • None
    • None
    • Rejected
    • None
    • In Progress
    • Release Note Not Required
    • N/A
    • None
    • None
    • None
    • None

      Description of problem:

      If operator-controller's cluster catalog controller requests catalog contents from Catalogd to populate its cache and that request fails, it caches the error. Every subsequent retry then returns the cached error instead of attempting to populate the cache again, and this persists until the catalog's resolved reference changes.

      Version-Release number of selected component (if applicable):

      4.18.0-0.nightly-2024-11-20-085127   

      How reproducible:

      Rarely, and not easily. It depends on a race condition between operator-controller reading its cache of ClusterCatalog objects and Catalogd populating (or removing) the served catalog contents.

      Steps to Reproduce:

      1.
      2.
      3.
          

      Actual results:

      Operator controller does not recover from a failed cache attempt until the catalog has a new resolved reference.

      Expected results:

      Operator controller should attempt to populate the cache again for an existing reference if the cached result is an error from a previous attempt.    

      Additional info:

          

              lmohanty@redhat.com Lalatendu Mohanty
              jlanford@redhat.com Joe Lanford
              None
              None
              Jian Zhang
              None