OpenShift Bugs / OCPBUGS-44018

Cluster-version operator should cache update advice through OSUS outages

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.13, 4.12, 4.14, 4.15, 4.16, 4.17, 4.18
    • None

      Description of problem

      Update advice is append-only: 4.y.z releases are added to channels regularly, and new update risks are declared occasionally. A stale cache is therefore missing only the most recent additions, which makes caching a very safe behavior, and client-side caching in the cluster-version operator (CVO) would reduce the disruption caused by OpenShift Update Service (OSUS) outages like OTA-1376.

      Version-Release number of selected component

      A single failed update-service retrieval currently clears the cache in 4.18. The code is pretty old, so I expect this behavior goes back through 4.12, our oldest release that's not yet end-of-life.

      How reproducible

      Every time.

      Steps to Reproduce

      1. Run a happy cluster with update advice.
      2. Break the update service, e.g. by using OTA-520 for a mock update service.
      3. Wait a few minutes for the cluster to notice the breakage.
      4. Check its update recommendations, with oc adm upgrade or the new-in-4.18 oc adm upgrade recommend.

      Actual results

      No recommendations while the cluster is RetrievedUpdates=False.

      Expected results

      Preserving the cached recommendations while the cluster is RetrievedUpdates=False, at least for 24 hours. I'm not committed to a particular time, but 24h is much larger than any OSUS outage we've ever had, and still not so long that we'd expect much in the way of recommendation changes if the service had remained healthy.

              W. Trevor King (trking)
              Dinesh Kumar S