OCPBUGS-4823

release image version mismatch causing degradation during upgrades

    • Moderate
    • MCO Sprint 233
    • 1
    • False

      During upgrade tests, the MCO temporarily becomes degraded, and events such as the following appear in the event log:

      Dec 13 18:16:07.380 E clusteroperator/machine-config condition/Degraded status/True reason/RequiredPoolsFailed changed: Unable to apply 4.12.0-0.nightly-multi-2022-12-13-144037: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, pool master has not progressed to latest configuration: release image version mismatch for master in rendered-master-4b1ae6a96473b51f18e7f6e75410b354 expected: 4.12.0-0.nightly-multi-2022-12-13-144037 got: 4.12.0-0.nightly-multi-2022-12-13-140029, retrying]
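For triage, the pool name, rendered config, and the two versions can be pulled out of a message like the one above. The following is a minimal sketch; the regular expression is inferred from this single quoted event, and the names `MISMATCH_RE` and `parse_mismatch` are hypothetical helpers, not part of the MCO or any OpenShift tooling.

```python
import re

# Pattern inferred from the one event quoted in this report; other MCO
# messages may be worded differently.
MISMATCH_RE = re.compile(
    r"release image version mismatch for (?P<pool>\S+) in (?P<rendered>\S+) "
    r"expected: (?P<expected>\S+) got: (?P<got>[^,\s\]]+)"
)

def parse_mismatch(message):
    """Return the pool, rendered config, expected and observed versions, or None."""
    match = MISMATCH_RE.search(message)
    return match.groupdict() if match else None

# Message text copied from the event above (everything after "changed: ").
EVENT = (
    "Unable to apply 4.12.0-0.nightly-multi-2022-12-13-144037: "
    "error during syncRequiredMachineConfigPools: "
    "[timed out waiting for the condition, pool master has not progressed "
    "to latest configuration: release image version mismatch for master in "
    "rendered-master-4b1ae6a96473b51f18e7f6e75410b354 "
    "expected: 4.12.0-0.nightly-multi-2022-12-13-144037 "
    "got: 4.12.0-0.nightly-multi-2022-12-13-140029, retrying]"
)

if __name__ == "__main__":
    print(parse_mismatch(EVENT))
```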

       

      This seems to be occurring with some frequency, as indicated by its prevalence in CI search:

      $ curl -s 'https://search.ci.openshift.org/search?search=clusteroperator%2Fmachine-config+condition%2FDegraded+status%2FTrue+reason%2F.*release+image+version+mismatch&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit&name=%5E%28periodic%7Crelease%29.*4%5C.1%5B1%2C2%5D.*&excludeName=&maxMatches=1&maxBytes=20971520&groupBy=job' | jq 'keys | length'
      50
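The query string in the command above is hard to read in its percent-encoded form. As a reading aid, this small sketch decodes it with Python's standard library; the URL is copied verbatim from the command, and `decode_query` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the curl command above.
SEARCH_URL = (
    "https://search.ci.openshift.org/search"
    "?search=clusteroperator%2Fmachine-config+condition%2FDegraded"
    "+status%2FTrue+reason%2F.*release+image+version+mismatch"
    "&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit"
    "&name=%5E%28periodic%7Crelease%29.*4%5C.1%5B1%2C2%5D.*"
    "&excludeName=&maxMatches=1&maxBytes=20971520&groupBy=job"
)

def decode_query(url):
    """Return the query parameters with percent-encoding and '+' decoded."""
    pairs = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {key: values[0] for key, values in pairs.items()}

if __name__ == "__main__":
    for key, value in decode_query(SEARCH_URL).items():
        print(f"{key:12} {value!r}")
```

Decoded, the `name` filter is the regular expression `^(periodic|release).*4\.1[1,2].*`, which restricts the search to periodic and release jobs for 4.11 and 4.12 (the character class also accepts a literal comma, which is harmless here). With `groupBy=job`, the `jq 'keys | length'` step counts the top-level keys of the JSON response.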

       

      The MCO should not become degraded during an upgrade unless it cannot proceed with the upgrade. For this particular failure mode, I think it's a temporary failure caused by a race condition since the MCO eventually requeues and clears its degraded status.
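One way to picture the expected behavior is a check that tolerates a version mismatch for a grace period and reports degradation only if the mismatch persists. The sketch below is purely illustrative: it is not the MCO's code and not a proposed fix, and `MismatchTracker`, `should_degrade`, and `grace_seconds` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MismatchTracker:
    """Tracks how long a pool has reported a stale release image version."""
    grace_seconds: float                 # hypothetical tolerance, not an MCO setting
    first_seen: Optional[float] = None   # time the current mismatch was first observed

    def should_degrade(self, expected: str, got: str, now: float) -> bool:
        """Return True only when a mismatch has lasted for the whole grace period."""
        if expected == got:
            self.first_seen = None       # pool caught up; forget any earlier mismatch
            return False
        if self.first_seen is None:
            self.first_seen = now        # start timing this mismatch
        return now - self.first_seen >= self.grace_seconds

if __name__ == "__main__":
    tracker = MismatchTracker(grace_seconds=600)
    new = "4.12.0-0.nightly-multi-2022-12-13-144037"
    old = "4.12.0-0.nightly-multi-2022-12-13-140029"
    print(tracker.should_degrade(new, old, now=0))    # mismatch just appeared
    print(tracker.should_degrade(new, new, now=120))  # pool caught up
```

Under this assumption, a mismatch that clears on a later sync never surfaces as Degraded, while one that persists past the grace period still does.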

            Dalia Khater (dkhater@redhat.com)
            Zack Zlotnik (zzlotnik@redhat.com)
            Rio Liu