  1. OpenShift Bugs
  2. OCPBUGS-7275

RHCOS 9.2: Fails to upgrade from RHEL8 with layered packages due to rpmdb sqlite transition


    • Sprint 231 - Team Update&Remot, Sprint 232 - Update&Remoting
    • 2
    • Proposed
    • False

      Description of problem:

      We've begun payload testing of RHCOS 9.2 content. The TRT team has reviewed the initial results and we've identified a common failure in MCO updating nodes.

      syncRequiredMachineConfigPools: [timed out waiting for the condition, pool master has not progressed to latest configuration: controller version mismatch for 97-master-generated-kubelet expected 25ce5a734a70a1fe3785090db5e71ee174935f95 has ecc6bf3dc21eb33baf56692ba7d54f9a3b9be1d1: all 3 nodes are at latest configuration rendered-master-bfd3875cd5a47666a5cdd8d77bb737d9, retrying]}
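      When triaging many CI runs for this failure, it can help to pull the pool name, the machine config name, and the two hashes out of the message programmatically. The following is a minimal sketch, not part of the MCO: the regular expression is an assumption modelled on the single message quoted above, and `MISMATCH_RE` and `parse_mismatch` are hypothetical names introduced here for illustration.

```python
import re

# Assumption: this pattern mirrors the wording of the one message quoted in
# this report; it is not a documented or stable MCO message format.
MISMATCH_RE = re.compile(
    r"pool (?P<pool>\S+) has not progressed to latest configuration: "
    r"controller version mismatch for (?P<config>\S+) "
    r"expected (?P<expected>[0-9a-f]{40}) has (?P<actual>[0-9a-f]{40})"
)


def parse_mismatch(message):
    """Return one dict per 'controller version mismatch' clause found in message."""
    return [m.groupdict() for m in MISMATCH_RE.finditer(message)]


# The message from this report, copied verbatim.
sample = (
    "syncRequiredMachineConfigPools: [timed out waiting for the condition, "
    "pool master has not progressed to latest configuration: controller version "
    "mismatch for 97-master-generated-kubelet expected "
    "25ce5a734a70a1fe3785090db5e71ee174935f95 has "
    "ecc6bf3dc21eb33baf56692ba7d54f9a3b9be1d1: all 3 nodes are at latest "
    "configuration rendered-master-bfd3875cd5a47666a5cdd8d77bb737d9, retrying]}"
)

for hit in parse_mismatch(sample):
    # Print the pool, the config name, and shortened hashes for quick comparison.
    print(hit["pool"], hit["config"], hit["expected"][:8], hit["actual"][:8])
```

      Because `finditer` is used, a log line that repeats the clause yields one entry per occurrence, so callers may want to de-duplicate the results before counting failures.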

      Version-Release number of selected component (if applicable):

      4.13.0 and CentOS Stream CoreOS 413.92.202302081904-0 (Plow)

      How reproducible:

      Reproduces across multiple upgrade jobs: e2e-azure-ovn-upgrade, e2e-aws-sdn-upgrade, and e2e-gcp-ovn-rt-upgrade

      Steps to Reproduce:

      1. Upgrade from 4.12 to 4.13 with RHCOS 9.2 content

      Actual results:

      {Operator degraded (RequiredPoolsFailed): Unable to apply 4.13.0-0.ci.test-2023-02-09-034600-ci-op-tzvy6nnk-latest: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, pool master has not progressed to latest configuration: controller version mismatch for 97-master-generated-kubelet expected 25ce5a734a70a1fe3785090db5e71ee174935f95 has ecc6bf3dc21eb33baf56692ba7d54f9a3b9be1d1: all 3 nodes are at latest configuration rendered-master-bfd3875cd5a47666a5cdd8d77bb737d9, retrying] Operator degraded (RequiredPoolsFailed): Unable to apply 4.13.0-0.ci.test-2023-02-09-034600-ci-op-tzvy6nnk-latest: error during syncRequiredMachineConfigPools: [timed out waiting for the condition, pool master has not progressed to latest configuration: controller version mismatch for 97-master-generated-kubelet expected 25ce5a734a70a1fe3785090db5e71ee174935f95 has ecc6bf3dc21eb33baf56692ba7d54f9a3b9be1d1: all 3 nodes are at latest configuration rendered-master-bfd3875cd5a47666a5cdd8d77bb737d9, retrying]}

      Expected results:

      Nodes update successfully.

      Additional info:

      Also tracked in https://docs.google.com/document/d/1ObAO0jaRNBeTbHTl0_rNawe7E36P99ifJH0rCOlGpt4/edit#bookmark=id.3o4x2tyze00g

       

      Examples:

      1. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/openshift-machine-config-operator-3485-ci-4.13-e2e-azure-ovn-upgrade/1623525273355948032
      2. https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/openshift-machine-config-operator-3485-nightly-4.13-e2e-aws-sdn-upgrade/1623525268607995904

        1. journal
          11.40 MB
        2. must-gather.tar.gz
          165.73 MB

            walters@redhat.com Colin Walters
            rhn-support-sdodson Scott Dodson
            Michael Nguyen Michael Nguyen