  1. OpenShift Bugs
  2. OCPBUGS-61866

Cluster Network Operator should remain with Upgradeable=False condition during the CNI plugin migration


    • Quality / Stability / Reliability
    • Important

      Description of problem

      Although the cluster-network-operator prevents a 4.16 cluster from being upgraded while the CNI plugin is openshift-sdn, it allows the upgrade to 4.17 to proceed while the CNI plugin migration is still in progress.

      Version-Release number of selected component (if applicable)

      Any 4.16.z.

      How reproducible

      Always

      Steps to Reproduce

      • We start with a 4.16 cluster using openshift-sdn.
      • We trigger the 4.17 upgrade.
      • The 4.17 upgrade is "frozen" at the Cluster Version Operator because the Cluster Network Operator is still marked as Upgradeable=False. At this point the upgrade could still be cancelled, but for this reproducer we leave it frozen.
      • We start the migration process. So far, so good.
      • At some point during the migration, spec.networkType="OVNKubernetes" must be set on the network.config/cluster object, which in turn sets spec.defaultNetwork.type="OVNKubernetes" on the network.operator/cluster object. For example, in the offline migration procedure this is done in step 10 of the documentation (step 10 at the time of this report).
      • The only check CNO performs to block the upgrade is whether spec.defaultNetwork.type=="OpenShiftSDN" on the network.operator/cluster object. From this point on, CNO therefore no longer marks itself as non-upgradeable, and the upgrade to 4.17 starts even though the migration is still ongoing.
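
      The gap described in the last step can be modelled in a few lines. This is a minimal Python sketch of the behaviour as reported here, not the operator's actual code: the function name and the plain-dict representation of the network.operator/cluster spec are hypothetical, and the contents of the migration stanza are illustrative only.

      ```python
      def is_upgradeable_current(operator_spec: dict) -> bool:
          """Hypothetical model of the check described in this report.

          The upgrade is blocked only while spec.defaultNetwork.type is
          "OpenShiftSDN"; nothing else in the spec is consulted.
          """
          default_network = operator_spec.get("defaultNetwork") or {}
          return default_network.get("type") != "OpenShiftSDN"


      # Assumed spec before the migration starts.
      spec_before = {"defaultNetwork": {"type": "OpenShiftSDN"}}

      # Assumed spec mid-migration: the network type has been flipped,
      # but the migration stanza is still present.
      spec_during = {
          "defaultNetwork": {"type": "OVNKubernetes"},
          "migration": {"networkType": "OVNKubernetes"},
      }

      blocked_before = not is_upgradeable_current(spec_before)
      blocked_during = not is_upgradeable_current(spec_during)
      ```

      Under these assumptions the upgrade is blocked before the migration but no longer blocked once the type is flipped, even though spec.migration is still populated.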

      Actual results

      The cluster upgrades to 4.17 while the migration is still unfinished. It can even reach the point where CNO itself is upgraded to 4.17 with the migration still in progress.

      At this point, the cluster can fully break and become very difficult or even impossible to recover.

      Expected results

      Cluster Network Operator should keep marking itself as non-upgradeable while any migration is in progress (e.g. by checking whether spec.migration is non-null on the network.operator/cluster object, which indicates that a migration is in progress).
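
      One possible shape for the suggested check is sketched below in Python. It is an illustration of the idea only, not a proposed patch: the function name and the dict representation of the network.operator/cluster spec are hypothetical, and treating any non-null spec.migration (including an empty object) as "in progress" is an assumption.

      ```python
      def is_upgradeable_proposed(operator_spec: dict) -> bool:
          """Hypothetical upgradeability check with the extra migration guard."""
          default_network = operator_spec.get("defaultNetwork") or {}

          # Existing behaviour per this report: block while still on openshift-sdn.
          if default_network.get("type") == "OpenShiftSDN":
              return False

          # Suggested addition: block while spec.migration is set at all.
          # Assumption: any non-null value, even {}, counts as "in progress".
          if operator_spec.get("migration") is not None:
              return False

          return True


      # Assumed example specs for the three phases.
      still_sdn = {"defaultNetwork": {"type": "OpenShiftSDN"}}
      mid_migration = {
          "defaultNetwork": {"type": "OVNKubernetes"},
          "migration": {"networkType": "OVNKubernetes"},
      }
      migration_done = {"defaultNetwork": {"type": "OVNKubernetes"}}

      results = [
          is_upgradeable_proposed(still_sdn),
          is_upgradeable_proposed(mid_migration),
          is_upgradeable_proposed(migration_done),
      ]
      ```

      With this guard, only the third spec (OVNKubernetes set and no migration stanza left) would be reported as upgradeable.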

      Additional info

      This is reproducible both with live migration and offline migration, as the cluster becomes upgradeable as soon as the OVNKubernetes network type is set (either manually during offline migration or automatically during live migration).

              bbennett@redhat.com Ben Bennett
              rhn-support-palonsor Pablo Alonso Rodriguez
              Anurag Saxena Anurag Saxena