  1. OpenShift Bugs
  2. OCPBUGS-37576

Upgrade to 4.14 stuck. Possibly due to enforcement of node-transit-switch-port-ifaddr

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • 4.13.z, 4.14.z

      Description of problem:

      When upgrading from 4.13.40 to 4.14.33, the upgrade gets stuck on a few operators.

      Version-Release number of selected component (if applicable):

      4.13.40 to 4.14.33

      How reproducible:

      Possibly 100%.

      Installed a public ARO cluster with VNet 100.88.0.0/16 | Master subnet 100.88.0.0/17 | Worker subnet 100.88.128.0/17.

      1. Cluster created ok, all COs healthy.  [OK]
      2. Example application, deployed.  [OK]
      3. OVN is installed, but the node annotation k8s.ovn.org/node-transit-switch-port-ifaddr does not exist.  [OK]
      4. Upgrade to 4.14.33 gets stuck [FAILS] 
      5. OVN enforced "k8s.ovn.org/node-transit-switch-port-ifaddr": "{\"ipv4\":\"100.88.0.2/16\"}".
      6. Pod cloud-controller-manager in CrashloopBackoff
      7. OVN pods running ok.
      8. COs etcd, network, authentication and kube-controller-manager are Degraded.
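
      The suspected conflict can be checked by comparing the cluster subnets from the setup above with the address OVN wrote in step 5. The following is a minimal sketch in plain bash, assuming the problem is an address overlap between the VNet and the transit switch port address (the title only says "possibly"). The function names ip_to_int, cidr_range and cidrs_overlap are hypothetical helpers written for this illustration, not part of oc or OVN.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of oc or OVN-Kubernetes.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  echo "$(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))"
}

# Print the first and last address (as integers) of a CIDR block.
cidr_range() {
  local ip=${1%/*} len=${1#*/}
  local addr mask
  addr=$(ip_to_int "$ip")
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  echo "$(( addr & mask )) $(( (addr & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Return 0 (true) if the two CIDR blocks share at least one address.
cidrs_overlap() {
  local a_lo a_hi b_lo b_hi
  read -r a_lo a_hi <<< "$(cidr_range "$1")"
  read -r b_lo b_hi <<< "$(cidr_range "$2")"
  (( a_lo <= b_hi && b_lo <= a_hi ))
}

# Values taken from this report: VNet, master subnet, worker subnet,
# compared against the annotation value enforced by OVN in step 5.
for subnet in 100.88.0.0/16 100.88.0.0/17 100.88.128.0/17; do
  if cidrs_overlap "$subnet" "100.88.0.2/16"; then
    echo "$subnet overlaps with 100.88.0.2/16"
  fi
done
```

      With the values from this report, all three subnets fall inside the /16 carried by the annotation, so each comparison reports an overlap.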

       

      BEFORE the upgrade, while on 4.13.40

      oc get node $nodename -o json| jq -r '.metadata.annotations["k8s.ovn.org/node-transit-switch-port-ifaddr"]'
      null

      During the upgrade, the enforced change is already visible:

      oc get node $nodename -o json| jq -r '.metadata.annotations["k8s.ovn.org/node-transit-switch-port-ifaddr"]'
      {"ipv4":"100.88.0.2/16"}
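
      To run the same check across every node at once, the jq filter above can be applied to the full node list. This is a sketch only: transit_ifaddr_per_node is a hypothetical helper name, and it assumes jq is installed and, for the live usage line, that oc is logged in to the cluster.

```shell
#!/usr/bin/env bash
ANNOTATION='k8s.ovn.org/node-transit-switch-port-ifaddr'

# Hypothetical helper: reads a NodeList JSON document on stdin and prints
# one line per node: "<node name><TAB><annotation value, or null if unset>".
transit_ifaddr_per_node() {
  jq -r --arg key "$ANNOTATION" \
    '.items[] | [.metadata.name, (.metadata.annotations[$key] // "null")] | @tsv'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get nodes -o json | transit_ifaddr_per_node
```

      Nodes that have not yet received the annotation print null, matching the 4.13.40 output shown earlier.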

      Expected results:

      The OVN change should not be forcefully applied, OR the upgrade should not start at all.

      Affected Platforms:

      Is it an

      1. internal RedHat testing failure

      If it is an internal RedHat testing failure:

      Reproducer here:

      https://docs.google.com/document/d/1JLJhWI0a4b1EEGlQqEg2ejwhs50LIn6_thx_kG-Itn8/edit 

       

      Additional Questions:
      1. If this is listed in the release notes as a default change in 4.15, but it was actually also backported to 4.14, as per https://issues.redhat.com/browse/OCPBUGS-20261, shouldn't it be addressed in the release notes for 4.14 as well?

      2. What is the workaround plan for such scenarios?

            rhn-support-cfields Chris Fields
            rhn-support-hgomes Hevellyn Gomes
            Anurag Saxena Anurag Saxena