OpenShift Core Networking / CORENET-3854

Increase timers for IC upgrades on AWS and GCP to 95 minutes

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • SDN Sprint 239

      For interconnect (IC) upgrades, i.e., when moving from OCP 4.13 to OCP 4.14 where IC is enabled, we do a two-phase rollout of the ovnkube-master and ovnkube-node pods in the openshift-ovn-kubernetes namespace. This minimizes disruption, since major architectural components are being moved from the control plane down to the data plane.

      Since it's a two-phase rollout, with each phase taking approximately 10 minutes, we effectively double the time it takes for the OVNK components to upgrade, so the upgrade runs past the current timeout thresholds on AWS.
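
      As a rough illustration of the arithmetic above, here is a minimal sketch. The function and constant names are hypothetical, the 10-minute phase duration is the approximate figure quoted in this description, and the 95-minute limit is the value this card proposes; none of this is taken from the actual CI configuration.

```python
from datetime import timedelta

# Assumption: approximate per-phase duration quoted in this card.
PHASE_DURATION = timedelta(minutes=10)
# Assumption: the allowance this card proposes for AWS and GCP.
TARGET_LIMIT = timedelta(minutes=95)


def ovnk_rollout_time(phases: int, per_phase: timedelta = PHASE_DURATION) -> timedelta:
    """Estimated OVNK rollout time when the phases run one after another."""
    return phases * per_phase


def fits_within(limit: timedelta, rest_of_upgrade: timedelta, phases: int) -> bool:
    """True if the rest of the upgrade plus the OVNK rollout stays within the limit."""
    return rest_of_upgrade + ovnk_rollout_time(phases) <= limit


single = ovnk_rollout_time(1)      # regular single-phase rollout
two_phase = ovnk_rollout_time(2)   # IC upgrade, 4.13 -> 4.14
extra = two_phase - single         # additional time the timers need to absorb
```

      Under these assumptions the second phase adds roughly one more phase duration to the upgrade, which is the extra time the increased timers are meant to cover.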

      See https://redhat-internal.slack.com/archives/C050MC61LVA/p1689768779938889 for some more details.

      See sample runs:

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws-modern/1679589472833900544

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws-modern/1679589451010936832

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws-modern/1678480739743567872

      I have noticed this happening once on GCP:

      https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp-modern/1680563737225859072

      This has not happened on Azure, which has a 95-minute allowance, so this card tracks the work to increase the timers on AWS and GCP. This was brought up in the TRT team sync yesterday (July 19th, 2023), and sdodson_jira agreed to approve the change on the condition that we bring the timers back down to the current values in release 4.15.

      The SDN team is confident the time will drop back to normal for future upgrades (4.14 -> 4.15 and so on). This will be tracked via https://issues.redhat.com/browse/OTA-999

            Surya Seetharaman added a comment - PR merged! closing card.

            Anurag Saxena added a comment - Thanks. Updates were discussed on Slack: https://redhat-internal.slack.com/archives/GQ0CU2623/p1690318379154209

            Surya Seetharaman added a comment - Waiting on anusaxen to trigger BM/vSphere upgrades on IC and tell us how long they take so that I can include that in the PR. cc rravaiol@redhat.com

              sseethar Surya Seetharaman
              Anurag Saxena, Riccardo Ravaioli