Update autoscaling annotations to accommodate upstream keys

    • Update autoscaling annotations
    • Upstream
    • To Do
    • OCPSTRAT-330 - [Upstream] OpenShift AutoScaler TechDebt (Phase 3)
    • 21% To Do, 5% In Progress, 74% Done

      Epic Goal

      • Update the scale-from-zero autoscaling annotations on MachineSets to conform with the upstream keys, while continuing to accept the OpenShift-specific keys that we have been using.
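
      As a rough illustration of this goal, the sketch below mirrors OpenShift-specific capacity annotations into upstream-style keys while leaving the originals in place. The annotation keys, the `add_upstream_annotations` helper, and the memory unit handling are assumptions made for this example and are not taken from this epic; verify them against the cluster autoscaler and machine-api sources.

```python
# Assumed key names, for illustration only.
OPENSHIFT_TO_UPSTREAM = {
    "machine.openshift.io/vCPU": "capacity.cluster-autoscaler.kubernetes.io/cpu",
    "machine.openshift.io/memoryMb": "capacity.cluster-autoscaler.kubernetes.io/memory",
    "machine.openshift.io/GPU": "capacity.cluster-autoscaler.kubernetes.io/gpu-count",
    "machine.openshift.io/maxPods": "capacity.cluster-autoscaler.kubernetes.io/maxPods",
}

MEMORY_MB_KEY = "machine.openshift.io/memoryMb"


def add_upstream_annotations(annotations: dict[str, str]) -> dict[str, str]:
    """Return a copy with upstream keys added next to the OpenShift ones.

    Existing upstream values are never overwritten, and the OpenShift keys
    stay in place so that older readers keep working.
    """
    result = dict(annotations)
    for old_key, new_key in OPENSHIFT_TO_UPSTREAM.items():
        if old_key not in annotations or new_key in annotations:
            continue
        value = annotations[old_key]
        if old_key == MEMORY_MB_KEY:
            # Assumption: the OpenShift value is a bare number of MiB, while
            # the upstream key expects a Kubernetes quantity string.
            value = f"{value}Mi"
        result[new_key] = value
    return result


machineset_annotations = {
    "machine.openshift.io/vCPU": "4",
    "machine.openshift.io/memoryMb": "16384",
}
print(add_upstream_annotations(machineset_annotations))
```

      Returning a copy rather than mutating the input keeps the helper easy to test; a real controller would instead patch the MachineSet object through the API.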

      Why is this important?

      • This change makes our implementation of the cluster autoscaler conform to the API that is described in the upstream community. This reduces the mental overhead for someone who knows Kubernetes but is new to OpenShift.
      • This change also reduces the maintenance burden that we carry in the form of additional patches to the cluster autoscaler. By changing our controllers to understand the upstream annotations, we are able to remove extra patches from our fork of the cluster autoscaler, making future maintenance easier and keeping the fork closer to the upstream source.

      Scenarios

      1. A user is debugging a cluster autoscaler issue by examining the related MachineSet objects. They see the scale-from-zero annotations and recognize them from the project documentation and from upstream discussions. As a result, the user can more easily find common issues and advice from the upstream community.
      2. An OpenShift maintainer is updating the cluster autoscaler for a new version of Kubernetes. Because the OpenShift controllers understand the upstream annotations, the maintainer does not need to carry or modify a patch to support multiple varieties of annotation. This makes the task of updating the autoscaler simpler and reduces the burden on the maintainer.

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • Scale from zero autoscaling must continue to work with both the old openshift annotations and the newer upstream annotations.
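
      One way to picture the last criterion is a lookup that accepts either annotation. The sketch below tries the upstream key first and falls back to the OpenShift key; the key names, the `lookup_capacity` helper, and the preference order are assumptions made for this example, not the actual autoscaler code.

```python
from typing import Optional

# Assumed key pair for illustration: upstream key first, OpenShift key second.
CPU_KEYS = (
    "capacity.cluster-autoscaler.kubernetes.io/cpu",
    "machine.openshift.io/vCPU",
)


def lookup_capacity(annotations: dict[str, str], keys: tuple[str, ...]) -> Optional[str]:
    """Return the first non-empty value found, trying the keys in order."""
    for key in keys:
        value = annotations.get(key, "")
        if value:
            return value
    return None


# A MachineSet carrying only the older OpenShift annotation still resolves.
print(lookup_capacity({"machine.openshift.io/vCPU": "4"}, CPU_KEYS))
```

      Keeping the key order in a tuple means that dropping support for the older key later is a one-line change.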

      Dependencies (internal and external)

      1. ...

      Previous Work (Optional):

      1. …

      Open questions:

      1. …

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - OpenShift code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - OpenShift documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - OpenShift build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - OpenShift documentation merged: <link to meaningful PR>

      Please note: the changes described by this epic will happen in OpenShift controllers, so there is no "upstream" relationship in the same sense as for the Kubernetes-based controllers.

            [OCPCLOUD-2136] Update autoscaling annotations to accommodate upstream keys

            Michael McCune added a comment -

            reviewing the individual machineset actuator update cards today during refinement. we are not sure that we want to do the work for all the individual machineset changes as they might not be needed.

            we have lowered the priority on the machineset changes to minor and will review with our plans for upstream feature parity around the scale from zero annotations before we do anything else with those cards.

            Michael McCune added a comment -

            "Is it https://issues.redhat.com/browse/OCPCLOUD-2500 and then all the provider updates, and then https://issues.redhat.com/browse/OCPCLOUD-2146"

            you've got it exactly

            Joel Speed added a comment -

            mimccune@redhat.com raryan@redhat.com Can you help me to understand what the path to getting this epic completed is? Is it https://issues.redhat.com/browse/OCPCLOUD-2500 and then all the provider updates, and then https://issues.redhat.com/browse/OCPCLOUD-2146

            Yang Yang added a comment -

            Hi joelspeed The target version is still 4.16. Should we drop it or move it to 4.17?

            Joel Speed added a comment -

            We spoke about this yesterday within the team; what's been done so far won't impact the functionality as is. We will continue to work on this in 4.17 and have adjusted the candidate version to reflect this.

            Huali Liu added a comment -

            Hi mimccune@redhat.com Lots of stories are still todo. Should they be moved to 4.17 epic?

            Michael McCune added a comment -

            we have the changes for the CAO mostly complete at this point. this will ensure that we have both sets of annotations available on the machinesets.

            we have not updated the CAS yet (OCPCLOUD-2500), which is what we will need to do so that we can eventually remove the old annotations. but, for the 4.16 release, it is fine if we are lagging on the CAS implementation, as this is not a user-facing feature and does not represent a new feature. this is technical debt cleanup and deprecation of the old annotations.

            i think we will need to complete this work in 4.17

            Michael McCune added a comment -

            joelspeed imo, we need to focus on getting OCPCLOUD-2493 done for this release. that should be our primary focus as once we have both sets of annotations in place we can then start the process of working with the individual providers.

            raryan@redhat.com is actively working the 2493 card, i think we should be able to get it merged in the next 2 sprints.

            Joel Speed added a comment -

            mimccune@redhat.com We have about 2 sprints left to complete this, is the core of the work coming to a close so that we can start updating each of the individual providers? Can you please provide a quick update on progress and next steps?

            Michael McCune added a comment -

            joelspeed and i had a discussion about this today after i discovered some issues while doing the cluster autoscaler rebase for 1.29. here is our plan going forward:

            1. for 4.16 release, we will add a patch to allow the CAS to understand both annotations with a preference for upstream
            2. for 4.16 the CAO will add the upstream annotations
            3. for 4.17 the CAS will only read the upstream annotations
            4. for 4.17+ the CAO will continue to convert annotations

            need to check if the CAO is currently using the "generic" option from the reconcile of MachineAutoscalers, we will need this to ensure that records get converted on a cluster version upgrade.

              raryan@redhat.com Rachel Ryan
              mimccune@redhat.com Michael McCune
              Huali Liu Huali Liu