-
Epic
-
Resolution: Unresolved
Change Upgrade Behavior to align with OCP guidelines
To Do
Clarification (from the original sender of the email):
It applies only to "the cluster Operators shipped by Red Hat that serve as the architectural foundation for OpenShift Container Platform. Cluster Operators are installed by default, unless otherwise noted, and are managed by the Cluster Version Operator (CVO)." See the doc for details. They can be listed by the command "oc get co" against an OpenShift cluster.
So we are not required to do this, nor to meet the timeframe mentioned.
However, if this behavior resolves some of our existing issues, we can and should consider adopting it.
As an RH operator, we need to adhere to the following new upgrade behavior guideline.
Our solution, detailed in https://github.com/openshift/api/pull/2469, defines clear, expected operator behavior during an upgrade:
- Before Upgrade: Available=True, Progressing=False, Degraded=False.
- During Upgrade: The operator MUST become Progressing=True (since a version change is a configuration change). The operator MUST NOT become Degraded=True or Available=False in HA Control Planes UNLESS there’s a critical failure requiring admin action. SNO / non-HA Control Planes are not currently a focus.
- After Upgrade: Return to the healthy, stable state: Available=True, Progressing=False, Degraded=False.
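The three phases above can be sketched as a small condition check. This is an illustrative sketch only, not code from openshift/api#2469: the function name `check_upgrade_conditions`, the phase labels, and the `critical_failure` flag are hypothetical. It assumes conditions arrive as a list of `type`/`status` entries, the shape found under `status.conditions` of a ClusterOperator, and it covers only the HA control plane rules.

```python
from typing import Dict, List

# Healthy, stable state expected before and after an upgrade (from the guideline).
STABLE: Dict[str, str] = {
    "Available": "True",
    "Progressing": "False",
    "Degraded": "False",
}


def check_upgrade_conditions(
    phase: str,
    conditions: List[Dict[str, str]],
    critical_failure: bool = False,
) -> List[str]:
    """Return a list of guideline violations for the given phase.

    phase is a hypothetical label: "before", "during", or "after".
    critical_failure marks a failure that requires admin action, which is
    the only case where Degraded=True or Available=False is acceptable
    during an upgrade.
    """
    state = {c["type"]: c["status"] for c in conditions}
    problems: List[str] = []

    if phase in ("before", "after"):
        for cond, want in STABLE.items():
            got = state.get(cond)
            if got != want:
                problems.append(f"{cond}={got}, expected {want}")
    elif phase == "during":
        # A version change is a configuration change, so Progressing must be True.
        if state.get("Progressing") != "True":
            problems.append("Progressing must be True during an upgrade")
        if not critical_failure:
            if state.get("Degraded") == "True":
                problems.append("Degraded=True without a critical failure")
            if state.get("Available") == "False":
                problems.append("Available=False without a critical failure")
    else:
        raise ValueError(f"unknown phase: {phase}")

    return problems
```

For example, an operator that reports Progressing=True and Degraded=True mid-upgrade, with no critical failure, would yield one violation, while the same conditions with `critical_failure=True` would yield none.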
We are also setting time limits for an operator to complete its rollout: a component in a cluster of fewer than 250 nodes must complete a version update within 20 minutes, and the machine-config-operator must complete the version update within 90 minutes.
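The time limits can be expressed as a simple lookup. This is a sketch under stated assumptions, not the actual enforcement logic: the function names are hypothetical, the 250-node bound is assumed to apply to the machine-config-operator limit as well, and no limit is returned for larger clusters because the text above does not state one.

```python
from datetime import timedelta
from typing import Optional

# Names under which the machine-config-operator might be reported (assumption).
MCO_NAMES = {"machine-config", "machine-config-operator"}


def rollout_limit(operator_name: str, node_count: int) -> Optional[timedelta]:
    """Return the rollout time limit, or None when the text states no limit."""
    if node_count >= 250:
        # The guideline only covers clusters of fewer than 250 nodes.
        return None
    if operator_name in MCO_NAMES:
        return timedelta(minutes=90)
    return timedelta(minutes=20)


def exceeded_limit(operator_name: str, node_count: int, elapsed: timedelta) -> bool:
    """True when the elapsed rollout time is over the applicable limit."""
    limit = rollout_limit(operator_name, node_count)
    return limit is not None and elapsed > limit
```

With these assumptions, a 25-minute rollout on a 100-node cluster is over the limit for an ordinary component but within it for the machine-config-operator.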
Call for Action
We need you to adhere to the API rules (implemented by openshift/api#2469) in 4.22 by fixing the bugs reported against your component in OCPSTRAT-2484.