-
Feature Request
-
Resolution: Unresolved
-
x86_64
Proposed title of this feature request
Improved error handling in OLM operator subscriptions
What is the nature and description of the request?
Recovering from a failed operator install or upgrade currently requires the manual steps described in https://access.redhat.com/solutions/6459071. This manual process is not feasible, especially in a Cloud RAN network with 10,000+ sites.
Why does the customer need this? (List the business requirements here)
Cloud RAN is expected to have tens of thousands of sites. During an upgrade, transient network-related issues can interrupt image downloads for a short period. Such an interruption severely impacts OLM operator upgrades and even installs. Manually recovering thousands of sites is not operationally feasible, especially within stringent maintenance windows.
The current behavior for operator subscriptions, installs, and recovery seems at odds with Kubernetes best practices and reconciliation, especially given the emphasis on GitOps and automation.
Verizon and other customers expect better behavior here, especially when it comes to error handling.
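To illustrate the kind of reconciliation-style behavior being requested, here is a minimal sketch of retrying a failed install with exponential backoff instead of stopping at the first transient error. It does not describe how OLM works today or how the fix should be implemented; `TransientError`, `reconcile_with_backoff`, and `try_install` are hypothetical names used only for illustration.

```python
import time
from typing import Callable, List, Tuple


class TransientError(Exception):
    """Hypothetical stand-in for a short-lived failure, e.g. an image pull timeout."""


def reconcile_with_backoff(
    try_install: Callable[[], None],
    max_attempts: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[float]]:
    """Call try_install until it succeeds or max_attempts is reached.

    Returns (succeeded, delays_waited_in_seconds).
    """
    delays: List[float] = []
    for attempt in range(max_attempts):
        try:
            try_install()
            return True, delays
        except TransientError:
            # Give up after the final attempt; otherwise wait and retry.
            if attempt == max_attempts - 1:
                break
            # Double the wait on each failure, capped at max_delay.
            delay = min(base_delay * (2 ** attempt), max_delay)
            delays.append(delay)
            sleep(delay)
    return False, delays
```

In this sketch a site that hits a brief network interruption would recover on its own once the image becomes reachable again, with no per-site manual intervention.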
List any affected packages or components.
OLM