Bug
Resolution: Done
Major
3.2.0.GA
Release Notes
Bug Fix
Done
Synced from Eclipse Che issue
https://github.com/eclipse/che/issues/21603
Describe the bug
When starting, the Che Operator checks whether the DevWorkspace Operator (DWO) is installed by searching for its ClusterServiceVersion (CSV). However, this check is performed only against the openshift-operators namespace (see const).
OLM copies CSVs from the namespace in which the operator is installed to all other namespaces. The check in the Che Operator can therefore run before this copying has completed, e.g. while the DevWorkspace Operator is being updated.
While I'm not entirely sure, I believe we encountered this case while updating the Che Operator and DevWorkspace Operator in a cluster we manage. Both operators are installed in a custom namespace (i.e. not openshift-operators), and we found that the Che Operator attempted to deploy the DevWorkspace Operator to the devworkspace-controller namespace. Since only one instance of DWO can run in a cluster at any time, this caused a crashloop in the second deployment and interfered with webhook certificates for DWO.
Che version
next (development version)
Steps to reproduce
No consistent reproducer, but my intuition is something like:
- Install DevWorkspace Operator via OLM into namespace test-namespace with manual update approval, one version behind the latest (e.g. v0.15.1), so that an update is available (e.g. v0.15.2)
- Install Che Operator to test-namespace
- Trigger DevWorkspace Operator upgrade while Che Operator pod is starting
but I'm not entirely sure how OLM manages creation/update of CSVs so this may not work.
Expected behavior
We should do the following:
- (simple) Instead of hard-coding the CSV check to openshift-operators, the Che Operator should check the namespace it is installed in for the DWO CSV. In most cases this would be openshift-operators anyway, but when Che Operator is installed in an alternate namespace we'd expect DWO to be installed there as well.
- (more complex) Improve the check to be more robust and avoid false negatives
- Add a configuration option to the CheCluster to disable automatically installing DevWorkspace Operator for clusters where the admin is sure DWO is installed.
- When installing DWO in cases where it is necessary, the Che Operator should not create CRDs if they already exist, as this can overwrite the certificate used for conversion webhooks and leave CRDs in a broken state.
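The last two points could be sketched roughly as follows. This is a minimal illustration, not the che-operator implementation: the `crdStore` interface, the in-memory `memStore`, and the `autoInstallDisabled` flag are hypothetical names standing in for the cluster API and for whatever CheCluster option ends up being added.

```go
package main

import "fmt"

// crdStore is a hypothetical abstraction over the cluster's CRD API.
type crdStore interface {
	Exists(name string) bool
	Create(name string) error
}

// shouldInstallDWO decides whether the Che Operator deploys DWO itself.
// autoInstallDisabled models an assumed CheCluster option; dwoDetected is
// the result of the CSV check.
func shouldInstallDWO(autoInstallDisabled, dwoDetected bool) bool {
	return !autoInstallDisabled && !dwoDetected
}

// ensureCRDs creates only the CRDs that are missing and leaves existing
// ones untouched, so conversion webhook settings already present on them
// are not overwritten. It returns the names it created.
func ensureCRDs(store crdStore, names []string) ([]string, error) {
	var created []string
	for _, name := range names {
		if store.Exists(name) {
			continue // never overwrite an existing CRD
		}
		if err := store.Create(name); err != nil {
			return created, fmt.Errorf("creating CRD %s: %w", name, err)
		}
		created = append(created, name)
	}
	return created, nil
}

// memStore is an in-memory fake used only for this example.
// It maps a CRD name to whoever created it.
type memStore struct {
	crds map[string]string
}

func (m *memStore) Exists(name string) bool {
	_, ok := m.crds[name]
	return ok
}

func (m *memStore) Create(name string) error {
	if m.Exists(name) {
		return fmt.Errorf("CRD %s already exists", name)
	}
	m.crds[name] = "che-operator"
	return nil
}

func main() {
	// One CRD already exists (installed via OLM), one is missing.
	store := &memStore{crds: map[string]string{
		"devworkspaces.workspace.devfile.io": "olm",
	}}
	if shouldInstallDWO(false, false) {
		created, err := ensureCRDs(store, []string{
			"devworkspaces.workspace.devfile.io",
			"devworkspacetemplates.workspace.devfile.io",
		})
		fmt.Println(created, err)
	}
}
```

In this sketch the existing CRD keeps its original owner and only the missing one is created, which is the behavior the last bullet asks for.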
Runtime
OpenShift
Screenshots
No response
Installation method
OperatorHub
Environment
Dev Sandbox (workspaces.openshift.com)
Eclipse Che Logs
No response
Additional context
Code that checks if DWO is installed: https://github.com/eclipse-che/che-operator/blob/bd588312a27f3b82402b7bb8644373206f9f27f9/pkg/deploy/dev-workspace/dev_workspace_utils.go#L39-L50
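For comparison, a namespace-aware variant of such a check with a few retries might look like the sketch below. It is not the code at the linked commit: `csvLister`, `delayedLister`, `namespaceToCheck`, and `isDWOInstalled` are hypothetical names, and the CSV name prefix `devworkspace-operator` is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Assumed values for illustration; the real constants live in the
// che-operator source.
const (
	fallbackNamespace = "openshift-operators"
	dwoCSVNamePrefix  = "devworkspace-operator"
)

// csvLister is a hypothetical stand-in for a Kubernetes client that
// lists ClusterServiceVersion names in one namespace.
type csvLister interface {
	ListCSVNames(namespace string) ([]string, error)
}

// namespaceToCheck prefers the namespace the operator runs in and falls
// back to openshift-operators when that namespace is unknown.
func namespaceToCheck(operatorNamespace string) string {
	if operatorNamespace == "" {
		return fallbackNamespace
	}
	return operatorNamespace
}

// isDWOInstalled looks for a DWO CSV up to attempts times, so that a CSV
// which OLM has not finished copying yet is not reported as missing
// after a single look.
func isDWOInstalled(l csvLister, namespace string, attempts int, delay time.Duration) (bool, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		names, err := l.ListCSVNames(namespace)
		lastErr = err
		if err == nil {
			for _, name := range names {
				if strings.HasPrefix(name, dwoCSVNamePrefix) {
					return true, nil
				}
			}
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return false, lastErr
}

// delayedLister simulates OLM copying: the CSV shows up in namespace ns
// only after emptyCalls list calls have returned nothing.
type delayedLister struct {
	ns         string
	emptyCalls int
	calls      int
}

func (d *delayedLister) ListCSVNames(namespace string) ([]string, error) {
	d.calls++
	if namespace != d.ns || d.calls <= d.emptyCalls {
		return nil, nil
	}
	return []string{"devworkspace-operator.v0.15.2"}, nil
}

func main() {
	l := &delayedLister{ns: "test-namespace", emptyCalls: 2}
	found, err := isDWOInstalled(l, namespaceToCheck("test-namespace"), 5, 10*time.Millisecond)
	fmt.Println(found, err)
}
```

Retrying trades a slightly slower startup for fewer false negatives. A real implementation would more likely requeue the reconcile than sleep in place.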
Release Notes Text
On OpenShift, the DevWorkspace Operator is managed by OLM as an Eclipse Che dependency. As a consequence, the Eclipse Che Operator should not manage the DevWorkspace Operator when it is installed on OpenShift. Previously it did; this has been fixed.