Red Hat OpenShift Dev Spaces (formerly CodeReady Workspaces) / CRW-3272

[RN] Che Operator sometimes deploys DevWorkspace Operator even if it is present in the cluster


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 3.2.0.GA
    • 3.2.0.GA
    • docs
    • False
    • None
    • False
    • Release Notes
      = DevWorkspace Operator crashloop due to multiple deployments

      With this update, the Operator Lifecycle Manager (OLM) treats the DevWorkspace Operator as a {prod-short} dependency, and the Red Hat OpenShift Dev Spaces Operator no longer deploys or manages the DevWorkspace Operator. This enhancement prevents multiple deployments of the DevWorkspace Operator and avoids broken webhooks in an unsupported additional DevWorkspace Operator in the `devworkspace-controller` namespace.
    • Bug Fix
    • Done

      Synced from Eclipse Che issue

      https://github.com/eclipse/che/issues/21603

      Describe the bug

      When starting, the Che Operator checks whether the DevWorkspace Operator is installed by searching for its ClusterServiceVersion (CSV). However, this check is performed only against the openshift-operators namespace (see const).

      OLM copies CSVs from the namespace in which the operator is installed to all other namespaces. The check in the Che Operator could therefore run before this copying is complete, e.g. while the DevWorkspace Operator is being updated.
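
The suspected false negative can be illustrated with a small in-memory model. This is only a sketch of the reasoning above, not the Che Operator's actual code: the dictionary standing in for cluster state, the helper names `has_dwo_csv` and `copy_csvs`, and the assumed CSV name prefix `devworkspace-operator.` are all hypothetical.

```python
# Assumed naming convention for the DWO CSV (e.g. "devworkspace-operator.v0.15.2").
DWO_CSV_PREFIX = "devworkspace-operator."


def has_dwo_csv(csvs_by_namespace, namespace):
    """Return True if a DWO CSV is visible in the given namespace."""
    names = csvs_by_namespace.get(namespace, set())
    return any(name.startswith(DWO_CSV_PREFIX) for name in names)


def copy_csvs(csvs_by_namespace, source, targets):
    """Model OLM copying CSVs from the install namespace to other namespaces."""
    copied = {ns: set(names) for ns, names in csvs_by_namespace.items()}
    for target in targets:
        copied.setdefault(target, set()).update(copied.get(source, set()))
    return copied


# DWO is installed in a custom namespace; the copy has not happened yet.
before_copy = {
    "test-namespace": {"devworkspace-operator.v0.15.2"},
    "openshift-operators": set(),
}
after_copy = copy_csvs(before_copy, "test-namespace", ["openshift-operators"])

# A check hard-coded to openshift-operators misses DWO until the copy completes.
print(has_dwo_csv(before_copy, "openshift-operators"))
print(has_dwo_csv(before_copy, "test-namespace"))
print(has_dwo_csv(after_copy, "openshift-operators"))
```

In this model, the check against the operator's own install namespace succeeds regardless of whether the copy has finished, while the hard-coded check depends on timing.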

      While I'm not entirely sure, I believe we encountered this case while updating the Che Operator and DevWorkspace Operator in a cluster we manage -- both operators are installed in a custom namespace (i.e. not openshift-operators) and we found that the Che Operator attempted to deploy the DevWorkspace Operator to the devworkspace-controller namespace. Since only one instance of DWO can run in a cluster at any time, this caused a crashloop in the second deployment and interfered with webhook certificates for DWO.

      Che version

      next (development version)

      Steps to reproduce

      There is no consistent reproducer, but my intuition is something like:

      1. Install DevWorkspace Operator via OLM with manual updates to namespace test-namespace, one version before the latest (e.g. v0.15.1) so that an update is available (e.g. v0.15.2)
      2. Install Che Operator to test-namespace
      3. Trigger DevWorkspace Operator upgrade while Che Operator pod is starting

      However, I'm not entirely sure how OLM manages creation and update of CSVs, so this may not work.

      Expected behavior

      We should do four things:

      1. (simple) Instead of hard-coding the CSV check to openshift-operators, the Che Operator should check the namespace it is installed in for the DWO CSV. In most cases, this would be openshift-operators anyway, but when the Che Operator is installed in an alternate namespace we'd expect DWO to be installed there as well.
      2. (more complex) Improve the check to be more robust and avoid false negatives.
      3. Add a configuration option to the CheCluster to disable automatically installing DevWorkspace Operator for clusters where the admin is sure DWO is installed.
      4. When installing DWO in cases where it is necessary, the Che Operator should not create CRDs if they already exist, as this can overwrite the certificate used for conversion webhooks and leave CRDs in a broken state.
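
The four points above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not the fix that was implemented: `OperatorSettings`, `disable_dwo_install`, `should_deploy_dwo`, `crds_to_create`, the retry parameters, and the CSV name prefix are hypothetical names chosen for illustration, and `list_csv_names` stands in for whatever call lists CSV names in a namespace.

```python
import time
from dataclasses import dataclass

# Assumed naming convention for the DWO CSV (e.g. "devworkspace-operator.v0.15.2").
DWO_CSV_PREFIX = "devworkspace-operator."


@dataclass
class OperatorSettings:
    operator_namespace: str            # namespace the Che Operator runs in (point 1)
    disable_dwo_install: bool = False  # hypothetical admin opt-out (point 3)


def should_deploy_dwo(list_csv_names, settings, attempts=3,
                      delay_seconds=1.0, sleep=time.sleep):
    """Decide whether the Che Operator should deploy DWO itself."""
    if settings.disable_dwo_install:
        return False
    # Point 2: retry so that a CSV that is still being copied is not missed.
    for attempt in range(attempts):
        names = list_csv_names(settings.operator_namespace)
        if any(name.startswith(DWO_CSV_PREFIX) for name in names):
            return False
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return True


def crds_to_create(desired_crds, existing_crds):
    """Point 4: only create CRDs that do not exist yet; never overwrite."""
    existing = set(existing_crds)
    return [name for name in desired_crds if name not in existing]
```

Retrying only narrows the timing window rather than closing it, which is one reason an explicit opt-out (point 3) may still be useful on clusters where the admin knows DWO is installed.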

      Runtime

      OpenShift

      Screenshots

      No response

      Installation method

      OperatorHub

      Environment

      Dev Sandbox (workspaces.openshift.com)

      Eclipse Che Logs

      No response

      Additional context

      Code that checks if DWO is installed: https://github.com/eclipse-che/che-operator/blob/bd588312a27f3b82402b7bb8644373206f9f27f9/pkg/deploy/dev-workspace/dev_workspace_utils.go#L39-L50

      Release Notes Text

      On OpenShift, the DevWorkspace Operator is managed by OLM as an Eclipse Che dependency. As a consequence, the Eclipse Che Operator should not manage the DevWorkspace Operator when it is installed on OpenShift. Previously, the Eclipse Che Operator could still deploy it; this has been fixed.

              jvrbkova@redhat.com Jana Vrbkova
              jiralint.codeready Bot Codeready
              Jana Vrbkova Jana Vrbkova