OpenShift Bugs / OCPBUGS-56988

Hosted control plane for KubeVirt doesn't fully deploy (MCE 2.8.1)


    • Bug
    • Resolution: Unresolved
    • 4.18.0
    • Quality / Stability / Reliability

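      Using multicluster-engine (MCE) operator version 2.8.1, I tried to create a hosted control plane (HCP) cluster on KubeVirt. The hosted control plane did not fully deploy. The control-plane-operator cannot create the kubevirt-csi Role because its own service account does not hold the permissions that the Role grants, as its log shows: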

      $ oc logs -n hosted-clusters-mycluster20 control-plane-operator-7ff7c886fb-4qsfj
      ...
      {"level":"error","ts":"2025-06-02T17:30:24Z","msg":"Reconciler error","controller":"hostedcontrolplane","controllerGroup":"hypershift.openshift.io","controllerKind":"HostedControlPlane","HostedControlPlane":{"name":"mycluster20","namespace":"hosted-clusters-mycluster20"},"namespace":"hosted-clusters-mycluster20","name":"mycluster20","reconcileID":"59921662-7925-47c2-a569-4b0596d9ff56","error":"failed to update control plane: failed to reconcile csi driver: roles.rbac.authorization.k8s.io \"kubevirt-csi\" is forbidden: user \"system:serviceaccount:hosted-clusters-mycluster20:control-plane-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:hosted-clusters-mycluster20\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"subresources.kubevirt.io\"], Resources:[\"virtualmachines/addvolume\"], Verbs:[\"update\"]}\n{APIGroups:[\"subresources.kubevirt.io\"], Resources:[\"virtualmachines/removevolume\"], Verbs:[\"update\"]}","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/hypershift/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222"}
      ... 
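      The missing permissions are buried in a long JSON log line. As a minimal sketch (the function name missing_rbac_rules is hypothetical, and it assumes the log keeps the {APIGroups:..., Resources:..., Verbs:...} layout shown above), the rules can be pulled out with standard text tools:

```shell
# Print each RBAC rule the API server reported as "not currently held".
# Reads control-plane-operator log lines on stdin.
missing_rbac_rules() {
  # 1. Keep only the {APIGroups:...} fragments embedded in the error text.
  # 2. Turn the JSON-escaped \" back into plain double quotes.
  # 3. Drop repeats, since the reconciler logs the same error on every retry.
  grep -o '{APIGroups:[^}]*}' | sed 's/\\"/"/g' | sort -u
}

# Usage against the pod from this report (requires cluster access):
#   oc logs -n hosted-clusters-mycluster20 control-plane-operator-7ff7c886fb-4qsfj | missing_rbac_rules
```

      Each printed rule corresponds to one permission that the control-plane-operator service account needs before it can create the kubevirt-csi Role.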

      One symptom is that the hosted cluster stays in the Partial progress state and the worker nodes cannot join the cluster:

      $ oc get hostedcluster -A
      NAMESPACE         NAME          VERSION   KUBECONFIG                     PROGRESS   AVAILABLE   PROGRESSING   MESSAGE
      hosted-clusters   mycluster20             mycluster20-admin-kubeconfig   Partial    True        False         The hosted control plane is available
      

      To work around the issue, I granted the control-plane-operator service account the RBAC permissions named in the error message above:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      metadata:
        name: control-plane-operator-fix
        namespace: hosted-clusters-mycluster20
      rules:
      - apiGroups: ["subresources.kubevirt.io"]
        resources: ["virtualmachines/addvolume", "virtualmachines/removevolume"]
        verbs: ["update"]
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      metadata:
        name: control-plane-operator-fix
        namespace: hosted-clusters-mycluster20
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: control-plane-operator-fix
      subjects:
      - kind: ServiceAccount
        name: control-plane-operator
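      The namespace in this workaround is specific to my cluster. As a rough sketch for reuse on another hosted cluster, the same Role and RoleBinding can be rendered for any hosted control plane namespace. The helper name render_cpo_rbac_fix is hypothetical, and the explicit namespace on the ServiceAccount subject is my addition. It is not in the manifest above.

```shell
# Hypothetical helper: print the workaround Role and RoleBinding for one
# hosted control plane namespace, passed as the first argument.
render_cpo_rbac_fix() {
  ns="${1:?usage: render_cpo_rbac_fix <hosted-control-plane-namespace>}"
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: control-plane-operator-fix
  namespace: ${ns}
rules:
- apiGroups: ["subresources.kubevirt.io"]
  resources: ["virtualmachines/addvolume", "virtualmachines/removevolume"]
  verbs: ["update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: control-plane-operator-fix
  namespace: ${ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: control-plane-operator-fix
subjects:
- kind: ServiceAccount
  name: control-plane-operator
  # Assumption: subject namespace stated explicitly (same as the binding's).
  namespace: ${ns}
EOF
}

# Usage (requires cluster access):
#   render_cpo_rbac_fix hosted-clusters-mycluster20 | oc apply -f -
```

      Piping the output to oc apply keeps the workaround a single command and avoids hand-editing the namespace in three places.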

      After I applied this RBAC workaround, the hosted control plane deployed fully and the cluster provisioned successfully. The following output shows both groups of control plane components. Those deployed before the workaround have an AGE of 27m to 28m, and those added after it have an AGE of roughly 8m to 9m:

      $ oc get deploy -n hosted-clusters-mycluster20
      NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
      capi-provider                        1/1     1            1           28m
      catalog-operator                     1/1     1            1           27m
      certified-operators-catalog          1/1     1            1           27m
      cluster-api                          1/1     1            1           28m
      cluster-autoscaler                   0/0     0            0           9m3s
      cluster-image-registry-operator      1/1     1            1           27m
      cluster-network-operator             1/1     1            1           27m
      cluster-node-tuning-operator         1/1     1            1           27m
      cluster-policy-controller            1/1     1            1           27m
      cluster-storage-operator             1/1     1            1           27m
      cluster-version-operator             1/1     1            1           27m
      community-operators-catalog          1/1     1            1           27m
      control-plane-operator               1/1     1            1           28m
      control-plane-pki-operator           1/1     1            1           28m
      csi-snapshot-controller              1/1     1            1           9m1s
      csi-snapshot-controller-operator     1/1     1            1           9m4s
      dns-operator                         1/1     1            1           27m
      hosted-cluster-config-operator       1/1     1            1           27m
      ignition-server                      1/1     1            1           27m
      ignition-server-proxy                1/1     1            1           27m
      ingress-operator                     1/1     1            1           27m
      konnectivity-agent                   1/1     1            1           27m
      kube-apiserver                       1/1     1            1           27m
      kube-controller-manager              1/1     1            1           27m
      kube-scheduler                       1/1     1            1           27m
      kubevirt-cloud-controller-manager    1/1     1            1           27m
      kubevirt-csi-controller              1/1     1            1           9m4s
      machine-approver                     1/1     1            1           9m3s
      multus-admission-controller          1/1     1            1           8m21s
      network-node-identity                1/1     1            1           8m8s
      oauth-openshift                      1/1     1            1           27m
      olm-operator                         1/1     1            1           27m
      openshift-apiserver                  1/1     1            1           27m
      openshift-controller-manager         1/1     1            1           27m
      openshift-oauth-apiserver            1/1     1            1           27m
      openshift-route-controller-manager   1/1     1            1           27m
      ovnkube-control-plane                1/1     1            1           8m13s
      packageserver                        1/1     1            1           27m
      redhat-marketplace-catalog           1/1     1            1           27m
      redhat-operators-catalog             1/1     1            1           27m

      Oren Cohen (ocohen@redhat.com)
      Ales Nosek (anosek@redhat.com)
      Yu Li