OpenShift Bugs / OCPBUGS-55149

HyperShift agent-based installation on VMware vSphere OCP cluster causes capi-provider pod crash



      Description of problem:

          A HyperShift agent-based installation on an OCP cluster running on VMware vSphere causes the capi-provider pod to crash.

      Version-Release number of selected component (if applicable):

      4.18    

      How reproducible:

          100%

      Steps to Reproduce:

      Once the hosts are bound to the HCP cluster, the NodePool shows an incorrect status.
      
      The Machines stay in the Provisioned phase and do not show any NODENAME.
      
      The capi-provider pod also crashes.
      
      # oc get agents -A
      NAMESPACE   NAME                                   CLUSTER        APPROVED   ROLE     STAGE
      hcp-agent   465b0642-2976-ee47-26a5-b27ffe8e8208   dpateriy-hcp   true       worker   Done
      hcp-agent   cbb00642-4351-8f7a-b2ec-116afb2e863b   dpateriy-hcp   true       worker   Done
      
      
      # oc get machines -A
      NAMESPACE                    NAME                 CLUSTER              NODENAME   PROVIDERID                                     PHASE         AGE     VERSION
      hcp-namespace-dpateriy-hcp   dpateriy-hcp-wcjxz   dpateriy-hcp-86b8l              agent://cbb00642-4351-8f7a-b2ec-116afb2e863b   Provisioned   3h54m   4.18.4
      hcp-namespace-dpateriy-hcp   dpateriy-hcp-wjrc7   dpateriy-hcp-86b8l              agent://465b0642-2976-ee47-26a5-b27ffe8e8208   Provisioned   3h54m   4.18.4
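
The empty NODENAME column above can be checked mechanically. The following is a minimal sketch, not part of the original report: `machines_without_node` is a hypothetical helper name, and it assumes the fixed-width table layout that `oc get machines -A` prints, with a NODENAME column immediately followed by PROVIDERID.

```shell
# Hypothetical helper: reads `oc get machines -A` table output on stdin and
# prints "namespace/name" for every row whose NODENAME cell is blank.
# Assumes a fixed-width layout where PROVIDERID directly follows NODENAME.
machines_without_node() {
  awk '
    NR == 1 {
      # Locate the NODENAME column from the header row.
      start = index($0, "NODENAME")
      end = index($0, "PROVIDERID")
      next
    }
    start > 0 && end > start {
      # Cut out the NODENAME cell and strip padding.
      cell = substr($0, start, end - start)
      gsub(/[ \t]/, "", cell)
      if (cell == "") print $1 "/" $2
    }
  '
}

# Usage (run against the management cluster):
#   oc get machines -A | machines_without_node
```

Against the output shown above, this would be expected to list both Machines, since neither row has a NODENAME.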
      
      # oc logs capi-provider-658b97547f-tc6tf -n hcp-namespace-dpateriy-hcp
      
      
      2025-04-18T16:59:15Z    ERROR    Reconciler error    {"controller": "agentmachine", "controllerGroup": "capi-provider.agent-install.openshift.io", "controllerKind": "AgentMachine", "AgentMachine": {"name":"dpateriy-hcp-wcjxz","namespace":"hcp-namespace-dpateriy-hcp"}, "namespace": "hcp-namespace-dpateriy-hcp", "name": "dpateriy-hcp-wcjxz", "reconcileID": "b958985c-56aa-453c-a6fb-af2a29b8396d", "error": "failed to find node with name 00-50-56-86-2a-c3", "errorVerbose": "failed to find node with name 00-50-56-86-2a-c3\ngithub.com/openshift/cluster-api-provider-agent/controllers.(*NodeProviderIDReconciler).setNodeProviderID\n\t/remote-source/app/controllers/node_provider_id_controller.go:96\ngithub.com/openshift/cluster-api-provider-agent/controllers.(*NodeProviderIDReconciler).Reconcile\n\t/remote-source/app/controllers/node_provider_id_controller.go:70\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1700"}
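
The error above names the node the reconciler could not find (`00-50-56-86-2a-c3`, which resembles a MAC address written with dashes). As a rough sketch for narrowing this down, the names can be pulled out of the log and compared by hand with the node names the hosted cluster actually reports. `missing_node_names` is a hypothetical helper, and the match relies only on the error text shown in this report.

```shell
# Hypothetical helper: reads capi-provider log lines on stdin and prints the
# unique node names from "failed to find node with name <name>" errors.
missing_node_names() {
  grep -o 'failed to find node with name [A-Za-z0-9._-]*' \
    | awk '{ print $NF }' \
    | sort -u
}

# Usage (pod name and namespace taken from this report):
#   oc logs capi-provider-658b97547f-tc6tf -n hcp-namespace-dpateriy-hcp | missing_node_names
# Then compare with the node names in the hosted cluster, for example with
# `oc get nodes` run against the hosted cluster's kubeconfig.
```

If the names printed here do not appear among the hosted cluster's nodes, that would be consistent with the Machines never receiving a NODENAME.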
      
      
      

      Actual results:

          The agent-based HostedCluster deployment does not complete.

      Expected results:

           The HostedCluster deployment should be successful, and the NodePool should show the nodes getting registered.
      
      The Machines should reference a node name.

      Additional info:

         MCE must-gather: https://drive.google.com/file/d/1kVJHFBeCfu-F9RxPCwajSsko6NH-y5aJ/view?usp=drive_link
      
      
      
      hcp-namespace-dpateriy-hcp project inspect report: https://drive.google.com/file/d/1ZlUBs4DHU64hUXHXoS_VW-VA0rYlKyeq/view?usp=sharing

       

              cchun@redhat.com Crystal Chun
              rhn-support-dpateriy Divyam Pateriya
              Elsa Passaro Elsa Passaro