  1. OpenShift Bugs
  2. OCPBUGS-37638

Adding node as day-2 operation for hosted control planes fails troubleshooting


    • None
    • OSDOCS Sprint 261, OSDOCS Sprint 262
    • 2
    • False

      Doc Content:

      Suggested troubleshooting topic title: Nodes fail to be added to hosted control planes using Agent (assisted-installer)

      Suggested content:

      Look at the assisted-service logs:

      $ oc logs -n multicluster-engine <assisted-service pod name>

      In the logs, find error(s) that resemble the following:

      error="failed to get pull secret for update: invalid pull secret data in secret pull-secret 
      pull secret must contain auth for \"registry.redhat.io\""  
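The assisted-service log can be long, so it may help to filter it down to the pull secret failure. The following is a minimal sketch, not part of the original report: the function name `find_pull_secret_errors` is hypothetical, and the match string is taken from the error shown above.

```shell
# Hypothetical helper: read log lines on stdin and keep only the ones that
# report the pull secret failure quoted in this issue.
find_pull_secret_errors() {
  grep -F 'failed to get pull secret'
}

# Usage against a live cluster (the pod name is a placeholder):
#   oc logs -n multicluster-engine <assisted-service pod name> | find_pull_secret_errors
```

If the filter prints nothing, the ignition failure likely has a different cause and the full log needs to be reviewed.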

      To fix this, refer to (link to doc created in https://issues.redhat.com/browse/OSDOCS-12373)

      Latest description:

      While scaling up a hosted control plane cluster with nodes provisioned by Assisted Installer, the host fails to pull the ignition from a URL that contains port 22624. This URL is invalid in the hosted control plane scenario and indicates an issue with the cluster.

      To determine the issue, look at the assisted-service logs:

      $ oc logs -n multicluster-engine <assisted-service pod name>
      

      In this case, the pull secret was missing authentication for registry.redhat.io:

      2024-10-14T13:26:40.882938925+08:00 time="2024-10-14T05:26:40Z" level=error msg="failed to update cluster" func="github.com/openshift/assisted-service/internal/controller/controllers.(*ClusterDeploymentsReconciler).Reconcile" file="/remote-source/assisted-service/app/internal/controller/controllers/clusterdeployments_controller.go:221" agent_cluster_install=hosted-cluster agent_cluster_install_namespace=hcp-hosted-cluster cluster_deployment=hosted-cluster cluster_deployment_namespace=hcp-hosted-cluster error="failed to get pull secret for update: invalid pull secret data in secret pull-secret {\"auths\": {\"mirror-registry.domain0000000001:8443\": {\"auth\": \"aW5pdDpQQHNzdzByZDEyMzQ1Ng==\",\"email\": \"\"}}}\n: pull secret must contain auth for \"registry.redhat.io\""  
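The log line above shows a pull secret that holds only a mirror registry entry. A rough way to confirm this before changing anything is to dump the secret and check for the registry name. This is a sketch under assumptions: the helper name `has_registry_auth` is hypothetical, the check is a plain string match rather than real JSON parsing, and the secret name, namespace, and `.dockerconfigjson` key in the usage comment are guesses based on the log line and on the usual layout of dockerconfigjson secrets.

```shell
# Hypothetical helper: succeed if the pull secret JSON on stdin mentions the
# given registry as a quoted string. This is a crude text match, not a parser.
has_registry_auth() {
  grep -qF "\"$1\""
}

# Usage against a live cluster (secret name, namespace, and key are assumptions):
#   oc get secret pull-secret -n hcp-hosted-cluster \
#     -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d \
#     | has_registry_auth registry.redhat.io && echo present || echo missing
```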

      To fix this, either add the authentication information for registry.redhat.io to your pull secret or add the registry URL to the AgentServiceConfig's spec.unauthenticatedRegistries:

      apiVersion: agent-install.openshift.io/v1beta1
      kind: AgentServiceConfig
      metadata:
        name: agent
      spec:
        unauthenticatedRegistries:
        - registry.redhat.io  
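One possible way to apply the setting above without editing the full resource is a merge patch. The sketch below assumes the AgentServiceConfig is named `agent`, as in the example; the helper name `unauth_registry_patch` is hypothetical. Note that a merge patch replaces the whole list, so any registries already listed under spec.unauthenticatedRegistries would need to be included as well.

```shell
# Hypothetical helper: print a JSON merge patch that sets
# spec.unauthenticatedRegistries to a single registry.
unauth_registry_patch() {
  printf '{"spec":{"unauthenticatedRegistries":["%s"]}}\n' "$1"
}

# Usage against a live cluster:
#   oc patch agentserviceconfig agent --type merge \
#     -p "$(unauth_registry_patch registry.redhat.io)"
```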

       

       

      Prior issue description:


      Description of problem:

      The problem occurs while adding nodes to a 4.14 cluster that was installed via Assisted Installer.
      Document [1] suggests using port 22623 for the machine config, which is used to fetch the ignition.
      
      [1] https://docs.openshift.com/container-platform/4.16/installing/installing_platform_agnostic/installing-platform-agnostic.html
      
      The customer updated their load balancer accordingly, but this fails because the nodes fetch the ignition via port 22624.
      
      For scale-up via Assisted Installer, document [2] shows how to add the certificates and the ignition endpoint. However, the customer wants documentation that tells them which ports to open.
      
      [2] https://github.com/openshift/assisted-service/blob/76bd57a148063524b7bffb766eee3f63710a83ae/docs/hive-integration/crds/agentClusterInstall-with-ignitionEndpoint.yaml#L24-L28   
      
      The Assisted Installer 2024 documentation has details on port 22624 in its troubleshooting section:
      
      https://docs.redhat.com/en/documentation/assisted_installer_for_openshift_container_platform/2024/html-single/installing_openshift_container_platform_with_the_assisted_installer/index#api-connectivity-failure_troubleshooting  
      
      It would be beneficial for customers to have the same information available in the ACM and HCP documentation.
       
      Since HCP deployments use the assisted service, it would be helpful to have information about port 22624 for day-2 operations in the HCP documentation.
      
      ACM:
      https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.11/html/clusters/cluster_mce_overview?extIdCarryOver=true&sc_cid=701f2000001Css5AAC#firewall-port-reqs-bare-metal
      
      HCP
      https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.11/html-single/clusters/index#hosted-control-requirements

      Version-Release number of selected component (if applicable):

      4.14, 4.15, 4.16

      How reproducible:

          

      Steps to Reproduce:

          1.
          2.
          3.
          

      Actual results:

          

      Expected results:

          

      Additional info:

          

              rhn-support-lahinson Laura Hinson
              rhn-support-chdeshpa Chinmay Deshpande