OpenShift Bugs / OCPBUGS-13559

Assisted-service connection refused error during Agent-Based Installer worker node addition

    Description

      Description of problem:

      With the Agent-Based Installer, the worker node logs connection refused errors when it tries to reach the assisted-service running on the rendezvous host (Node0) at rendezvousIP:
      ~~~
      May 11 21:18:46 dhcp192-168-3-37.example.com systemd[1]: Starting Assisted Installer Agent...
      May 11 21:18:46 dhcp192-168-3-37.example.com extract-agent.sh[1750]: Pulling quay.io/openshift-release-dev/ocp-release@sha256:aae5131ec824c301c11d0bf11d81b3996a222be8b49ce4716e9d464229a2f92>
      .
      .
      .
      May 11 21:18:59 dhcp192-168-3-37.example.com extract-agent.sh[1929]: Storing signatures
      May 11 21:19:00 dhcp192-168-3-37.example.com systemd[1]: Started Assisted Installer Agent.
      May 11 21:19:00 dhcp192-168-3-37.example.com start-agent.sh[2189]: Waiting for infra-env-id to be available
      May 11 21:19:05 dhcp192-168-3-37.example.com start-agent.sh[2189]: Querying assisted-service for infra-env-id...
      May 11 21:19:05 dhcp192-168-3-37.example.com start-agent.sh[2215]: curl: (7) Failed to connect to <rendezvousIP> port 8090: Connection refused
      ~~~
      
      At this stage the rendezvous host (Node0) has already rebooted and joined the cluster as a control plane node, but the worker node is still trying to reach rendezvousIP on port 8090 and gets connection refused errors.
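
To confirm from the worker node that nothing is accepting connections on the rendezvous host, the probe that start-agent.sh performs can be repeated by hand. The following is a minimal sketch, not the agent's actual script: the function names are hypothetical, and the rendezvous IP is passed in as a placeholder argument. It relies only on curl's documented exit codes, where 7 means the TCP connection could not be established, which matches the `curl: (7)` line in the journal. The URL path is left empty because a refused connection fails before any HTTP request is sent.

```shell
#!/usr/bin/env bash

# Translate a curl exit status into a short hint (hypothetical helper).
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "failed to connect" ;;
    28) echo "timed out" ;;
    *)  echo "curl exit code $1" ;;
  esac
}

# Probe the assisted-service port on the rendezvous host.
# $1: rendezvous IP (placeholder), $2: port, defaulting to 8090 as seen in the log.
check_assisted_service() {
  ip="$1"
  port="${2:-8090}"
  rc=0
  curl --silent --output /dev/null --connect-timeout 5 "http://${ip}:${port}/" || rc=$?
  explain_curl_exit "$rc"
}
```

Running `check_assisted_service <rendezvousIP>` on the worker should report "failed to connect" for as long as the journal keeps showing the curl error above.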
      
      ~~~
      # oc get node
      NAME                            STATUS   ROLES                         AGE   VERSION
      dhcp192-168-3-225.example.com   Ready    control-plane,master,worker   42m   v1.26.3+b404935
      dhcp192-168-3-228.example.com   Ready    control-plane,master,worker   69m   v1.26.3+b404935
      dhcp192-168-3-97.example.com    Ready    control-plane,master,worker   69m   v1.26.3+b404935
      ~~~
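
The node listing can also be summarized to show that no dedicated worker has joined. This is a small sketch with a hypothetical function name; it assumes the default `oc get node` column layout, with ROLES as the third column, and counts every node whose roles do not include `master`.

```shell
#!/usr/bin/env bash

# Count nodes whose ROLES column (third field) does not contain "master".
# Reads `oc get node` output on stdin; the header row is skipped.
count_non_master_nodes() {
  awk 'NR > 1 && $3 !~ /master/ { n++ } END { print n + 0 }'
}
```

Piping the output into the function (`oc get node | count_non_master_nodes`) prints 0 for the listing above, since all three nodes carry the master role.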
      
      The process for adding a worker node to an agent-based cluster is not clearly documented in the available resources. Additionally, there is no documentation for adding a node to the cluster after the initial deployment phase. This can make it difficult for users to expand their cluster as needed, and may cause confusion or delays in the deployment process. I'd open a separate issue for this as a doc bug.

      Version-Release number of selected component (if applicable):

      4.13.0

      How reproducible:

       

      Steps to Reproduce:

      1. Create the agent ISO for a cluster with 3 master and 3 worker nodes.
      2. Boot Node0 and the other 2 master nodes with the agent ISO and wait for the master nodes to become Ready.
      3. Boot the worker node with the agent ISO and observe the journal logs on the worker node.
      

      Actual results:

       

      Expected results:

       

      Additional info:

       

          People

            beth.white Beth White
            rhn-support-asadawar Abhijeet Sadawarte
            Manoj Hans Manoj Hans