Details
Type: Bug
Resolution: Not a Bug
Version: 4.13.0
Description
Description of problem:
Agent-based installer: the worker node shows connection refused errors while trying to reach the assisted service running on the rendezvousIP (Node0).
~~~
May 11 21:18:46 dhcp192-168-3-37.example.com systemd[1]: Starting Assisted Installer Agent...
May 11 21:18:46 dhcp192-168-3-37.example.com extract-agent.sh[1750]: Pulling quay.io/openshift-release-dev/ocp-release@sha256:aae5131ec824c301c11d0bf11d81b3996a222be8b49ce4716e9d464229a2f92>
. . .
May 11 21:18:59 dhcp192-168-3-37.example.com extract-agent.sh[1929]: Storing signatures
May 11 21:19:00 dhcp192-168-3-37.example.com systemd[1]: Started Assisted Installer Agent.
May 11 21:19:00 dhcp192-168-3-37.example.com start-agent.sh[2189]: Waiting for infra-env-id to be available
May 11 21:19:05 dhcp192-168-3-37.example.com start-agent.sh[2189]: Querying assisted-service for infra-env-id...
May 11 21:19:05 dhcp192-168-3-37.example.com start-agent.sh[2215]: curl: (7) Failed to connect to <rendezvousIP> port 8090: Connection refused
~~~
The rendezvous host (Node0) has already rebooted at this stage and joined the cluster as a control plane node, but the worker node is still trying to reach the rendezvousIP on port 8090 and is getting connection refused errors.
~~~
# oc get node
NAME                            STATUS   ROLES                         AGE   VERSION
dhcp192-168-3-225.example.com   Ready    control-plane,master,worker   42m   v1.26.3+b404935
dhcp192-168-3-228.example.com   Ready    control-plane,master,worker   69m   v1.26.3+b404935
dhcp192-168-3-97.example.com    Ready    control-plane,master,worker   69m   v1.26.3+b404935
~~~
The process for adding a worker node to an agent-based cluster is not clearly documented in the available resources. Additionally, there is no documentation for adding a node to the cluster after the initial deployment phase. This makes it difficult for users to expand their cluster as needed and may cause confusion or delays in the deployment process. I will open a separate issue for this as a doc bug.
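To tell a refused connection apart from a timeout or routing problem, the same check that curl performs in the log above can be approximated with a plain TCP probe. The following is a minimal sketch, not part of the installer: the function name `probe_tcp` and its return strings are made up for illustration, and only port 8090 comes from the report.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Hypothetical helper for illustration; returns 'open', 'refused',
    'timeout', or 'error: <reason>'.
    """
    try:
        # create_connection performs the TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (host down, packet dropped, etc.).
        return "timeout"
    except OSError as exc:
        # Any other socket-level failure, e.g. no route to host.
        return f"error: {exc}"
```

Run from the worker node as `probe_tcp("<rendezvousIP>", 8090)` with the real address substituted, a result of `refused` would suggest that the host is reachable but nothing is listening on port 8090 any more, which matches the situation described above where Node0 has already rebooted into the cluster.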
Version-Release number of selected component (if applicable):
4.13.0
How reproducible:
Steps to Reproduce:
1. Create the agent ISO for 3 master and 3 worker nodes.
2. Boot node0 and the other 2 master nodes with the agent ISO and wait for the master nodes to become Ready.
3. Boot the worker node with the agent ISO and observe the journal logs on the worker node.
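When observing the journal in step 3, the relevant lines can be picked out automatically. The sketch below assumes the curl error format shown in the journal excerpt of this report; the names `CURL_REFUSED` and `refused_targets` are hypothetical and exist only for this example.

```python
import re

# Pattern based on the curl error line quoted in this report (an assumption
# about the format; other curl versions may word the message differently).
CURL_REFUSED = re.compile(
    r"curl: \(7\) Failed to connect to (?P<host>\S+) "
    r"port (?P<port>\d+): Connection refused"
)


def refused_targets(lines):
    """Yield (host, port) for every line that records a refused curl connection."""
    for line in lines:
        match = CURL_REFUSED.search(line)
        if match:
            yield match.group("host"), int(match.group("port"))
```

With journal output piped on standard input, `for host, port in refused_targets(sys.stdin): print(host, port)` (after `import sys`) would list each endpoint the agent failed to reach.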
Actual results:
Expected results:
Additional info: