OpenShift Bugs / OCPBUGS-32105

The third master does not join the cluster in Agent-based Installations


Details

      *Cause*: Two processes update the installation progress of the control plane hosts: one running on the bootstrap host being installed, and one running in the cluster control plane as it comes up. A rare race between them prevents the former process from completing.
      *Consequence*: The final control plane host is never rebooted to join the cluster as a node.
      *Fix*: Excessive retries were removed from the process, so that if a conflict occurs it is resolved in a timely fashion and installation proceeds with rebooting the final control plane host.
      *Result*: Even when the process on the control plane updates the status first, installation of the final host proceeds and all control plane nodes are installed.
    • Bug Fix
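The race described in the release note can be illustrated with a toy model. This is only a sketch under stated assumptions, not the assisted-installer code: `ProgressStore`, `report_stage`, the `before_write` hook, and the stage names "Installing" and "Joined" are all hypothetical. It shows one writer losing a versioned update to a concurrent writer, then finishing after a small, bounded number of attempts because the desired state is already recorded.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when a write is based on a stale version."""


@dataclass
class ProgressStore:
    """Hypothetical versioned record of one host's installation stage."""
    stage: str = "Installing"
    version: int = 0

    def read(self):
        return self.stage, self.version

    def write(self, stage, expected_version):
        # Reject the write if someone else updated the record in between.
        if expected_version != self.version:
            raise Conflict(f"expected v{expected_version}, store is at v{self.version}")
        self.stage = stage
        self.version += 1


def report_stage(store, stage, max_attempts=3, before_write=None):
    """Record `stage`, tolerating a concurrent writer that got there first.

    Returns the number of attempts used. `before_write` is a hook used only
    to inject a concurrent write in the demonstration below.
    """
    for attempt in range(1, max_attempts + 1):
        current, version = store.read()
        if current == stage:
            return attempt  # another writer already recorded it; nothing to do
        if before_write is not None:
            before_write(attempt)
        try:
            store.write(stage, version)
            return attempt
        except Conflict:
            continue  # re-read and try again, up to max_attempts
    raise RuntimeError(f"could not record {stage!r} after {max_attempts} attempts")


store = ProgressStore()


def control_plane_wins(attempt):
    # Simulate the in-cluster process updating the status first.
    if attempt == 1:
        _, version = store.read()
        store.write("Joined", version)


attempts = report_stage(store, "Joined", before_write=control_plane_wins)
print(attempts, store.read())  # 2 ('Joined', 1)
```

With a small bound on attempts, the losing writer either observes the already-updated status and moves on, or fails quickly, instead of retrying for a long time.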

    Description

      After an Agent-based Installation on bare metal, the master node that initially acted as the rendezvous host does not join the cluster.

      Listing the podman containers on this node shows that the 'assisted-installer' container exited with code 143 (by convention 128 + 15, i.e. terminated by SIGTERM) after the second master was detected as ready:

      2024-04-01T15:21:14.677437000Z time="2024-04-01T15:21:14Z" level=info msg="Found 1 ready master nodes"
      2024-04-01T15:21:19.684831000Z time="2024-04-01T15:21:19Z" level=info msg="Found a new ready master node <second-master> with id <master-id>" 

      podman container status:

      $ podman ps -a
      CONTAINER ID  IMAGE                                                                                                                   COMMAND               CREATED         STATUS                     PORTS       NAMES
      20b338ab8906  localhost/podman-pause:4.4.1-1707368644                                                                                                       16 hours ago    Up 16 hours                            d2b97e733b33-infra
      0876c611f655  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:27c5328e1d9a0d7db874c6e52efae631ab3c29a3d4da50c50b2e783dcb784128  /bin/bash start_d...  16 hours ago    Up 16 hours                            assisted-db
      a9a116bed3a7  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:27c5328e1d9a0d7db874c6e52efae631ab3c29a3d4da50c50b2e783dcb784128  /assisted-service     16 hours ago    Up 16 hours                            service
      0afbe44c2cf2  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:27c5328e1d9a0d7db874c6e52efae631ab3c29a3d4da50c50b2e783dcb784128  /usr/local/bin/ag...  16 hours ago    Exited (0) 16 hours ago                apply-host-config
      45da1bdf2440  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:4b3daca74ad515845d5f8dcf384f0e51d58751a2785414edc3f20969a6fc0403  next_step_runner ...  16 hours ago    Up 16 hours                            next-step-runner
      8d1306b0ea3a  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:79e97d8cbd27e2c7402f7e016de97ca2b1f4be27bd52a981a27e7a2132be1ef4  --role bootstrap ...  16 hours ago    Exited (143) 15 hours ago              assisted-installer
      8b0cc08890b4  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7f44844c4024dfa35688eac52e5e3d1540311771c4a24fef1ba4a6dccecc0e55  start --node-name...  16 hours ago    Exited (0) 16 hours ago                hungry_varahamihira
      4916c14b9f7e  registry.redhat.io/rhel9/support-tools:latest                                                                           /usr/bin/bash         34 seconds ago  Up 34 seconds                          toolbox-core
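The `Exited (143)` status of the assisted-installer container in the listing above can be decoded with the usual shell convention that an exit status above 128 means "terminated by signal (status - 128)". The following is a minimal sketch of that arithmetic; the helper name `describe_exit_code` is made up for illustration, and the signal names are those of the Linux host running the script.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit status using the common 128 + N signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"unknown signal {signum}"
        return f"terminated by {name} (signal {signum})"
    return f"exited with error status {code}"


print(describe_exit_code(143))  # terminated by SIGTERM (signal 15)
```

A status of 143 therefore suggests that the container was stopped from outside with SIGTERM rather than exiting on its own with an error.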

       

      CRI-O container status:

      CONTAINER           IMAGE                                                                                                                    CREATED             STATE               NAME                 ATTEMPT             POD ID              POD
      03b89032db0bc       98fc664e8c2aa859c10ec8ea740b083c7c85925d75506bcb85c6c9c640945c36                                                         13 seconds ago      Exited              etcd                 182                 5d42cdad70890       etcd-bootstrap-member-<failed-master-name>.local
      01008c6e32e5a       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6b38d75b297fa52d1ba29af0715cec2430cd5fda1a608ed0841a09c55c292fb3   16 hours ago        Running             coredns              0                   5f8736b856a0c       coredns-<failed-master-name>
      5e00e89ebef34       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e119d0d9f8470dd634a62329d2670602c5f169d0d9bbe5ad25cee07e716c94b   16 hours ago        Exited              render-config        0                   5f8736b856a0c       coredns-<failed-master-name>
      f5098d5d27a39       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2e119d0d9f8470dd634a62329d2670602c5f169d0d9bbe5ad25cee07e716c94b   16 hours ago        Running             keepalived-monitor   0                   4fb91cefa8a9e       keepalived-<failed-master-name>
      a1e9d4c8cf477       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d24879d39e10fcf00a7c28ab23de1d6cf0c433a1234ff34880f12642b75d4512   16 hours ago        Running             keepalived           0                   4fb91cefa8a9e       keepalived-<failed-master-name>
      de21bc99f0d3f       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:8c74c57f91f0f7ed26bb62f58c7b84c55750e51947fd6cc5711fa18f30b9f68c   16 hours ago        Running             etcdctl              0                   5d42cdad70890       etcd-bootstrap-member-<failed-master-name>

          People

            zabitter Zane Bitter
            rhn-support-malonso Maria Del Mar Alonso