OpenShift Bugs / OCPBUGS-27789

Unable to scale up replicas on vSphere

      Description of problem:

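      Unable to scale up new worker replicas on vSphere. New machines stay in the Provisioned phase, and the MCO firstboot service (machine-config-daemon-firstboot.service) never runs on them.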

      Version-Release number of selected component (if applicable):

      OCP 4.12.42 IPI
      vSphere 7 update 3o Build 22357613
      ESXi 7.0.3 22348816       

      How reproducible:

      Unsure    

      Steps to Reproduce:

          1.  Install OCP 4.12 IPI on vSphere.
          2.  Scale up the worker node count by increasing the machineset replicas.
          3.  Check the new machines with oc get machines -A.

      Actual results:

      # oc get machines -A
      NAMESPACE               NAME                        PHASE         TYPE   REGION   ZONE   AGE
      openshift-machine-api   ocp412-t4fdx-master-0       Running                              15d
      openshift-machine-api   ocp412-t4fdx-master-1       Running                              15d
      openshift-machine-api   ocp412-t4fdx-master-2       Running                              15d
      openshift-machine-api   ocp412-t4fdx-worker-56f8r   Provisioned                          78m
      openshift-machine-api   ocp412-t4fdx-worker-bmrnt   Provisioned                          78m
      openshift-machine-api   ocp412-t4fdx-worker-fjqhm   Running                              15d
      openshift-machine-api   ocp412-t4fdx-worker-lv4b5   Provisioned                          78m
      openshift-machine-api   ocp412-t4fdx-worker-p967q   Provisioned                          78m
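
To pull the stuck machines out of a listing like the one above, a small filter can help. This is only a sketch: the function name stuck_machines is hypothetical, and it assumes the column layout produced by oc get machines -A (namespace, name, phase), so it would need adjusting if -A is omitted.

```shell
# Hypothetical helper: reads `oc get machines -A` output on stdin and
# prints "<name> <phase>" for every machine whose phase is not Running.
# Assumes columns are NAMESPACE, NAME, PHASE (the -A layout shown above).
stuck_machines() {
  awk 'NR > 1 && $3 != "Running" { print $2, $3 }'
}
```

It could be used as: oc get machines -A | stuck_machines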

      Expected results:

      Machines should scale up successfully, or report why provisioning is failing.

      Additional info:

      [root@ocp412-t4fdx-worker-p967q ~]# systemctl list-units --state=failed --all
      0 loaded units listed.
      To show all installed unit files use 'systemctl list-unit-files'.
      
      [root@ocp412-t4fdx-worker-p967q kubernetes]# systemctl |grep firstboot
      coreos-ignition-firstboot-complete.service                                                       loaded active     exited           CoreOS Mark Ignition Boot Complete                                         
      machine-config-daemon-firstboot.service                                                          loaded inactive   dead      start  Machine Config Daemon Firstboot 
      
      [root@ocp412-t4fdx-worker-p967q kubernetes]# systemctl status machine-config-daemon-firstboot.service
      ● machine-config-daemon-firstboot.service - Machine Config Daemon Firstboot
         Loaded: loaded (/etc/systemd/system/machine-config-daemon-firstboot.service; enabled; vendor preset: enabled)
         Active: inactive (dead)     
      
      [root@ocp412-t4fdx-worker-p967q kubernetes]# journalctl -u kubelet.service
      -- Logs begin at Tue 2024-01-23 16:27:55 UTC, end at Tue 2024-01-23 17:19:39 UTC. --
      -- No entries --                                      
      
      [root@ocp412-t4fdx-worker-p967q ~]# systemctl |grep dead
      crio.service                                                                                     loaded inactive   dead      start  Container Runtime Interface for OCI (CRI-O)
      iscsi.service                                                                                    loaded inactive   dead      reload Login and scanning of iSCSI devices
      kubelet-auto-node-size.service                                                                   loaded inactive   dead      start  Dynamically sets the system reserved for the kubelet
      kubelet.service                                                                                  loaded inactive   dead      start  Kubernetes Kubelet
      machine-config-daemon-firstboot.service                                                          loaded inactive   dead      start  Machine Config Daemon Firstboot
      machine-config-daemon-pull.service                                                               loaded inactive   dead      start  Machine Config Daemon Pull
      node-valid-hostname.service                                                                      loaded inactive   dead      start  Wait for a non-localhost hostname
      ovs-configuration.service                                                                        loaded inactive   dead      start  Configures OVS with proper host networking configuration
      rpc-statd.service                                                                                loaded inactive   dead      start  NFS status monitor for NFSv2/3 locking.
      systemd-update-utmp-runlevel.service                                                             loaded inactive   dead      start  Update UTMP about System Runlevel Changes
      graphical.target                                                                                 loaded inactive   dead      start  Graphical Interface
      multi-user.target                                                                                loaded inactive   dead      start  Multi-User System
      network-online.target                                                                            loaded inactive   dead      start  Network is Online
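
In the listing above, most of the dead units carry a queued start job (the JOB column), which means they are waiting rather than failed. A rough way to isolate those units from the same systemctl output is sketched below. The function name pending_start_units is hypothetical, and the field positions are a heuristic that assumes the UNIT LOAD ACTIVE SUB JOB DESCRIPTION layout shown above. On the node itself, systemctl list-jobs shows the queued jobs directly.

```shell
# Hypothetical helper: reads plain `systemctl` unit listing on stdin and
# prints units that are loaded/inactive/dead with a queued "start" job.
# Heuristic: assumes fields are UNIT LOAD ACTIVE SUB JOB DESCRIPTION.
pending_start_units() {
  awk '$2 == "loaded" && $3 == "inactive" && $4 == "dead" && $5 == "start" { print $1 }'
}
```

It could be used as: systemctl | pending_start_units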

      Scaling down the machineset removes the machines that are stuck in Provisioned, but scaling up again leaves the new machines stuck in the same state.
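
For reference, the scale-down and scale-up cycle described here can be sketched as a pair of oc scale commands. The helper below only prints the commands and does not run them. The function name rescale_cmds and its arguments are hypothetical, and the machineset name is a placeholder because the report does not give it.

```shell
# Hypothetical helper: prints (does not execute) the oc commands for a
# scale-down followed by a scale-up of one machineset.
# $1 = machineset name (placeholder), $2 = lower replica count, $3 = higher replica count
rescale_cmds() {
  printf 'oc scale machineset %s -n openshift-machine-api --replicas=%s\n' "$1" "$2"
  printf 'oc scale machineset %s -n openshift-machine-api --replicas=%s\n' "$1" "$3"
}
```

For example, rescale_cmds MACHINESET_NAME 1 5 prints the two commands for review before they are run by hand.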

       

       

            rhn-engineering-skumari Sinny Kumari
            rh-ee-syangsao Sam Yangsao
            Sergio Regidor de la Rosa Sergio Regidor de la Rosa