OpenShift Bugs / OCPBUGS-61479

Agent stuck in Rebooting state for HCP on x86 while attaching IBMZ compute nodes


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • 4.20, 4.20.z
    • Quality / Stability / Reliability
    • False
    • Low
    • Yes
    • s390x
    • Rejected

      Description of problem:

      [root@bastion ~]# oc get agents -o wide -A
      NAMESPACE               NAME                                   CLUSTER          APPROVED   ROLE     STAGE       HOSTNAME            REQUESTED HOSTNAME
      hosted-cluster-agents   456cad23-f4ce-01e8-a594-880ae9a1b996   hosted-cluster   true       worker   Done        52-54-00-4f-91-33   compute-0.hosted-cluster.solntest.com
      hosted-cluster-agents   91aa8be2-f93b-4702-b0a8-cd78fd074996   hosted-cluster   true       worker   Rebooting   52-54-00-57-6e-c4   compute-1.hosted-cluster.solntest.com
      
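      After the node pool is scaled up, agent compute-1 does not complete its reboot and stays in the 'Rebooting' stage.

      On the KVM host, the console log for compute-1 shows a 'Connection refused' error while Ignition fetches its configuration (see the console logs under Additional info).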
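The agent listing above can be filtered mechanically to spot agents that have not reached the 'Done' stage. The following is a minimal sketch, not part of the original report: it parses text copied from `oc get agents -o wide -A` and assumes that columns are separated by two or more spaces and that no cell is empty. The function names `parse_agents` and `not_done` are hypothetical.

```python
import re

# Output copied from the report; columns are separated by runs of spaces.
SAMPLE = """\
NAMESPACE               NAME                                   CLUSTER          APPROVED   ROLE     STAGE       HOSTNAME            REQUESTED HOSTNAME
hosted-cluster-agents   456cad23-f4ce-01e8-a594-880ae9a1b996   hosted-cluster   true       worker   Done        52-54-00-4f-91-33   compute-0.hosted-cluster.solntest.com
hosted-cluster-agents   91aa8be2-f93b-4702-b0a8-cd78fd074996   hosted-cluster   true       worker   Rebooting   52-54-00-57-6e-c4   compute-1.hosted-cluster.solntest.com
"""


def parse_agents(table_text):
    """Turn the tabular listing into a list of dicts keyed by column header.

    Assumption: cells are separated by at least two spaces and none is empty;
    an empty cell would shift the remaining columns.
    """
    lines = [ln.strip() for ln in table_text.splitlines() if ln.strip()]
    header = re.split(r"\s{2,}", lines[0])
    return [dict(zip(header, re.split(r"\s{2,}", ln))) for ln in lines[1:]]


def not_done(agents):
    """Return the agents whose STAGE column is anything other than 'Done'."""
    return [a for a in agents if a.get("STAGE") != "Done"]


for agent in not_done(parse_agents(SAMPLE)):
    print(agent["REQUESTED HOSTNAME"], agent["STAGE"])
```

For scripted checks against a live cluster, structured output such as `oc get agents -A -o json` is likely a more robust input than the table, since it does not depend on column spacing.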

      Version-Release number of selected component (if applicable):

      [root@bastion ~]# oc get mce
      NAME     STATUS        AGE   CURRENTVERSION   DESIREDVERSION
      engine   Progressing   24h   2.10.0-18        2.10.0-18
      
      [root@bastion ~]# oc version
      Client Version: 4.20.0-rc.0
      Kustomize Version: v5.6.0
      Server Version: 4.20.0-rc.0
      Kubernetes Version: v1.33.3
      
      https://mirror.openshift.com/pub/openshift-v4/s390x/dependencies/rhcos/pre-release/latest-4.20/rhcos-4.20.0-ec.6-s390x-live-iso.s390x.iso

      How reproducible:

          Most of the time

      Steps to Reproduce:

          1. Bring up the OCP cluster.
          2. Configure the OCP cluster to add HCP compute nodes.
          3. Add the KVM compute nodes as agents and scale up the node pool.
          

      Actual results:

          The KVM agent is stuck in the 'Rebooting' stage.

      Expected results:

          The KVM agent should join the cluster and reach the 'Done' stage.

      Additional info:

          KVM console logs
      
      [root@hcp-test ~]# virsh console hosted-cluster-agent-0
      setlocale: No such file or directory
      Connected to domain 'hosted-cluster-agent-0'
      [   ***] A start job is running for Ignition (fetch) (20h 1min 10s / no limit)
      [72075.610769] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14403
      [72075.611127] ignition[847]: GET error: Get "https://api.hosted-cluster.solntest.com:31876/ignition": dial tcp: lookup api.hosted-cluster.solntest.com on [::1]:53: read udp [::1]:37932->[::1]:53: read: c
      [   ***] A start job is running for Ignition (fetch) (20h 1min 15s / no limit)
      [72080.621743] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14404
      [72080.648685] ignition[847]: GET error: Get "https://api.hosted-cluster.solntest.com:31876/ignition": dial tcp: lookup api.hosted-cluster.solntest.com on [::1]:53: read udp [::1]:43987->[::1]:53: read: c
      [  *** ] A start job is running fo
      [**    ] A start job is running for Ignition (fetch) (20h 1min 20s / no limit)
      [72085.654880] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14405
      [ ***  ] A start job is running for Ignition (fetch) (20h 1min 25s / no limit)
      [72090.652344] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14406
      [     *] A start job is running for Ignition (fetch) (20h 1min 30s / no limit)
      [72095.653762] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14407
      [ ***  ] A start job is running for Ignition (fetch) (20h 1min 35s / no limit)
      [72100.658514] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14408
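The console output above can be summarized to separate the retry loop from the underlying error. The sketch below is an illustration only: it extracts the Ignition endpoint, the attempt numbers, the average retry interval, and the DNS resolver address from a few lines copied from the log. The regular expressions and the names `summarize` and `resolver_is_loopback` are assumptions made for this example. In this excerpt every lookup is sent to `[::1]:53`, a loopback address, which may suggest that the node has no reachable DNS server configured; that is an interpretation and is not confirmed by the report.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# A few lines copied verbatim from the console log (truncated text kept as-is).
SAMPLE_LOG = r"""
[72075.610769] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14403
[72075.611127] ignition[847]: GET error: Get "https://api.hosted-cluster.solntest.com:31876/ignition": dial tcp: lookup api.hosted-cluster.solntest.com on [::1]:53: read udp [::1]:37932->[::1]:53: read: c[   ***] A start job is running for Ignition (fetch) (20h 1min 15s / no limit)
[72080.621743] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14404
[72085.654880] ignition[847]: GET https://api.hosted-cluster.solntest.com:31876/ignition: attempt #14405
"""

# Matches the "GET <url>: attempt #N" lines and captures the kernel timestamp.
GET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+ignition\[\d+\]: "
    r"GET (?P<url>https?://\S+): attempt #(?P<n>\d+)"
)
# Matches the resolver part of a failed DNS lookup, e.g. "on [::1]:53: read".
DNS_RE = re.compile(r"lookup (?P<host>\S+) on (?P<resolver>\S+): read")


def resolver_is_loopback(resolver):
    """True if a 'host:port' resolver string points at a loopback address."""
    host = resolver.rsplit(":", 1)[0].strip("[]")
    return ipaddress.ip_address(host).is_loopback


def summarize(log_text):
    """Summarize Ignition fetch attempts and DNS resolvers found in the log."""
    gets = [(float(m["ts"]), m["url"], int(m["n"])) for m in GET_RE.finditer(log_text)]
    if not gets:
        return None
    resolvers = sorted({m["resolver"] for m in DNS_RE.finditer(log_text)})
    url = urlsplit(gets[0][1])
    span = gets[-1][0] - gets[0][0]
    return {
        "host": url.hostname,
        "port": url.port,
        "first_attempt": gets[0][2],
        "last_attempt": gets[-1][2],
        # Average seconds between consecutive attempts; None with one sample.
        "mean_interval_s": span / (len(gets) - 1) if len(gets) > 1 else None,
        "resolvers": resolvers,
        "loopback_only": bool(resolvers) and all(map(resolver_is_loopback, resolvers)),
    }


print(summarize(SAMPLE_LOG))
```

The attempts in the excerpt are roughly five seconds apart, which is consistent with more than 14,000 attempts over the 20 hours shown in the start job timer.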
      

       

              vgodihal Vasant Godihalkar
              Damisetti Veerabhadra, Jibin Pattara, Shweta Muthukumar, Sumit Solanki