OpenShift Bugs / OCPBUGS-18310

[RHV] Scale up machine stuck in Provisioned status


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Critical
    • None
    • 4.14
    • None
    • Critical
    • No
    • False

      Description of problem:

      After scaling up a machineset, the new machine is stuck in Provisioned status and no CSR is pending. The cluster is OCP on RHV and went through a 4.12 to 4.14 EUS upgrade; the scale-up was run after the upgrade completed.
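One way to double-check the "no csr pending" observation is to filter the CSR list for requests whose condition is still Pending. The sketch below assumes the default `oc get csr` table layout, where CONDITION is the last column; `pending_csrs` is a hypothetical helper name, not part of `oc`.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): reads `oc get csr` table output on
# stdin and prints the NAME of every CSR whose last column (CONDITION) is
# exactly "Pending". The header row is skipped by matching on its first field.
pending_csrs() {
  awk '$1 != "NAME" && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get csr | pending_csrs
```

Empty output would be consistent with the report: the new machine never got far enough to submit a CSR.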

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-08-24-211602

      How reproducible:

      Always

      Steps to Reproduce:

      1. Scale up machineset
      $ oc scale machineset ge3n2-d7jgn-worker --replicas=3
      

      Actual results:

      Machine stuck in Provisioned status
      $ oc get machine    
      NAME                        PHASE          TYPE   REGION   ZONE   AGE
      ge3n2-d7jgn-master-0        Running                               8d
      ge3n2-d7jgn-master-1        Running                               8d
      ge3n2-d7jgn-master-2        Running                               8d
      ge3n2-d7jgn-worker-cjbpf    Running                               8d
      ge3n2-d7jgn-worker-m488d    Running                               8d
      ge3n2-d7jgn-worker-s8jpn    Provisioned                           11h
      ge3n2-d7jgn-worker1-4dtp8   Provisioned                           15h
      ge3n2-d7jgn-worker1-xvnvc   Provisioning                          63m
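To pick the affected machines out of a longer listing, the table can be filtered on its PHASE column. This is a minimal sketch, assuming NAME and PHASE are the first two columns as in the output above; `not_running_machines` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): reads `oc get machine` table output
# on stdin and prints "name phase" for every machine whose PHASE is not
# "Running". Assumes NAME and PHASE are the first two columns.
not_running_machines() {
  awk 'NF >= 2 && $1 != "NAME" && $2 != "Running" { print $1, $2 }'
}

# Usage against a live cluster (assumes a logged-in oc session):
#   oc get machine -n openshift-machine-api | not_running_machines
```

Applied to the listing above, this would print the two Provisioned machines and the one still in Provisioning.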
      
      $ oc logs -f machine-api-controllers-865b6dbc7c-p2gxk -c machine-controller | grep ge3n2-d7jgn-worker-s8jpn
      I0829 23:24:36.342830       1 controller.go:175] ge3n2-d7jgn-worker-s8jpn: reconciling Machine
      I0829 23:24:36.342875       1 actuator.go:128] actuator "msg"="Checking machine ge3n2-d7jgn-worker-s8jpn exists."
      I0829 23:24:36.372881       1 retry.go:36] cached-client/actuator/ovirt "msg"="Getting vm name ge3n2-d7jgn-worker-s8jpn..."
      I0829 23:24:36.392755       1 retry.go:40] cached-client/actuator/ovirt "msg"="Completed getting vm name ge3n2-d7jgn-worker-s8jpn."
      I0829 23:24:36.392779       1 controller.go:319] ge3n2-d7jgn-worker-s8jpn: reconciling machine triggers idempotent update
      I0829 23:24:36.422207       1 retry.go:36] cached-client/actuator/ovirt "msg"="Getting vm name ge3n2-d7jgn-worker-s8jpn..."
      I0829 23:24:36.439990       1 retry.go:40] cached-client/actuator/ovirt "msg"="Completed getting vm name ge3n2-d7jgn-worker-s8jpn."
      I0829 23:24:36.464349       1 controller.go:347] ge3n2-d7jgn-worker-s8jpn: has no node yet, requeuing
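The log above ends with the controller requeuing because the machine has no node yet. A small filter can list every machine that hits this message, which may help when several machines are stuck at once. This is only a sketch: it assumes the log line format shown above ("<machine>: has no node yet, requeuing"), and `machines_without_node` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads machine-controller log lines on stdin and prints,
# once each, the machine names that were requeued with "has no node yet".
# Assumes the format "<prefix> <machine-name>: has no node yet, requeuing".
machines_without_node() {
  awk '/: has no node yet/ {
         name = $0
         sub(/: has no node yet.*/, "", name)  # drop the message suffix
         sub(/.*[ \t]/, "", name)              # keep only the last word
         print name
       }' | sort -u
}

# Usage (pod name taken from the report; it differs per cluster):
#   oc logs machine-api-controllers-865b6dbc7c-p2gxk -c machine-controller | machines_without_node
```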

      Expected results:

      Machine reaches Running status.

      Additional info:

      Must-gather: https://drive.google.com/file/d/1b_IVvRmlvRqNJzUPj2GCrTEj1ZLR5HQ_/view?usp=sharing
      upgrade process: https://redhat-internal.slack.com/archives/CH76YSYSC/p1692924288625129

            rh-ee-mengel Michael Engel
            rhn-support-zhsun Zhaohua Sun