OpenShift Bugs · OCPBUGS-8764

[IPI Baremetal] The host doesn't power off upon removal during scale down.


    • Severity: Moderate
    • Release Note: This enhancement ensures that hosts are powered off after being removed from the cluster. This benefits the Bare Metal Operator during hardware maintenance and management.
    • Type: Enhancement
    • Status: Done


      Version: 4.4.0-0.nightly-2020-01-09-013524

      Steps to reproduce:

      Starting with 3 workers:
      [kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
      NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
      openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
      openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
      openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
      openshift-worker-0   OK       provisioned              ocp-edge-cluster-worker-0-d2fvm   ipmi://192.168.123.1:6233   unknown            true
      openshift-worker-5   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6245   unknown            true
      openshift-worker-9   OK       provisioned              ocp-edge-cluster-worker-0-jb2tm   ipmi://192.168.123.1:6239   unknown            true

      [kni@worker-2 ~]$ oc get machine -n openshift-machine-api
      NAME                              PHASE   TYPE   REGION   ZONE   AGE
      ocp-edge-cluster-master-0                                        4d4h
      ocp-edge-cluster-master-1                                        4d4h
      ocp-edge-cluster-master-2                                        4d4h
      ocp-edge-cluster-worker-0-d2fvm                                  146m
      ocp-edge-cluster-worker-0-jb2tm                                  11m
      ocp-edge-cluster-worker-0-ptklp                                  3h54m

      [kni@worker-2 ~]$ oc get node
      NAME       STATUS   ROLES    AGE    VERSION
      master-0   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      master-1   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      master-2   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      worker-0   Ready    worker   18m    v0.0.0-master+$Format:%h$
      worker-5   Ready    worker   18m    v0.0.0-master+$Format:%h$
      worker-9   Ready    worker   5m2s   v0.0.0-master+$Format:%h$

      Adding the annotation to mark the intended machine for deletion:
      [kni@worker-2 ~]$ oc annotate machine ocp-edge-cluster-worker-0-jb2tm machine.openshift.io/cluster-api-delete-machine=yes -n openshift-machine-api
      machine.machine.openshift.io/ocp-edge-cluster-worker-0-jb2tm annotated
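Before scaling down, the annotation can be confirmed on the machine object. This verification step is not part of the original transcript; it is a hypothetical check against the same cluster:

```shell
# Print the delete-machine annotation value; dots in the key are escaped for jsonpath.
# Expected output: yes
oc get machine ocp-edge-cluster-worker-0-jb2tm -n openshift-machine-api \
  -o jsonpath='{.metadata.annotations.machine\.openshift\.io/cluster-api-delete-machine}'
```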

      Deleting the bmh:
      [kni@worker-2 ~]$ oc delete bmh openshift-worker-9 -n openshift-machine-api
      baremetalhost.metal3.io "openshift-worker-9" deleted
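Deleting a BareMetalHost normally triggers deprovisioning before the finalizer is removed; the state transition can be watched while the deletion proceeds (an illustrative command, not from the original report):

```shell
# Watch the host's provisioning status until the resource disappears (cluster-dependent).
oc get bmh openshift-worker-9 -n openshift-machine-api -w
```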

      Scaling down the replica count:
      [kni@worker-2 ~]$ oc scale machineset -n openshift-machine-api ocp-edge-cluster-worker-0 --replicas=2
      machineset.machine.openshift.io/ocp-edge-cluster-worker-0 scaled
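To confirm the machineset has converged to the new replica count, the spec and status replica fields can be compared (a hypothetical verification step, assuming the same machineset name):

```shell
# Print desired vs. ready replicas; both should read 2 once scale-down completes.
oc get machineset ocp-edge-cluster-worker-0 -n openshift-machine-api \
  -o jsonpath='{.spec.replicas} {.status.readyReplicas}'
```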

      The worker-9 entries were removed as expected:
      [kni@worker-2 ~]$ oc get node
      NAME       STATUS   ROLES    AGE    VERSION
      master-0   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      master-1   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      master-2   Ready    master   4d4h   v0.0.0-master+$Format:%h$
      worker-0   Ready    worker   28m    v0.0.0-master+$Format:%h$
      worker-5   Ready    worker   28m    v0.0.0-master+$Format:%h$
      [kni@worker-2 ~]$ oc get machine -n openshift-machine-api
      NAME                              PHASE   TYPE   REGION   ZONE   AGE
      ocp-edge-cluster-master-0                                        4d4h
      ocp-edge-cluster-master-1                                        4d4h
      ocp-edge-cluster-master-2                                        4d4h
      ocp-edge-cluster-worker-0-d2fvm                                  156m
      ocp-edge-cluster-worker-0-ptklp                                  4h5m

      [kni@worker-2 ~]$ oc get bmh -n openshift-machine-api
      NAME                 STATUS   PROVISIONING STATUS      CONSUMER                          BMC                         HARDWARE PROFILE   ONLINE   ERROR
      openshift-master-0   OK       externally provisioned   ocp-edge-cluster-master-0         ipmi://192.168.123.1:6230                      true
      openshift-master-1   OK       externally provisioned   ocp-edge-cluster-master-1         ipmi://192.168.123.1:6231                      true
      openshift-master-2   OK       externally provisioned   ocp-edge-cluster-master-2         ipmi://192.168.123.1:6232                      true
      openshift-worker-0   OK       provisioned              ocp-edge-cluster-worker-0-d2fvm   ipmi://192.168.123.1:6233   unknown            true
      openshift-worker-5   OK       provisioned              ocp-edge-cluster-worker-0-ptklp   ipmi://192.168.123.1:6245   unknown            true

      However, connecting to the deleted node shows that it is still up and running.

      Expected result:
      The removed host should have been powered off automatically.
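Since the BMC addresses are listed in the tables above, the actual power state can be checked out-of-band. A sketch for worker-9's BMC; USER and PASS are placeholders for the lab's virtual-BMC credentials, which are not given in this report:

```shell
# Query the chassis power state of the removed host via IPMI.
# A correctly powered-off host should report "Chassis Power is off".
ipmitool -I lanplus -H 192.168.123.1 -p 6239 -U USER -P PASS chassis power status
```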

            rhn-engineering-hpokorny Honza Pokorny
            achuzhoy@redhat.com Alexander Chuzhoy
            Jad Haj Yahya Jad Haj Yahya