OpenShift Bugs / OCPBUGS-37470

Nodes are drained twice when an OCB image is applied


    • Moderate
    • None
    • MCO Sprint 257
    • 1
    • False
    • Previously, the same node was queued multiple times in the drain controller, which caused the same node to be drained twice. With this release, a node is only drained once.
      __________________
      This was happening because the same node was queued multiple times in the drain controller, possibly due to increased activity on the node object by OCL functionality. By being more specific about the object diff before nodes are queued for drains, a node is now only queued for a drain once.
    • Bug Fix
    • Done

      Description of problem:

      When OCB is enabled and a new MC is created, nodes are drained twice when the resulting osImage build is applied.
      
          

      Version-Release number of selected component (if applicable):

      4.16
          

      How reproducible:

      Always
          

      Steps to Reproduce:

          1. Enable OCB in the worker pool
      
      oc create -f - << EOF
      apiVersion: machineconfiguration.openshift.io/v1alpha1
      kind: MachineOSConfig
      metadata:
        name: worker
      spec:
        machineConfigPool:
          name: worker
        buildInputs:
          imageBuilder:
            imageBuilderType: PodImageBuilder
          baseImagePullSecret:
            name: $(oc get secret -n openshift-config pull-secret -o json | jq "del(.metadata.namespace, .metadata.creationTimestamp, .metadata.resourceVersion, .metadata.uid, .metadata.name)" | jq '.metadata.name="pull-copy"' | oc -n openshift-machine-config-operator create -f - &> /dev/null; echo -n "pull-copy")
          renderedImagePushSecret:
            name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
          renderedImagePushspec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
      EOF
      
      
      
          2. Wait for the image to be built
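
      The build and rollout can be followed by watching the worker pool until it has gone through the update and reports Updated again, for example (the timeout is an arbitrary value):

      oc get mcp worker -w
      oc wait mcp/worker --for=condition=Updated=True --timeout=45m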
      
    3. When the opt-in image has been built and applied, create a new MC
      
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        labels:
          machineconfiguration.openshift.io/role: worker
        name: test-machine-config-1
      spec:
        config:
          ignition:
            version: 3.1.0
          storage:
            files:
            - contents:
                source: data:text/plain;charset=utf-8;base64,dGVzdA==
              filesystem: root
              mode: 420
              path: /etc/test-file-1.test
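
      The manifest above can be applied in the same way as the MachineOSConfig in step 1, for example (the file name is only an illustrative choice):

      oc create -f test-machine-config-1.yaml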
      
          4. Wait for the image to be built
          

      Actual results:

      Once the image is built, it is applied to the worker nodes.
      
      Looking at the drain operations in the machine-config-controller logs, we can see that every worker node was drained twice instead of once:
      
      oc -n openshift-machine-config-operator logs $(oc -n openshift-machine-config-operator get pods -l k8s-app=machine-config-controller -o jsonpath='{.items[0].metadata.name}') -c machine-config-controller | grep "initiating drain"
      I0430 13:28:48.740300       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:30:08.330051       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:32:32.431789       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:33:50.643544       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:48:08.183488       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:49:01.379416       1 drain_controller.go:182] node ip-10-0-70-208.us-east-2.compute.internal: initiating drain
      I0430 13:50:52.933337       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
      I0430 13:52:12.191203       1 drain_controller.go:182] node ip-10-0-69-154.us-east-2.compute.internal: initiating drain
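
      The duplication is easier to see when the same log is summarized per node (the awk field index assumes the klog format shown above, where the node name is the sixth field):

      oc -n openshift-machine-config-operator logs $(oc -n openshift-machine-config-operator get pods -l k8s-app=machine-config-controller -o jsonpath='{.items[0].metadata.name}') -c machine-config-controller | grep "initiating drain" | awk '{print $6}' | sort | uniq -c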
      
      
          

      Expected results:

      Nodes should be drained only once when applying a new MC.
          

      Additional info:

          

              Assignee: David Joshy (djoshy)
              Reporter: Sergio Regidor de la Rosa (sregidor@redhat.com)
              QA Contact: Sergio Regidor de la Rosa